Statistical inference is the process of drawing conclusions about a population based on information from a sample. Because collecting data from an entire population is often impractical, samples are used to estimate population parameters.
| Term | Meaning |
|---|---|
| Population | The entire group of interest |
| Sample | A subset of the population |
| Parameter | A numerical characteristic of the population (e.g. $\mu$, $p$) |
| Statistic | A numerical characteristic of the sample (e.g. $\bar{x}$, $\hat{p}$) |
| Sampling variability | Different samples from the same population give different values of a statistic |
The population mean $\mu$ is typically unknown. The sample mean $\bar{x}$ is used to estimate it.
If many random samples of size $n$ are drawn and the mean of each is recorded, the distribution of those means approximates the sampling distribution of $\bar{x}$.
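The idea of a sampling distribution can be illustrated with a small simulation. The sketch below is only an illustration: the population (10,000 made-up weekly study times), the sample size `n = 25`, and the number of samples are all assumptions chosen for the example, not values from these notes.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: made-up weekly study hours for 10,000 students
population = [random.uniform(0, 10) for _ in range(10_000)]
mu = statistics.mean(population)        # population mean (a parameter)
pop_sd = statistics.pstdev(population)  # population standard deviation

n = 25               # size of each sample (assumed)
num_samples = 2_000  # how many samples to draw (assumed)

# Draw many random samples of size n and record each sample mean (a statistic)
sample_means = [
    statistics.mean(random.sample(population, n))
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"population mean mu       = {mu:.3f}")
print(f"mean of the sample means = {mean_of_means:.3f}")
print(f"population sd            = {pop_sd:.3f}")
print(f"sd of the sample means   = {sd_of_means:.3f}")
```

In a run like this, the sample means should cluster around $\mu$ and vary much less than individual values do, which is why $\bar{x}$ is a useful estimator of $\mu$.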
Without inference, conclusions drawn from data apply only to the sample. Inference allows researchers, health professionals, businesses, and governments to make evidence-based claims about the broader population.
A survey of 50 Year 12 students finds that 36 spend more than 3 hours per week on study. Can we conclude that the majority of all Year 12 students do so?
The sample proportion is $\hat{p} = \dfrac{36}{50} = 0.72$.
This is a point estimate of the true population proportion $p$. Inference tools (confidence intervals, hypothesis tests) are needed to make formal statements about $p$.
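As a preview of what such a formal statement might look like, the sketch below computes the point estimate and an approximate 95% confidence interval for $p$. It assumes the 50 students were a random sample and that the normal approximation is reasonable; the multiplier `z = 1.96` and the variable names are choices made for this illustration, not part of the survey.

```python
import math

successes = 36  # students who study more than 3 hours per week
n = 50          # sample size

p_hat = successes / n                    # point estimate of p
se = math.sqrt(p_hat * (1 - p_hat) / n)  # approximate standard error of p_hat

z = 1.96  # approximate 95% multiplier from the standard normal distribution
lower = p_hat - z * se
upper = p_hat + z * se

print(f"p_hat = {p_hat:.2f}")
print(f"approximate 95% CI for p: ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the interval runs from roughly 0.60 to 0.84, so it lies entirely above 0.5. That is consistent with a majority, but the conclusion is only as sound as the assumption of random selection.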
KEY TAKEAWAY: Statistical inference bridges the gap between sample data and population conclusions. The quality of inference depends on sample size and the randomness of selection.
VCAA FOCUS: Unit 4 Data Analysis focuses on confidence intervals and hypothesis testing. Both require understanding the sampling distribution concept introduced here.