| Term | Definition |
|---|---|
| Population | The entire group of individuals or objects being studied |
| Sample | A subset of the population selected for study |
| Census | Data collected from every member of the population |
| Sample survey | Data collected from a selected subset (sample) of the population |
| Parameter | A numerical measure describing the population (e.g. population mean $\mu$) |
| Statistic | A numerical measure calculated from a sample (e.g. sample mean $\bar{x}$) |

| Feature | Census | Sample Survey |
|---|---|---|
| Who is measured? | Everyone in the population | A selected subset |
| Accuracy | No sampling error (non-sampling errors such as non-response can still occur) | Approximate (subject to sampling error) |
| Cost | Expensive, time-consuming | Cheaper, faster |
| Practicality | Often impractical for very large populations; impossible for infinite ones | Feasible for most situations |
| Example | ABS national census (every 5 years) | Opinion poll of 1000 voters |

Sampling is preferred when:
- The population is very large (e.g. all Australians)
- Testing is destructive (e.g. testing the lifetime of light bulbs)
- Data is needed quickly
- Cost constraints apply

KEY TAKEAWAY: A well-chosen sample can give reliable estimates of population parameters at a fraction of the cost of a census.

| Method | Description | Advantage |
|---|---|---|
| Simple random sampling | Every member of the population has an equal chance of selection | Avoids selection bias |
| Systematic sampling | Select every $k$th member from a list, starting from a randomly chosen point | Easy to implement |
| Stratified sampling | Divide the population into subgroups (strata), then randomly sample from each | Ensures every subgroup is represented |
| Cluster sampling | Randomly select entire groups (clusters) and survey every member of them | Practical for geographically spread populations |
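
The four methods in the table can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the school of 800 student IDs, the four year levels, the 32 classes, and all function names are hypothetical and chosen only to make the differences between the methods visible.

```python
import random

def simple_random_sample(population, n, rng):
    """Choose n members so that every group of n is equally likely."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random start, then take every k-th member, where k = N // n."""
    k = len(population) // n
    start = rng.randrange(k)
    return population[start::k][:n]

def stratified_sample(strata, fraction, rng):
    """Randomly sample the same fraction from every subgroup (stratum)."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole groups and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical school (assumption): 800 student IDs, split evenly into
# 4 year levels of 200 and also into 32 classes of 25.
rng = random.Random(1)
students = list(range(800))
year_levels = {f"Year {9 + i}": students[i * 200:(i + 1) * 200] for i in range(4)}
classes = {f"Class {c:02d}": students[c * 25:(c + 1) * 25] for c in range(32)}

srs = simple_random_sample(students, 80, rng)
systematic = systematic_sample(students, 80, rng)
stratified = stratified_sample(year_levels, 0.10, rng)  # 20 per year level
cluster = cluster_sample(classes, 4, rng)               # 4 classes x 25 students
```

Note that the stratified sample is guaranteed to contain 20 students from each year level, while the simple random sample only achieves that balance on average. The cluster sample is the easiest to collect (four classrooms) but depends heavily on which classes happen to be chosen.
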

Problem: A school of 800 students wants to survey its students about their study habits. It randomly selects 80 students.
- Population: All 800 students at the school
- Sample: The 80 selected students
- Type of study: Sample survey
- Sampling fraction: $\frac{80}{800} = 10\%$
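
The sampling fraction above follows a general pattern. As a sketch, writing $n$ for the sample size and $N$ for the population size (symbols introduced here for illustration only):

$$\text{sampling fraction} = \frac{n}{N} = \frac{80}{800} = 0.10 = 10\%$$
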

When we calculate a mean from sample data, we write $\bar{x}$ (x-bar). When describing the true population mean, we use $\mu$ (mu). A statistic is used to estimate the corresponding parameter, so $\bar{x}$ serves as an estimate of $\mu$.

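
A short simulation can make the distinction concrete. The sketch below assumes a made-up population of weekly study hours for 800 students (the data, the seed, and the variable names are all hypothetical). It computes $\mu$ once for the whole population, then computes $\bar{x}$ for several different random samples of 80.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population (assumption): weekly study hours for 800 students.
population = [rng.uniform(0, 20) for _ in range(800)]

# Parameter: a single fixed value that describes the whole population.
mu = statistics.mean(population)

# Statistic: recomputed for each sample, so it changes from sample to sample.
x_bars = [statistics.mean(rng.sample(population, 80)) for _ in range(5)]

print(f"mu = {mu:.2f}")
for i, x_bar in enumerate(x_bars, start=1):
    print(f"sample {i}: x-bar = {x_bar:.2f}")
```

Each run of the loop gives a slightly different $\bar{x}$, while $\mu$ stays the same. In practice $\mu$ is usually unknown, which is why a value of $\bar{x}$ is used to estimate it.
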
EXAM TIP: Know the difference between a parameter (describes the population) and a statistic (describes the sample). VCAA often asks you to identify which is which in context.

COMMON MISTAKE: Assuming a larger sample is always better. A larger biased sample can be worse than a smaller representative sample. The method of selection matters most.
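
The effect of bias can be illustrated with a small simulation. This is a sketch under stated assumptions: the study-hours data are invented, and the "biased" sample is modelled by drawing only from the half of the school that studies the most (for example, surveying only students found in the library). All names and numbers are hypothetical.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population (assumption): weekly study hours for 800 students.
population = [max(0.0, rng.gauss(10, 3)) for _ in range(800)]
mu = statistics.mean(population)

# Large but biased sample: 300 students, all drawn from the half of the
# school that studies more than the median number of hours.
cutoff = statistics.median(population)
keen_students = [hours for hours in population if hours > cutoff]
biased_sample = rng.sample(keen_students, 300)

# Small but representative sample: 40 students by simple random sampling.
random_sample = rng.sample(population, 40)

biased_error = abs(statistics.mean(biased_sample) - mu)
random_error = abs(statistics.mean(random_sample) - mu)

print(f"mu = {mu:.2f}")
print(f"biased sample (n = 300): error = {biased_error:.2f}")
print(f"random sample (n = 40):  error = {random_error:.2f}")
```

Under these assumptions the sample of 300 misses $\mu$ by more than the sample of 40, because every extra student drawn from the biased group repeats the same distortion. Increasing the sample size reduces random variation only; it does not correct a flawed selection method.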