Statistical Inference

Statistical inference uses data from a sample to draw conclusions about a population. Because a sample shows only part of the population, those conclusions are uncertain, and inference uses probability to quantify that uncertainty.

Key Concepts

Term	Definition
Population	The entire group under study
Sample	A subset of the population
Parameter	A numerical summary of the population (e.g., $\mu$, $p$)
Statistic	A numerical summary of the sample (e.g., $\bar{x}$, $\hat{p}$)
Estimator	A rule/formula for computing an estimate from data
Estimate	The specific value produced by applying the estimator to a sample

Sampling Distribution

The sampling distribution of a statistic is the probability distribution of that statistic over all possible samples of a given size $n$.

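Central Limit Theorem (CLT): For independent observations from a population with mean $\mu$ and finite variance $\sigma^2$, the sample mean $\bar{X}$ is approximately normally distributed for large $n$, regardless of the shape of the population distribution: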
$$\bar{X} \sim N\!\left(\mu,\, \frac{\sigma^2}{n}\right) \quad \text{approximately, for large } n$$

The standard error of $\bar{X}$ is $\text{SE} = \dfrac{\sigma}{\sqrt{n}}$.
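
The CLT and the standard error formula can be checked by simulation. The sketch below is illustrative only: it assumes an exponential population with $\mu = 1$ and $\sigma = 1$ (a skewed, clearly non-normal choice made for this example), and the function name `simulate_sample_means` is hypothetical. It draws many samples of size $n$ and compares the spread of the sample means with $\sigma/\sqrt{n}$.

```python
import math
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from an exponential population
    (assumed for illustration: mean 1, standard deviation 1) and
    return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

n = 50
means = simulate_sample_means(n, reps=5000)

observed_se = statistics.stdev(means)   # spread of the simulated sample means
theoretical_se = 1.0 / math.sqrt(n)     # sigma / sqrt(n) with sigma = 1

print(f"mean of sample means: {statistics.fmean(means):.3f}")
print(f"observed SE:          {observed_se:.3f}")
print(f"theoretical SE:       {theoretical_se:.3f}")
```

With these settings the observed and theoretical standard errors should agree closely, and a histogram of `means` should look roughly bell-shaped even though the population is skewed.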

Confidence Intervals

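A confidence interval (CI) is a range of values computed from the sample by a procedure that captures the true parameter in a stated proportion of repeated samples. That proportion is the confidence level.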

For a population mean with known $\sigma$:
$$\bar{x} \pm z^* \cdot \frac{\sigma}{\sqrt{n}}$$

Confidence level	$z^*$
90%	1.645
95%	1.960
99%	2.576
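
The critical values in the table come from the standard normal distribution, so they can be reproduced rather than memorised. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`. The helper names `z_star` and `mean_ci` and the numbers in the usage lines ($\bar{x} = 5.5$, $\sigma = 2$, $n = 36$) are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* leaving (1 - confidence)/2 in each tail
    of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def mean_ci(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a population mean with known sigma:
    x_bar +/- z* * sigma / sqrt(n)."""
    margin = z_star(confidence) * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Reproduce the table of critical values.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.3f}")

# Hypothetical sample summary, for illustration only.
low, high = mean_ci(x_bar=5.5, sigma=2.0, n=36, confidence=0.95)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Raising the confidence level increases $z^*$ and therefore widens the interval; increasing $n$ narrows it.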

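Interpretation (95% level): If this procedure is repeated on many independent samples, approximately 95% of the CIs constructed will contain the true $\mu$. It does NOT mean there is a 95% probability that $\mu$ is in this specific interval: $\mu$ is a fixed value, so a given interval either contains it or does not. The randomness lies in the interval, not in $\mu$.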
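
The repeated-sampling interpretation can be illustrated with a small simulation. This sketch assumes a normal population with $\mu = 10$ and $\sigma = 2$ and samples of size $n = 25$; these values and the function name `coverage` are hypothetical, picked only to make the example concrete. It builds a 95% interval from each simulated sample and counts how often the interval contains $\mu$.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, reps, confidence=0.95, seed=1):
    """Fraction of simulated known-sigma intervals that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)   # same margin for every sample
    hits = 0
    for _ in range(reps):
        x_bar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps

rate = coverage(mu=10.0, sigma=2.0, n=25, reps=4000)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

The printed fraction should land close to 0.95. Each individual interval either contains $\mu$ or misses it; the 95% describes the long-run success rate of the procedure.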

Hypothesis Testing

A hypothesis test assesses evidence against a null hypothesis $H_0$.

$H_0$: the null hypothesis (“no effect”, default position)
$H_1$: the alternative hypothesis
Significance level $\alpha$: threshold for rejecting $H_0$ (commonly 0.05)
p-value: probability of observing a result at least as extreme as the data, assuming $H_0$ is true

Decision rule: Reject $H_0$ if $\text{p-value} < \alpha$; otherwise, fail to reject $H_0$. Failing to reject is not the same as showing that $H_0$ is true.

Test statistic for $H_0: \mu = \mu_0$ (known $\sigma$):
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
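
The test statistic and decision rule can be sketched in a few lines. The example below assumes a two-sided alternative $H_1: \mu \neq \mu_0$, which these notes do not specify, so the p-value counts both tails. The function name `z_test` and the sample numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test(x_bar, mu_0, sigma, n):
    """One-sample z test with known sigma.
    Returns the z statistic and a two-sided p-value
    (two-sided alternative assumed for this sketch)."""
    z = (x_bar - mu_0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical numbers, for illustration only.
alpha = 0.05
z, p_value = z_test(x_bar=5.5, mu_0=5.0, sigma=2.0, n=36)

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Here $z = 1.5$ and the p-value is about 0.134, which is larger than $\alpha = 0.05$, so the sketch fails to reject $H_0$. For a one-sided alternative, the p-value would use a single tail instead.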

KEY TAKEAWAY: Inference moves from sample data to population conclusions. A confidence interval estimates a parameter; a hypothesis test assesses whether data is consistent with a specific claim.

EXAM TIP: When interpreting a confidence interval, always refer to the specific context: “We are 95% confident that the true mean wait time is between 4.2 and 6.8 minutes.”

COMMON MISTAKE: Interpreting the p-value as the probability that $H_0$ is true. The p-value is a probability about the data given $H_0$, not about $H_0$ given the data.
