Statistical Inference - StudyPulse

Specialist Mathematics
12 May 2026

Statistical Inference

Statistical inference involves using data from a sample to make conclusions or predictions about an entire population. In Specialist Mathematics, this focuses on the distribution of the sample mean, the construction of confidence intervals, and the formal process of hypothesis testing.


1. Linear Combinations of Random Variables

Before analyzing sample means, it is essential to understand how multiple random variables interact.

If $X$ and $Y$ are random variables, and $a$ and $b$ are constants:

  • Expected Value: $E(aX + bY) = aE(X) + bE(Y)$ (holds whether or not $X$ and $Y$ are independent)
  • Variance: $Var(aX + bY) = a^2Var(X) + b^2Var(Y)$ (requires $X$ and $Y$ to be independent)

For the sum of $n$ independent observations $X_1, X_2, \dots, X_n$ of the same random variable $X$, where $E(X) = \mu$ and $Var(X) = \sigma^2$:

  • $E(X_1 + X_2 + \dots + X_n) = n\mu$
  • $Var(X_1 + X_2 + \dots + X_n) = n\sigma^2$
  • $SD(X_1 + X_2 + \dots + X_n) = \sigma\sqrt{n}$

COMMON MISTAKE: Do not confuse $Var(2X)$ with $Var(X_1 + X_2)$.
$Var(2X) = 2^2Var(X) = 4\sigma^2$, whereas $Var(X_1 + X_2) = \sigma^2 + \sigma^2 = 2\sigma^2$. The latter represents the sum of two independent trials, which has less variability than simply doubling a single result.
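
The difference can be checked exactly on a small example. The sketch below assumes $X$ is the score on a fair six-sided die (an illustrative choice, not part of the notes above) and enumerates every equally likely outcome using exact fractions.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # assumption: X is a fair six-sided die

def variance(values):
    """Variance of a list of equally likely outcomes, as an exact fraction."""
    values = list(values)
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / n

var_x = variance(faces)                                          # Var(X)
var_double = variance(2 * x for x in faces)                      # Var(2X)
var_sum = variance(a + b for a, b in product(faces, repeat=2))   # Var(X1 + X2)

print(var_x, var_double, var_sum)  # prints: 35/12 35/3 35/6
```

Here $Var(2X) = 4 \times \frac{35}{12}$ while $Var(X_1 + X_2) = 2 \times \frac{35}{12}$, matching the two rules.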


2. The Sampling Distribution of the Sample Mean

The sample mean, denoted by $\bar{X}$, is a random variable formed by taking the average of $n$ independent observations:
$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$$

Properties of the Sample Mean

  • Expected Value: $E(\bar{X}) = \mu$ (The mean of the sample means is the population mean).
  • Variance: $Var(\bar{X}) = \frac{\sigma^2}{n}$
  • Standard Deviation (Standard Error): $SD(\bar{X}) = \frac{\sigma}{\sqrt{n}}$

The Central Limit Theorem (CLT)

The Central Limit Theorem states that for a sufficiently large sample size (usually $n \ge 30$), the sampling distribution of the sample mean $\bar{X}$ will be approximately normally distributed, regardless of the shape of the original population distribution.

| Condition | Distribution of $\bar{X}$ |
|---|---|
| Population is Normal | $\bar{X}$ is exactly Normal: $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ |
| Population is NOT Normal ($n \ge 30$) | $\bar{X}$ is approximately Normal: $\bar{X} \approx N\left(\mu, \frac{\sigma^2}{n}\right)$ |

KEY TAKEAWAY: As the sample size $n$ increases, the standard deviation of the sample mean decreases (the distribution becomes narrower), meaning the sample mean becomes a more reliable estimate of the population mean.
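
To see this narrowing numerically, here is a minimal sketch using Python's standard-library `statistics.NormalDist`. The population values ($\mu = 50$, $\sigma = 12$) and the helper name `sample_mean_dist` are hypothetical, chosen only for illustration, and the normal model for $\bar{X}$ is assumed to apply.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 50.0, 12.0  # hypothetical population mean and standard deviation

def sample_mean_dist(mu, sigma, n):
    """Normal model for the mean of n independent observations."""
    return NormalDist(mu, sigma / sqrt(n))

for n in (9, 36, 144):
    xbar = sample_mean_dist(mu, sigma, n)
    # Probability that the sample mean lands within 2 units of mu
    prob = xbar.cdf(52) - xbar.cdf(48)
    print(n, xbar.stdev, round(prob, 4))
```

The standard error falls from 4 to 2 to 1 as $n$ goes from 9 to 36 to 144, so the probability that $\bar{X}$ lies within 2 units of $\mu$ rises each time.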


3. Confidence Intervals for the Population Mean

A confidence interval (CI) provides a range of plausible values for the population mean $\mu$, based on a sample mean $\bar{x}$.

Formula for the $C\%$ Confidence Interval

For a population with a known standard deviation $\sigma$:
$$\left( \bar{x} - z\frac{\sigma}{\sqrt{n}}, \bar{x} + z\frac{\sigma}{\sqrt{n}} \right)$$

Where $z$ is the critical value for the desired level of confidence.

Common Critical Values ($z$)

| Confidence Level | $z$ value (approx.) |
|---|---|
| 90% | 1.645 |
| 95% | 1.96 |
| 99% | 2.576 |
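
The critical values and the interval formula can be reproduced with a short sketch. The sample figures ($\bar{x} = 172$, $\sigma = 8$, $n = 64$) and the function names are hypothetical; `NormalDist().inv_cdf` from the standard library supplies the inverse normal.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value, e.g. 0.95 gives about 1.96."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

def confidence_interval(xbar, sigma, n, confidence=0.95):
    """Interval xbar +/- z * sigma / sqrt(n) for a known sigma."""
    margin = z_critical(confidence) * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical sample: mean 172, known sigma 8, n = 64
low, high = confidence_interval(xbar=172.0, sigma=8.0, n=64, confidence=0.95)
print(round(low, 2), round(high, 2))  # prints: 170.04 173.96
```

Computing $z$ from the confidence level rather than hard-coding it lets the same function handle levels that are not in the table.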

Margin of Error and Sample Size

The Margin of Error ($M$) is half the width of the confidence interval:
$$M = z\frac{\sigma}{\sqrt{n}}$$

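To find the required sample size $n$ for a specific margin of error, rearrange the formula for $M$:
$$n = \left( \frac{z\sigma}{M} \right)^2$$
Always round $n$ up to the next whole number, since rounding down would give a margin of error slightly larger than $M$.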
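
As a quick illustration, the sketch below applies this formula to hypothetical values ($z = 1.96$, $\sigma = 15$, $M = 2$). The function name is an assumption, and the result is rounded up so that the margin of error does not exceed $M$.

```python
from math import ceil

def required_sample_size(z, sigma, margin):
    """Smallest whole n with z * sigma / sqrt(n) <= margin."""
    return ceil((z * sigma / margin) ** 2)

# (1.96 * 15 / 2)^2 = 216.09, so 217 observations are needed
print(required_sample_size(z=1.96, sigma=15, margin=2))  # prints: 217
```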

EXAM TIP: If the population standard deviation $\sigma$ is unknown and the sample size is large ($n \ge 30$), you can use the sample standard deviation $s$ as an estimate for $\sigma$.


4. Hypothesis Testing for the Mean

Hypothesis testing is a formal procedure used to determine whether there is enough evidence in a sample to support a particular claim about the population mean $\mu$.

The Steps of a Hypothesis Test

  1. State the Hypotheses:
    • Null Hypothesis ($H_0$): The status quo; assumes no change (e.g., $H_0: \mu = \mu_0$).
    • Alternative Hypothesis ($H_1$): What we are testing for (e.g., $H_1: \mu \neq \mu_0$, $H_1: \mu > \mu_0$, or $H_1: \mu < \mu_0$).
  2. Identify the Distribution: Assume $H_0$ is true, so $\bar{X} \sim N\left(\mu_0, \frac{\sigma^2}{n}\right)$.
  3. Calculate the Test Statistic ($z^*$):
    $$z^* = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
  4. Determine the p-value: The probability of obtaining a sample mean at least as extreme as the observed value, given $H_0$ is true.
  5. Conclusion: Compare the $p$-value to the significance level ($\alpha$), usually 0.05.
    • If $p < \alpha$: Reject $H_0$ (Significant evidence).
    • If $p \ge \alpha$: Do not reject $H_0$ (Insufficient evidence).

One-tailed vs Two-tailed Tests

  • One-tailed ($H_1: \mu > \mu_0$ or $H_1: \mu < \mu_0$): $p = \text{Pr}(Z > z^*)$ or $p = \text{Pr}(Z < z^*)$, respectively.
  • Two-tailed ($H_1: \mu \neq \mu_0$): $p = 2 \times \text{Pr}(Z > |z^*|)$.
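
The test statistic and $p$-value calculations above can be sketched as follows. The function name `z_test`, its `alternative` labels, and the sample figures ($\bar{x} = 503$, $\mu_0 = 500$, $\sigma = 10$, $n = 36$) are hypothetical, and $\sigma$ is assumed known.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alternative="two-sided"):
    """Return (z*, p-value) for a z-test of the mean with known sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "greater":        # H1: mu > mu0
        p = 1 - std_normal.cdf(z)
    elif alternative == "less":         # H1: mu < mu0
        p = std_normal.cdf(z)
    elif alternative == "two-sided":    # H1: mu != mu0
        p = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p

z, p = z_test(xbar=503.0, mu0=500.0, sigma=10.0, n=36, alternative="greater")
print(round(z, 2), round(p, 4))
print("Reject H0" if p < 0.05 else "Do not reject H0")
```

With these figures $z^* = 1.8$ and the one-tailed $p$-value is about 0.036, so $H_0$ is rejected at the 5% level. The two-tailed $p$-value for the same data is about 0.072, which would not lead to rejection, so the choice of $H_1$ matters.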

VCAA FOCUS: Always state your conclusion in the context of the original problem. Don’t just say “Reject $H_0$”; say “There is sufficient evidence at the 5% level to suggest that the mean weight of the product has increased.”


5. Errors in Hypothesis Testing

Because we use sample data to make inferences, there is always a chance of reaching the wrong conclusion.

| Decision | $H_0$ is actually True | $H_0$ is actually False |
|---|---|---|
| Reject $H_0$ | Type I Error | Correct Decision |
| Do not reject $H_0$ | Correct Decision | Type II Error |

  • Type I Error: Rejecting $H_0$ when it is actually true.
    • The probability of a Type I error is equal to the significance level $\alpha$.
  • Type II Error: Failing to reject $H_0$ when it is actually false.
    • To calculate the probability of a Type II error ($\beta$), you must be given a specific alternative value for $\mu$.

REMEMBER: To reduce the probability of both types of errors simultaneously, you must increase the sample size $n$. Decreasing $\alpha$ (the risk of a Type I error) will typically increase $\beta$ (the risk of a Type II error) if the sample size remains constant.
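
A minimal sketch of a Type II error calculation, assuming a one-tailed test $H_0: \mu = 500$ against $H_1: \mu > 500$ with known $\sigma = 10$ and a specific alternative mean of 504. All of these numbers, and the function name `type_ii_error`, are hypothetical. $H_0$ is not rejected when $\bar{x}$ falls below the critical value $c = \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}}$, so $\beta = \text{Pr}(\bar{X} < c)$ evaluated under the alternative mean.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu_alt, sigma, n, alpha=0.05):
    """Beta for the one-tailed test H1: mu > mu0, given true mean mu_alt."""
    se = sigma / sqrt(n)
    # Critical sample mean: reject H0 when xbar exceeds c
    c = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Probability of NOT rejecting H0 when the true mean is mu_alt
    return NormalDist(mu_alt, se).cdf(c)

beta_n36 = type_ii_error(mu0=500, mu_alt=504, sigma=10, n=36)
beta_n100 = type_ii_error(mu0=500, mu_alt=504, sigma=10, n=100)
beta_strict = type_ii_error(mu0=500, mu_alt=504, sigma=10, n=36, alpha=0.01)

print(round(beta_n36, 3), round(beta_n100, 3), round(beta_strict, 3))
```

Under these assumptions $\beta$ is roughly 0.23 for $n = 36$, drops below 0.01 for $n = 100$, and rises to roughly 0.47 when $\alpha$ is tightened to 0.01 with $n = 36$, which illustrates both statements above.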
