Once you have the regression equation $\hat{y} = a + bx$, substitute a value of $x$ to predict $y$:
$$\hat{y} = a + b \times x_{\text{new}}$$
Example: Given the equation $\widehat{\text{score}} = 31.5 + 9.2 \times \text{hours}$, the predicted score for 5 hours of study is
$\hat{y} = 31.5 + 9.2(5) = 31.5 + 46 = 77.5$ marks.
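The substitution step can also be checked with a few lines of Python. This is a minimal sketch: the function name `predict` and its parameter names are illustrative choices, not part of the course material; only the coefficients 31.5 and 9.2 come from the example.

```python
def predict(a: float, b: float, x_new: float) -> float:
    """Return the predicted response y-hat = a + b * x_new."""
    return a + b * x_new


# Coefficients from the worked example: score-hat = 31.5 + 9.2 * hours
predicted_score = predict(a=31.5, b=9.2, x_new=5)
print(f"Predicted score for 5 hours: {predicted_score:.1f} marks")
```

The same function works for any least squares line once the intercept `a` and slope `b` are known.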
| | Interpolation | Extrapolation |
|---|---|---|
| Definition | Predicting within the range of the observed data | Predicting outside the range of the observed data |
| Reliability | Generally reliable | Potentially unreliable |
| Example | Data ranges 1–8 hours; predict for 4 hours | Data ranges 1–8 hours; predict for 15 hours |
The linear relationship observed within the data range may not continue beyond it. The true relationship may:
- Level off (reach a maximum/minimum)
- Change direction
- Follow a curve
Example: Predicting the exam score for 20 hours of study using $\hat{y} = 31.5 + 9.2x$ gives $\hat{y} = 31.5 + 9.2(20) = 215.5$, which is impossible for a test out of 100. This shows the danger of extrapolation.
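A quick sanity check makes the same point numerically. The sketch below assumes the test is marked out of 100 (as stated in the example) and uses hypothetical helper names `predicted_score` and `is_plausible`. Note that passing this check does not make a prediction reliable; it only catches values that cannot occur.

```python
MAX_MARK = 100  # assumption: the test is marked out of 100, as in the example


def predicted_score(hours: float) -> float:
    """Apply the example regression line: score-hat = 31.5 + 9.2 * hours."""
    return 31.5 + 9.2 * hours


def is_plausible(score: float, low: float = 0, high: float = MAX_MARK) -> bool:
    """Return True if the predicted score lies within the possible mark range."""
    return low <= score <= high


for hours in (5, 20):
    score = predicted_score(hours)
    print(f"{hours} hours -> {score:.1f} marks, plausible: {is_plausible(score)}")
```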
Context: A study finds $\widehat{\text{fuel used}} = 2.3 + 0.08 \times \text{distance}$, with fuel used in litres and distance in km, based on data for distances from 10 km to 200 km.
| Prediction | Type | Reliable? |
|---|---|---|
| Distance = 50 km → 6.3 L | Interpolation | Yes |
| Distance = 150 km → 14.3 L | Interpolation | Yes |
| Distance = 500 km → 42.3 L | Extrapolation | Questionable |
| Distance = 5 km → 2.7 L | Extrapolation | Questionable |
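The classification in the table can be reproduced with a short Python sketch. The function names `classify_prediction` and `predict_fuel` are illustrative, and treating the endpoints of the data range (10 km and 200 km) as interpolation is an assumption made here for simplicity.

```python
def classify_prediction(x_new: float, x_min: float, x_max: float) -> str:
    """Label a prediction by whether x_new lies inside the observed data range.

    Assumption: the endpoints x_min and x_max count as interpolation.
    """
    return "interpolation" if x_min <= x_new <= x_max else "extrapolation"


def predict_fuel(distance_km: float) -> float:
    """Apply the example line: fuel-used-hat = 2.3 + 0.08 * distance."""
    return 2.3 + 0.08 * distance_km


X_MIN, X_MAX = 10, 200  # observed distance range in km, from the context

for distance in (50, 150, 500, 5):
    fuel = predict_fuel(distance)
    kind = classify_prediction(distance, X_MIN, X_MAX)
    print(f"{distance} km -> {fuel:.1f} L ({kind})")
```

Stating the data range first and then comparing the new $x$-value against it mirrors the order expected in a written answer.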
KEY TAKEAWAY: Interpolation (within the data range) is generally reliable. Extrapolation (outside the data range) may be unreliable, because the linear trend may not continue beyond the observed data. Always identify which type of prediction you are making.
EXAM TIP: VCAA often gives a prediction scenario and asks whether it is interpolation or extrapolation, and to comment on its reliability. Always state the data range and whether the x-value falls inside or outside it.
VCAA FOCUS: The word “limitations” in the key knowledge (KK) specifically invites discussion of extrapolation, weak association, non-linearity, and causation. Cover all relevant limitations in your answer.