An outlier is a data value that is unusually far from the rest of the data.
Formal definition (using fences, where $\text{IQR} = Q_3 - Q_1$):
$$\text{Value is an outlier if } x < Q_1 - 1.5 \times \text{IQR} \quad \text{or} \quad x > Q_3 + 1.5 \times \text{IQR}$$
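The fence rule can be checked with a few lines of Python. The sketch below is illustrative only: the helper names `quartiles` and `find_outliers` are made up for this example, and it assumes quartiles are found as the medians of the lower and upper halves (leaving out the overall median when the number of values is odd). A calculator or other software may use a slightly different quartile method.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd number of values, the overall median
    is left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[-half:])

def find_outliers(values):
    """Return the values that lie outside the 1.5 x IQR fences."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]

data = [10, 12, 13, 14, 15, 16, 17, 52]
print(find_outliers(data))  # [52]
```

For these eight values, $Q_1 = 12.5$ and $Q_3 = 16.5$, so the fences are at 6.5 and 22.5, and only 52 lies outside them.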
| Measure of centre | Effect of outlier | Resistant? |
|---|---|---|
| Mean | Pulled strongly toward the outlier | No (sensitive) |
| Median | Changes little or not at all (the middle position shifts by only half a place) | Yes (resistant) |
| Mode | Usually unaffected (an outlier is rarely a repeated value) | Yes (resistant) |
Original data: 10, 12, 13, 14, 15, 16, 17 (mean ≈ 13.86, median = 14)
After adding outlier 52: 10, 12, 13, 14, 15, 16, 17, 52 (mean ≈ 18.63, median = 14.5)
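As a quick check of this example, the following sketch uses Python's standard `statistics` module to compare the mean and median before and after the value 52 is added. The variable names are chosen only for illustration.

```python
from statistics import mean, median

original = [10, 12, 13, 14, 15, 16, 17]
with_outlier = original + [52]

# Compare both measures of centre for each data set.
for label, data in [("Original", original), ("With outlier", with_outlier)]:
    print(f"{label}: mean = {mean(data):.3f}, median = {median(data)}")
```

The mean rises from 97/7 ≈ 13.857 to 149/8 = 18.625, a jump of almost 5, while the median moves only from 14 to 14.5.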
KEY TAKEAWAY: An outlier drags the mean toward it but has minimal effect on the median. This is why the median is the preferred measure of centre for skewed data and for data with outliers.
| Measure of spread | Effect of outlier | Resistant? |
|---|---|---|
| Range | Greatly increased | No (very sensitive) |
| IQR | Little or no change (it depends only on the middle 50% of the data) | Yes (resistant) |
| Standard deviation | Greatly increased (since $x - \bar{x}$ is huge for the outlier) | No (sensitive) |
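The same comparison can be made for the measures of spread. This is a rough sketch under two assumptions: the IQR uses the median-of-halves quartile method (the `iqr` helper is hypothetical, written for this example), and the standard deviation is the sample version (dividing by $n - 1$), which is what `statistics.stdev` computes.

```python
from statistics import median, stdev

def iqr(values):
    """IQR = Q3 - Q1, with quartiles taken as medians of the two halves."""
    data = sorted(values)
    half = len(data) // 2  # the overall median is excluded when n is odd
    return median(data[-half:]) - median(data[:half])

original = [10, 12, 13, 14, 15, 16, 17]
with_outlier = original + [52]

for label, data in [("Original", original), ("With outlier", with_outlier)]:
    data_range = max(data) - min(data)
    print(f"{label}: range = {data_range}, IQR = {iqr(data)}, SD = {stdev(data):.2f}")
```

On these values the range grows from 7 to 42 and the sample standard deviation from about 2.41 to about 13.67, while the IQR stays at 4.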
EXAM TIP: VCAA commonly asks: “What effect does the outlier have on the mean and median?” Always state the direction (increases/decreases) and explain why. Then identify which statistic is more appropriate.
VCAA FOCUS: Never just say “remove the outlier” without justification. In a VCAA context, you must investigate and explain, because outliers often represent real and important data points (e.g. one household with a very high income in a household income survey).