The five-number summary describes a dataset using five key values:
| Value | Symbol | Meaning |
|---|---|---|
| Minimum | Min | Smallest value in the data (on a boxplot, the left whisker ends at the smallest non-outlier) |
| Lower quartile | $Q_1$ | 25th percentile |
| Median | $M$ or $Q_2$ | 50th percentile |
| Upper quartile | $Q_3$ | 75th percentile |
| Maximum | Max | Largest value in the data (on a boxplot, the right whisker ends at the largest non-outlier) |
Before drawing a boxplot, test for outliers using the fence method:
$$\text{Lower fence} = Q_1 - 1.5 \times \text{IQR}$$
$$\text{Upper fence} = Q_3 + 1.5 \times \text{IQR}$$
Any value below the lower fence or above the upper fence is an outlier and is plotted as a separate point (×) on the boxplot.
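The fence formulas above can be sketched in a few lines of Python. The function names `fences` and `find_outliers` are made up for illustration, and the quartiles and values in the usage lines are illustrative only, not taken from a real dataset.

```python
def fences(q1, q3):
    """Return (lower_fence, upper_fence) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(data, q1, q3):
    """Return the values that lie strictly outside the fences."""
    lower, upper = fences(q1, q3)
    return [x for x in data if x < lower or x > upper]


# Illustrative quartiles and values (hypothetical, for demonstration only).
print(fences(10, 20))                          # (-5.0, 35.0)
print(find_outliers([2, 12, 18, 40], 10, 20))  # [40]
```

A value that sits exactly on a fence is not counted as an outlier here, which matches the "outside these fences" wording.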
A boxplot (box-and-whisker plot) is drawn on a number line:
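[Diagram: a box drawn above a number line from Q1 to Q3 with a vertical line at the median, a whisker extending left to Min and right to Max, and outliers marked as separate points (*) beyond the whiskers.]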
Data: 4, 7, 8, 9, 11, 12, 13, 15, 16, 25
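Step 1: Sort the data and find the five-number summary
- The data is already in order. With 10 values, the median is the mean of the 5th and 6th values: $(11 + 12) \div 2 = 11.5$
- $Q_1$ is the median of the lower half (4, 7, 8, 9, 11) and $Q_3$ is the median of the upper half (12, 13, 15, 16, 25)
- Min = 4, $Q_1$ = 8, Median = 11.5, $Q_3$ = 15, Max = 25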
Step 2: Calculate IQR and fences
- IQR = $15 - 8 = 7$
- Lower fence = $8 - 1.5(7) = 8 - 10.5 = -2.5$
- Upper fence = $15 + 1.5(7) = 15 + 10.5 = 25.5$
Step 3: Check for outliers
- All values lie within $[-2.5, 25.5]$, so there are no outliers. The largest value, 25, is just below the upper fence of 25.5, so it is not an outlier.
Boxplot description: Box from 8 to 15, median line at 11.5, left whisker to 4, right whisker to 25.
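The worked example can be checked with a short Python sketch. It assumes quartiles are found as the medians of the lower and upper halves of the sorted data (leaving out the middle value when the number of values is odd); calculators and software sometimes use other quartile conventions and may give slightly different answers. The function name `five_number_summary` is made up for illustration.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    # Assumption: for an odd count, the middle value belongs to neither half.
    upper = values[half:] if n % 2 == 0 else values[half + 1:]
    return values[0], median(lower), median(values), median(upper), values[-1]


data = [4, 7, 8, 9, 11, 12, 13, 15, 16, 25]
minimum, q1, med, q3, maximum = five_number_summary(data)

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(minimum, q1, med, q3, maximum)  # 4 8 11.5 15 25
print(lower_fence, upper_fence)       # -2.5 25.5
print(outliers)                       # []
```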
| Feature | What it indicates |
|---|---|
| Median near centre of box | Symmetric distribution |
| Median closer to $Q_1$ | Positively skewed |
| Median closer to $Q_3$ | Negatively skewed |
| Long right whisker | Positive skew / some unusually large values |
| Long left whisker | Negative skew / some unusually small values |
| Outlier points (×) | Unusually extreme values |
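As a rough illustration of the table, the sketch below compares the two halves of the box and the two whiskers using a five-number summary. The function `describe_shape` is hypothetical, and the exact comparisons are a simplification: on a real plot, shape is judged by eye, and small differences should not be read as skew.

```python
def describe_shape(minimum, q1, med, q3, maximum):
    """Give a rough reading of skew from the box halves and the whiskers."""

    def reading(left, right):
        # A longer right side suggests positive skew, a longer left side negative.
        if right > left:
            return "suggests positive skew"
        if left > right:
            return "suggests negative skew"
        return "roughly symmetric"

    return {
        "box": reading(med - q1, q3 - med),
        "whiskers": reading(q1 - minimum, maximum - q3),
    }


# Five-number summary from the worked example: 4, 8, 11.5, 15, 25
print(describe_shape(4, 8, 11.5, 15, 25))
# {'box': 'roughly symmetric', 'whiskers': 'suggests positive skew'}
```

For the worked example the median sits in the centre of the box, but the right whisker (15 to 25) is much longer than the left (4 to 8), so the long upper tail is the feature to mention.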
When two boxplots are drawn on the same scale (side-by-side), compare:
1. Centre (medians) — which group has higher typical values?
2. Spread (IQR, range) — which group is more variable?
3. Shape (skew, symmetry)
4. Outliers — which group has more extreme values?
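A comparison of centre and spread can be organised as in the sketch below, which takes five-number summaries read off each plot. Group A uses the worked example's summary; Group B's values are hypothetical and exist only to show the comparison. The function name `centre_and_spread` is also made up for illustration.

```python
def centre_and_spread(summary):
    """Return the median, IQR and range from (min, Q1, median, Q3, max)."""
    minimum, q1, med, q3, maximum = summary
    return {"median": med, "iqr": q3 - q1, "range": maximum - minimum}


group_a = (4, 8, 11.5, 15, 25)  # worked example
group_b = (2, 5, 7, 10, 14)     # hypothetical values for illustration

for name, summary in (("A", group_a), ("B", group_b)):
    print(name, centre_and_spread(summary))
# A {'median': 11.5, 'iqr': 7, 'range': 21}
# B {'median': 7, 'iqr': 5, 'range': 12}
```

A written response would then quote these numbers: Group A has the higher median (11.5 compared with 7) and is more variable (IQR 7 compared with 5). Shape and outliers still need to be read from the plots themselves.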
KEY TAKEAWAY: The box represents the middle 50% of the data. A wider box means a larger IQR, so more spread. The position of the median line within the box reveals skew.
EXAM TIP: VCAA often asks you to compare two distributions from boxplots. Address centre, spread, AND shape in your response, using actual values from the plot.
COMMON MISTAKE: Drawing whiskers to the min/max regardless of outliers. Always check fences first — whiskers only extend to the most extreme non-outlier value.
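The whisker rule can be sketched in Python as follows. The function name `whisker_ends` is made up for illustration, and the data is the worked example with its largest value changed from 25 to 30 (a hypothetical change, chosen so that an outlier appears; $Q_1 = 8$ and $Q_3 = 15$ are unchanged).

```python
def whisker_ends(data, q1, q3):
    """Return the smallest and largest values that are not outliers."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return min(inside), max(inside)


# Hypothetical data: the worked example with 25 replaced by 30.
data = [4, 7, 8, 9, 11, 12, 13, 15, 16, 30]
print(whisker_ends(data, 8, 15))  # (4, 16)
```

Here 30 lies above the upper fence of 25.5, so it is plotted as a separate point and the right whisker stops at 16, not at 30.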