Some one-dimensional datasets are not linearly separable — no single threshold can separate the two classes. The solution is to create a second feature from the original data, mapping the points into two dimensions where they are linearly separable.
Example:
- Class $+1$: $\{-3, 3\}$ (far from zero)
- Class $-1$: $\{-1, 0, 1\}$ (close to zero)
No single threshold $t$ works: the class $+1$ points lie on both sides of the class $-1$ points, so any threshold leaves at least one point on the wrong side.
Create a second feature $x_2 = x_1^2$ (the square of the original feature), and map each point $(x_1) \rightarrow (x_1, x_1^2)$:
| $x_1$ | Class | $x_2 = x_1^2$ | 2D point |
|---|---|---|---|
| $-3$ | $+1$ | $9$ | $(-3, 9)$ |
| $3$ | $+1$ | $9$ | $(3, 9)$ |
| $-1$ | $-1$ | $1$ | $(-1, 1)$ |
| $0$ | $-1$ | $0$ | $(0, 0)$ |
| $1$ | $-1$ | $1$ | $(1, 1)$ |
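The mapping in the table above can be checked with a short sketch (plain Python, names illustrative): each 1D point is sent to $(x_1, x_1^2)$, and a horizontal line $x_2 = 5$ is tested against every mapped point.

```python
# Map each 1D point to (x1, x1**2) and check that the horizontal line
# x2 = 5 puts every class +1 point above it and every class -1 point below.
points = {-3: +1, 3: +1, -1: -1, 0: -1, 1: -1}  # x1 -> class label

mapped = {x1: (x1, x1 ** 2) for x1 in points}

for x1, (a, x2) in mapped.items():
    side = +1 if x2 > 5 else -1
    assert side == points[x1], f"({a}, {x2}) is on the wrong side"

print("x2 = 5 separates all", len(mapped), "points")
```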
In 2D: class $+1$ points have large $x_2$ (high up), class $-1$ points have small $x_2$ (low down). A horizontal line $x_2 = c$ separates them.
Support vectors (the points in each class closest to the other class): $(-3, 9)$ and $(3, 9)$ from class $+1$, and $(-1, 1)$ and $(1, 1)$ from class $-1$.
The maximum-margin boundary lies midway between the support vectors' $x_2$ values: $x_2 = \frac{9 + 1}{2} = 5$. In the original feature, $x_2 = 5$ means $x_1^2 = 5$, so $|x_1| = \sqrt{5} \approx 2.24$.
Classification rule in original 1D:
- If $|x_1| > \sqrt{5}$: predict $+1$
- If $|x_1| \leq \sqrt{5}$: predict $-1$
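The two-case rule above can be written as a tiny classifier. This is a sketch only; the function name and structure are illustrative, not from the source.

```python
import math

# Recovered 1D classifier: the 2D boundary x2 = 5 becomes the
# threshold |x1| = sqrt(5) in the original feature.
THRESHOLD = math.sqrt(5)

def classify(x1: float) -> int:
    """Predict +1 if |x1| > sqrt(5), else -1."""
    return +1 if abs(x1) > THRESHOLD else -1

# Check against the training data: +1 at {-3, 3}, -1 at {-1, 0, 1}.
assert [classify(x) for x in (-3, 3, -1, 0, 1)] == [1, 1, -1, -1, -1]
```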
This is a non-linear decision boundary in 1D, found using a linear SVM in 2D.
KEY TAKEAWAY: Mapping from 1D to 2D with $x_2 = x_1^2$ transforms a non-linearly separable problem into a linearly separable one. The linear boundary in 2D corresponds to a non-linear boundary in 1D.
This technique generalises to the kernel trick: map features to a higher-dimensional space where linear separation is possible. The decision boundary in the original space may be curved (parabola, circle, etc.).
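One way to see the kernel trick concretely (an illustrative sketch, not part of the VCAA syllabus): for the map $\phi(x) = (x, x^2)$, the dot product of two mapped points equals $ab + (ab)^2$, which can be computed directly from the original 1D values without ever building the 2D features.

```python
# Kernel-trick idea for the map phi(x) = (x, x**2):
# phi(a) . phi(b) = a*b + a**2 * b**2 = a*b + (a*b)**2,
# so the 2D dot product can be computed from the 1D inputs alone.
def phi(x):
    return (x, x * x)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def kernel(a, b):
    # Same value as dot(phi(a), phi(b)), with no explicit mapping.
    return a * b + (a * b) ** 2

for a, b in [(-3, 1), (3, 0), (-1, 3)]:
    assert dot(phi(a), phi(b)) == kernel(a, b)
```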
| Transformation | New feature | When useful |
|---|---|---|
| Quadratic | $x_2 = x_1^2$ | Data symmetric around zero, classes at different distances from 0 |
| Absolute value | $x_2 = \lvert x_1 \rvert$ | Data symmetric around zero, classes at different distances from 0 |
| Exponential | $x_2 = e^{x_1}$ | Exponential growth patterns |
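As a quick check of the table (a sketch under the same example data): the absolute-value transformation also separates the worked example, since class $+1$ has $|x_1| = 3$ while class $-1$ has $|x_1| \leq 1$.

```python
# The absolute-value feature x2 = |x1| also separates the example data:
# class +1 has |x1| = 3, class -1 has |x1| <= 1, so x2 = 2 works as a boundary.
points = {-3: +1, 3: +1, -1: -1, 0: -1, 1: -1}

for x1, label in points.items():
    x2 = abs(x1)
    assert (+1 if x2 > 2 else -1) == label

print("x2 = 2 separates the abs-value mapping")
```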
EXAM TIP (VCAA): given a 1D dataset that is not linearly separable, apply $x_2 = x_1^2$ to create 2D data, find the SVM boundary in 2D (midway between the support vectors), then interpret the rule back in 1D.
COMMON MISTAKE: After finding the decision boundary in 2D, you must transform it back to the original 1D space. If the 2D boundary is $x_2 = 5$, the 1D rule is $x_1^2 = 5$, not $x_1 = 5$.
VCAA FOCUS: Know why $x_2 = x_1^2$ is useful, how to apply the transformation, find the 2D boundary, and interpret it in 1D.