Some one-dimensional datasets are not linearly separable — no single threshold can separate the two classes. The solution is to create a second feature from the original data, mapping the points into two dimensions where they are linearly separable.
Example:
- Class $+1$: $\{-3, 3\}$ (far from zero)
- Class $-1$: $\{-1, 0, 1\}$ (close to zero)
No single threshold $t$ works: the class $+1$ points lie on both sides of the class $-1$ points, so any threshold leaves at least one point on the wrong side.
Create a second feature $x_2 = x_1^2$ (the square of the original feature), and map each point $(x_1) \rightarrow (x_1, x_1^2)$:
| $x_1$ | Class | $x_2 = x_1^2$ | 2D point |
|---|---|---|---|
| $-3$ | $+1$ | $9$ | $(-3, 9)$ |
| $3$ | $+1$ | $9$ | $(3, 9)$ |
| $-1$ | $-1$ | $1$ | $(-1, 1)$ |
| $0$ | $-1$ | $0$ | $(0, 0)$ |
| $1$ | $-1$ | $1$ | $(1, 1)$ |
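The mapping in the table above can be checked with a short sketch (plain Python, names illustrative): each 1D point is sent to $(x_1, x_1^2)$, and a horizontal line $x_2 = 5$ is tested against every mapped point.

```python
# Map each 1D point to (x1, x1**2) and check that the horizontal line
# x2 = 5 puts every class +1 point above it and every class -1 point below.
points = {-3: +1, 3: +1, -1: -1, 0: -1, 1: -1}  # x1 -> class label

mapped = {x1: (x1, x1 ** 2) for x1 in points}

for x1, (a, x2) in mapped.items():
    side = +1 if x2 > 5 else -1
    assert side == points[x1], f"({a}, {x2}) is on the wrong side"

print("x2 = 5 separates all", len(mapped), "points")
```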
In 2D: class $+1$ points have large $x_2$ (high up), class $-1$ points have small $x_2$ (low down). A horizontal line $x_2 = c$ separates them.
Support vectors (the points in each class closest to the other class): $(-3, 9)$ and $(3, 9)$ from class $+1$, and $(-1, 1)$ and $(1, 1)$ from class $-1$.
The maximum-margin boundary lies midway between the support vectors' $x_2$ values: $x_2 = \frac{9 + 1}{2} = 5$. In the original feature, $x_2 = 5$ means $x_1^2 = 5$, so $|x_1| = \sqrt{5} \approx 2.24$.
Classification rule in original 1D:
- If $|x_1| > \sqrt{5}$: predict $+1$
- If $|x_1| \leq \sqrt{5}$: predict $-1$
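The two-case rule above can be written as a tiny classifier. This is a sketch only; the function name and structure are illustrative, not from the source.

```python
import math

# Recovered 1D classifier: the 2D boundary x2 = 5 becomes the
# threshold |x1| = sqrt(5) in the original feature.
THRESHOLD = math.sqrt(5)

def classify(x1: float) -> int:
    """Predict +1 if |x1| > sqrt(5), else -1."""
    return +1 if abs(x1) > THRESHOLD else -1

# Check against the training data: +1 at {-3, 3}, -1 at {-1, 0, 1}.
assert [classify(x) for x in (-3, 3, -1, 0, 1)] == [1, 1, -1, -1, -1]
```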
This is a non-linear decision boundary in 1D, found using a linear SVM in 2D.
KEY TAKEAWAY: Mapping from 1D to 2D with $x_2 = x_1^2$ transforms a non-linearly separable problem into a linearly separable one. The linear boundary in 2D corresponds to a non-linear boundary in 1D.
This technique generalises to the kernel trick: map features to a higher-dimensional space where linear separation is possible. The decision boundary in the original space may be curved (parabola, circle, etc.).
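One way to see the kernel trick concretely (an illustrative sketch, not part of the VCAA syllabus): for the map $\phi(x) = (x, x^2)$, the dot product of two mapped points equals $ab + (ab)^2$, which can be computed directly from the original 1D values without ever building the 2D features.

```python
# Kernel-trick idea for the map phi(x) = (x, x**2):
# phi(a) . phi(b) = a*b + a**2 * b**2 = a*b + (a*b)**2,
# so the 2D dot product can be computed from the 1D inputs alone.
def phi(x):
    return (x, x * x)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def kernel(a, b):
    # Same value as dot(phi(a), phi(b)), with no explicit mapping.
    return a * b + (a * b) ** 2

for a, b in [(-3, 1), (3, 0), (-1, 3)]:
    assert dot(phi(a), phi(b)) == kernel(a, b)
```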
| Transformation | New feature | When useful |
|---|---|---|
| Quadratic | $x_2 = x_1^2$ | Data symmetric around zero, classes at different distances from 0 |
| Absolute value | $x_2 = \lvert x_1 \rvert$ | Data symmetric around zero, classes at different distances from 0 |
| Exponential | $x_2 = e^{x_1}$ | Exponential growth patterns |
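As a quick check of the table (a sketch under the same example data): the absolute-value transformation also separates the worked example, since class $+1$ has $|x_1| = 3$ while class $-1$ has $|x_1| \leq 1$.

```python
# The absolute-value feature x2 = |x1| also separates the example data:
# class +1 has |x1| = 3, class -1 has |x1| <= 1, so x2 = 2 works as a boundary.
points = {-3: +1, 3: +1, -1: -1, 0: -1, 1: -1}

for x1, label in points.items():
    x2 = abs(x1)
    assert (+1 if x2 > 2 else -1) == label

print("x2 = 2 separates the abs-value mapping")
```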
EXAM TIP (VCAA): given a 1D dataset that is not linearly separable, apply $x_2 = x_1^2$ to create 2D data, find the SVM boundary in 2D (midway between the support vectors), then interpret the rule back in 1D.
COMMON MISTAKE: After finding the decision boundary in 2D, you must transform it back to the original 1D space. If the 2D boundary is $x_2 = 5$, the 1D rule is $x_1^2 = 5$, not $x_1 = 5$.
VCAA FOCUS: Know why $x_2 = x_1^2$ is useful, how to apply the transformation, find the 2D boundary, and interpret it in 1D.