A Support Vector Machine (SVM) is a supervised machine learning algorithm for binary classification. It finds the optimal linear decision boundary (hyperplane) that separates two classes with the maximum margin.
Given labelled data: $\{(x_1, y_1), \ldots, (x_n, y_n)\}$ where $y_i \in \{-1, +1\}$
The SVM finds a hyperplane separating the two classes:
Classification rule: $\hat{y} = \text{sign}(\mathbf{w} \cdot \mathbf{x} + b)$
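A minimal sketch of the classification rule (the weight vector and bias below are made-up values for illustration, not from the text):

```python
import numpy as np

# Hypothetical learned parameters, chosen only to illustrate the rule.
w = np.array([2.0, -1.0])
b = 0.5

def svm_predict(x):
    """Apply the SVM decision rule: y_hat = sign(w . x + b)."""
    return 1 if np.dot(w, x) + b >= 0 else -1

print(svm_predict(np.array([1.0, 1.0])))   # w.x + b = 1.5, so +1
print(svm_predict(np.array([-1.0, 1.0])))  # w.x + b = -2.5, so -1
```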
The margin is the perpendicular width of the gap between the two classes' nearest training points. Each of those nearest points lies at distance $\frac{1}{\|\mathbf{w}\|}$ from the decision boundary, so the total width is:
$$\text{Margin} = \frac{2}{\|\mathbf{w}\|}$$
A wider margin generally leads to better generalisation.
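A quick numeric check of the margin formula (the weight vector here is a hypothetical example):

```python
import numpy as np

w = np.array([3.0, 4.0])            # ||w|| = 5
margin = 2.0 / np.linalg.norm(w)    # margin = 2 / ||w||
print(margin)                       # 0.4

# Shrinking ||w|| widens the margin: halving w doubles the margin.
print(2.0 / np.linalg.norm(w / 2))  # 0.8
```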
Support vectors are the training points lying exactly on the margin boundaries:
$$y_i(\mathbf{w} \cdot \mathbf{x}_i + b) = 1$$
These are the only training points that determine the decision boundary. All other points can be removed without changing the classifier.
SVM solves:
$$\text{minimise} \;\frac{1}{2}\|\mathbf{w}\|^2 \quad \text{subject to} \quad y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1 \;\; \forall i$$
Minimising $\|\mathbf{w}\|^2$ maximises the margin $\frac{2}{\|\mathbf{w}\|}$.
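A sketch of this optimisation using scikit-learn's `SVC` (assuming scikit-learn is available; the toy 2D data is made up). A very large `C` approximates the hard-margin problem above, and refitting on the support vectors alone reproduces the same boundary:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical linearly separable 2D data.
X = np.array([[-2.0, 0.0], [-1.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
y = np.array([-1, -1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)         # large C ~ hard margin
print(clf.support_vectors_)                         # the points at x = -1 and x = +1
print(clf.coef_, clf.intercept_)                    # w and b
print(2.0 / np.linalg.norm(clf.coef_))              # margin = 2 / ||w||

# Refit on the support vectors only: same decision boundary.
sv = clf.support_
clf2 = SVC(kernel="linear", C=1e6).fit(X[sv], y[sv])
print(np.allclose(clf.coef_, clf2.coef_, atol=1e-6))  # True
```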
KEY TAKEAWAY: The SVM finds the hyperplane that maximises the margin between the two classes. The support vectors are the only training points that define the boundary — they are the most informative data points.
Not all data is linearly separable. Solutions:
- Soft-margin SVM: Allow some misclassifications (controlled by hyperparameter $C$).
- Kernel trick: Map data to higher dimensions where linear separation is possible.
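Both remedies can be sketched with scikit-learn (assuming it is available; the XOR-style data is a made-up example of a non-linearly-separable set):

```python
import numpy as np
from sklearn.svm import SVC

# XOR pattern: no straight line separates the two classes.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([1, 1, -1, -1])

# Soft-margin linear SVM: C controls the misclassification penalty,
# but no linear boundary can fit XOR, so training accuracy stays below 1.
linear = SVC(kernel="linear", C=1.0).fit(X, y)
print(linear.score(X, y))   # < 1.0

# Kernel trick: an RBF kernel implicitly maps the data to a
# higher-dimensional space where the classes become separable.
rbf = SVC(kernel="rbf", gamma=1.0, C=100.0).fit(X, y)
print(rbf.score(X, y))      # 1.0
```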
| Concept | Description |
|---|---|
| Hyperplane | Decision boundary |
| Margin | Width of the gap between classes |
| Support vectors | Points on the margin boundary |
| Maximise margin | SVM’s training objective |
| $\Vert\mathbf{w}\Vert^2$ | Minimised to maximise margin |
EXAM TIP: For VCAA, understand the geometric concepts: margin, support vectors, decision boundary. Know the optimisation goal (maximise margin = minimise $\|\mathbf{w}\|^2$). You do not need to solve the full quadratic programming problem.
COMMON MISTAKE: Assuming every training point shapes the boundary. The SVM is defined by the support vectors only, not all training points: if you removed all non-support-vector training data and retrained, the decision boundary would not change.
VCAA FOCUS: Know the definition of margin, the role of support vectors, and why maximising the margin leads to good generalisation. Apply SVM concepts geometrically to 1D and 2D data.