
Overfitting and Underfitting

Algorithmics (HESS)
StudyPulse
01 May 2026

Model Overfitting and Underfitting

Two major failure modes in machine learning occur when a model is poorly matched to the complexity of the data.


Underfitting

Definition: The model is too simple to capture the true pattern in the data.

Symptoms:
- High training error
- High test error
- Model performs poorly on both seen and unseen data

Causes: Model too simple (e.g., fitting a straight line to curved data), too few features, or too little training.

Analogy: A student who barely studied and gives vague answers — cannot even get training-set questions right.

# Underfitting: fitting a straight line to exponential data (true pattern: y = 2**x)
training_data = [(1, 2), (2, 4), (3, 8), (4, 16)]
underfitting_model = lambda x: 3 * x + 1  # a straight line cannot follow the curve

COMMON MISTAKE: Underfitting is NOT caused by having too much data. It is caused by the model being insufficiently expressive for the pattern in the data.
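The symptom above can be made concrete: the straight-line model's error on its own training data stays large. A minimal sketch (the mean_squared_error helper is illustrative, not part of the notes):

```python
# Measure the straight line's error on its OWN training data.
# A large value here is the signature of underfitting.
training_data = [(1, 2), (2, 4), (3, 8), (4, 16)]  # true pattern: y = 2**x
underfitting_model = lambda x: 3 * x + 1

def mean_squared_error(model, data):
    return sum((y - model(x)) ** 2 for x, y in data) / len(data)

train_mse = mean_squared_error(underfitting_model, training_data)
print(train_mse)  # 6.5 -- large relative to the y values, even on seen data
```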


Overfitting

Definition: The model is so complex it memorises the training data (including noise) rather than learning the general pattern.

Symptoms:
- Very low training error
- High test error — poor generalisation to new data

Causes: Model too complex (too many parameters), training for too many iterations, insufficient training data.

Analogy: A student who memorised every past exam question word-for-word but cannot answer a rephrased question.
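The analogy can be sketched as a lookup-table "model" that recalls every training pair perfectly but has no rule for anything new (a toy illustration, not a real learner):

```python
# A "model" that memorises the training pairs exactly: zero training error.
memorised = {1: 2, 2: 4, 3: 8, 4: 16}  # true pattern: y = 2**x

def overfitting_model(x):
    # Perfect recall on seen inputs; an arbitrary default otherwise.
    return memorised.get(x, 0)

print(overfitting_model(3))  # 8 -- correct on seen data
print(overfitting_model(5))  # 0 -- but the true pattern gives 2**5 = 32
```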


The Bias-Variance Trade-off

Concept    Meaning                                        Problem when high
Bias       Error from overly simple assumptions           Underfitting
Variance   Sensitivity to fluctuations in training data   Overfitting

The goal is a model with low bias and low variance.
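One way to see the trade-off is to fit two toy models to two noisy samples of the same underlying pattern (assumed here to be y = 2x; the datasets and both models are illustrative):

```python
# Two noisy samples drawn from the same underlying pattern y = 2x.
dataset_a = [(1, 2.1), (2, 3.9), (3, 6.2)]
dataset_b = [(1, 1.8), (2, 4.2), (3, 5.9)]

def high_bias_model(data):
    # Too simple: ignores x entirely and predicts the mean of y.
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y

def high_variance_model(data):
    # Too sensitive: the fit chases the last (noisy) point exactly.
    last_x, last_y = data[-1]
    return lambda x: (last_y / last_x) * x

# High bias: the same prediction regardless of x, far from the true 2 * 10 = 20.
print(high_bias_model(dataset_a)(10))
# High variance: predictions at x = 10 swing with the noise in each sample.
print(high_variance_model(dataset_a)(10))
print(high_variance_model(dataset_b)(10))
```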


Diagnosis

              Training error   Test error
Underfitting  High             High
Good fit      Low              Low
Overfitting   Very low         High

Learning curves: Plot training error and test error vs training set size or epochs. A large gap (training error much lower than test error) indicates overfitting.
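The diagnosis table can be turned into a small decision rule. The threshold below is hypothetical; what counts as "high" error depends entirely on the task:

```python
def diagnose(train_error, test_error, threshold=0.1):
    # Hypothetical cut-off: real thresholds depend on the task and metric.
    if train_error > threshold:
        return "underfitting"  # cannot even fit the training data
    if test_error > threshold:
        return "overfitting"   # fits training data but fails to generalise
    return "good fit"

print(diagnose(0.40, 0.42))  # underfitting
print(diagnose(0.01, 0.35))  # overfitting
print(diagnose(0.05, 0.07))  # good fit
```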


Remedies

Problem        Remedies
Underfitting   Use more complex model, add features, train longer
Overfitting    Use simpler model, regularisation, more data, early stopping, dropout (neural nets)
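Early stopping, one of the overfitting remedies above, can be sketched with an assumed sequence of validation errors (no real training loop here):

```python
# Stop training once validation error has not improved for `patience` checks,
# and roll back to the epoch with the best validation error.
def early_stop_epoch(val_errors, patience=2):
    best_err, best_epoch = float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_err, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: overfitting has begun
    return best_epoch

# Validation error falls, then rises as the model starts to memorise noise.
print(early_stop_epoch([0.9, 0.6, 0.4, 0.45, 0.5, 0.6]))  # 2
```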

Connection to SVM and Neural Networks

  • SVM: Margin maximisation acts as regularisation, reducing overfitting.
  • Neural networks: More layers and neurons increase capacity and overfitting risk on small datasets.

EXAM TIP: VCAA will ask you to explain consequences of each. Underfitting: model cannot even fit training data. Overfitting: model fails to generalise. Both lead to poor real-world performance but for different reasons.

VCAA FOCUS: Define both terms clearly, state their consequences (training vs test performance), and give at least one cause and one remedy for each.
