Bivariate data involves two variables measured on the same individual or case. The goal is to investigate whether a relationship (association) exists between the two variables and, if so, to describe and model it.
| | Univariate | Bivariate |
|---|---|---|
| Variables | One | Two |
| Purpose | Describe distribution | Examine association |
| Display | Histogram, boxplot, dot plot | Scatterplot |
| Summary | Mean, median, IQR | Correlation coefficient, regression line |
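When one variable may cause or explain the other:
- The explanatory variable (x) goes on the horizontal axis.
- The response variable (y) goes on the vertical axis.
Example: Hours of study (x) vs exam score (y). Hours of study explain exam score, so hours of study is the explanatory variable and exam score is the response variable.
If no causal direction is assumed, either variable can go on either axis.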
| Explanatory variable | Response variable | Display |
|---|---|---|
| Categorical | Numerical | Parallel boxplots, back-to-back stem plot |
| Categorical | Categorical | Two-way frequency table, segmented bar chart |
| Numerical | Numerical | Scatterplot |
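The table above can be read as a simple lookup from variable types to a suitable display. The sketch below is only an illustration of that lookup; the function name `suggest_display` and the lowercase type labels are assumptions made for this example, not part of any course material or library.

```python
# Hypothetical helper: maps (explanatory type, response type) to the
# displays listed in the table above.
DISPLAYS = {
    ("categorical", "numerical"): "parallel boxplots or back-to-back stem plot",
    ("categorical", "categorical"): "two-way frequency table or segmented bar chart",
    ("numerical", "numerical"): "scatterplot",
}


def suggest_display(explanatory: str, response: str) -> str:
    """Return the display suggested by the table for a pair of variable types."""
    key = (explanatory.strip().lower(), response.strip().lower())
    if key not in DISPLAYS:
        # The table does not cover this combination, so fail loudly
        # rather than guess.
        raise ValueError(f"No display listed for {key}")
    return DISPLAYS[key]


print(suggest_display("Numerical", "Numerical"))
print(suggest_display("categorical", "numerical"))
```

For hours of study vs exam score, both variables are numerical, so the lookup returns a scatterplot.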
KEY TAKEAWAY: For numerical vs numerical bivariate data, always start with a scatterplot to visually assess the association before calculating statistics.
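When examining bivariate numerical data, answer:
- Direction: Is the association positive or negative?
- Form: Is it linear or non-linear?
- Strength: Is it strong, moderate, or weak?
- Outliers: Are there any points that depart from the general pattern?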
Bivariate analysis underlies:
- Prediction: If we know x, can we estimate y?
- Causation vs correlation: Association ≠ causation
- Regression analysis: Finding the line of best fit
- Decision making: Evidence-based conclusions
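As a rough illustration of prediction and the line of best fit, the sketch below computes the correlation coefficient r and the least squares line y = a + bx using only the Python standard library. The data values are made up for this example (hours of study vs exam score), and the function names `pearson_r` and `least_squares_line` are assumptions chosen for readability.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Return (intercept a, slope b) of the least squares line y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data, invented purely for illustration.
hours = [1, 2, 3, 4, 5]        # explanatory variable (x)
scores = [52, 58, 65, 70, 75]  # response variable (y)

r = pearson_r(hours, scores)
a, b = least_squares_line(hours, scores)

# Predict within the range of the observed x values (interpolation).
predicted = a + b * 3.5

print(f"r = {r:.3f}")
print(f"score = {a:.1f} + {b:.1f} * hours")
print(f"predicted score for 3.5 hours: {predicted:.1f}")
```

The prediction is made at 3.5 hours, inside the range of the data, because predicting beyond the observed x values is less reliable.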
VCAA FOCUS: The bivariate data section of the exam typically involves scatterplots, correlation coefficients, and the least squares regression line. Understanding the conceptual framework (explanatory/response variables, association vs causation) is essential before tackling calculations.
REMEMBER: Correlation does not imply causation. Even a very strong association between two variables does not prove that one causes the other — there may be a lurking variable.
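To make the lurking-variable idea concrete, the sketch below uses made-up numbers in which daily temperature drives both ice cream sales and swimming accidents. All values and variable names are hypothetical and chosen only so that the two response variables end up strongly correlated without either one causing the other.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical lurking variable: daily temperature in degrees Celsius.
temperature = [18, 20, 22, 24, 26, 28, 30, 32]

# Both variables below were invented to rise with temperature,
# plus a little fixed "noise". Neither is computed from the other.
ice_cream_sales = [183, 198, 221, 237, 262, 279, 303, 317]
swimming_accidents = [9.2, 9.9, 10.7, 12.3, 12.8, 14.1, 14.8, 16.2]

r = pearson_r(ice_cream_sales, swimming_accidents)
print(f"r between ice cream sales and swimming accidents = {r:.3f}")
```

The value of r is close to 1, yet the sensible explanation is that hot weather increases both quantities, not that ice cream sales cause accidents.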