Two distributions can share the same mean but have very different spreads. Measures of variability quantify how concentrated or dispersed the data is.
Range
Max minus min. Simple but extremely sensitive to outliers — a single freak value dominates. Rarely used as a serious summary.
Interquartile range (IQR)
, the spread of the middle 50%. Robust to outliers; commonly displayed in box plots. Useful for data with long tails where you care about the bulk.
Variance and standard deviation
(or for Bessel's correction in samples). The standard deviation is in the same units as the data — more interpretable.
For approximately normal data, ~68% of values lie within 1 of the mean, ~95% within 2, ~99.7% within 3 (the "68-95-99.7 rule").
Mean absolute deviation (MAD)
. More robust to outliers than variance because it doesn't square the residuals. Less mathematically convenient (no closed-form gradients) which is why variance dominates in practice.
Coefficient of variation
. Dimensionless. Useful for comparing variability across distributions with different scales — "noise level relative to signal."
Why N-1 for sample variance
Dividing by (Bessel's correction) makes the sample variance an unbiased estimator of the population variance. Dividing by gives a biased estimator that systematically underestimates. The difference vanishes as grows — relevant only for small samples.