Bias–Variance Trade-off — Section 19: Machine Learning Fundamentals

Under squared-error loss, the expected prediction error at a point decomposes as

\text{Error} = \text{Bias}^2 + \text{Variance} + \text{Irreducible noise}
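
One standard way to make the three terms precise is sketched below. The notation is an assumption introduced here, not taken from the text above: data are generated as y = f(x) + \varepsilon with E[\varepsilon] = 0 and Var(\varepsilon) = \sigma^2, \hat{f}_D is the model fitted on training set D, and the noise at the test point is independent of D.

\mathbb{E}_{D,\varepsilon}\big[(y - \hat{f}_D(x))^2\big] = \underbrace{\big(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\big)^2}_{\text{Bias}^2} + \underbrace{\mathbb{E}_D\big[\big(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)]\big)^2\big]}_{\text{Variance}} + \underbrace{\sigma^2}_{\text{Irreducible noise}}

The cross terms vanish because \varepsilon has mean zero and is independent of D, and because \hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)] has mean zero over D.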

Bias measures how far the average prediction, taken over possible training sets, is from the truth. Variance measures how much predictions change as the training set changes. Simple models tend to have high bias and low variance: they miss the target consistently, in the same way each time. Complex models tend to have low bias and high variance: they are close to the truth on average but fit every training set differently.
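
The two definitions above can be checked numerically by refitting a model on many independently drawn training sets. The following is a minimal sketch, not a definitive implementation. Everything in it is an illustrative assumption: the true function sin(2πx), the noise level, the sample sizes, and the two models, a constant predictor (simple) and 1-nearest-neighbour (complex).

```python
import math
import random

def true_f(x):
    # Hypothetical ground-truth function, chosen only for illustration.
    return math.sin(2 * math.pi * x)

def sample_training_set(rng, n=30, noise_sd=0.3):
    # Draw one noisy training set of size n on [0, 1).
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

def fit_constant(xs, ys):
    # Simple model: always predict the training mean of y.
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_one_nn(xs, ys):
    # Complex model: predict the y of the single nearest training point.
    def predict(x):
        i = min(range(len(xs)), key=lambda j: abs(xs[j] - x))
        return ys[i]
    return predict

def bias2_and_variance(fit, n_sets=300, seed=0):
    # Refit on many training sets and evaluate on a fixed grid of x values.
    rng = random.Random(seed)
    grid = [(i + 0.5) / 20 for i in range(20)]
    preds = [[] for _ in grid]
    for _ in range(n_sets):
        model = fit(*sample_training_set(rng))
        for g, x in enumerate(grid):
            preds[g].append(model(x))
    bias2 = 0.0
    variance = 0.0
    for x, p in zip(grid, preds):
        mean_p = sum(p) / len(p)                 # average prediction at x
        bias2 += (mean_p - true_f(x)) ** 2       # squared distance from truth
        variance += sum((v - mean_p) ** 2 for v in p) / len(p)
    return bias2 / len(grid), variance / len(grid)

simple_bias2, simple_var = bias2_and_variance(fit_constant)
complex_bias2, complex_var = bias2_and_variance(fit_one_nn)
print(f"constant model: bias^2={simple_bias2:.3f}, variance={simple_var:.3f}")
print(f"1-NN model:     bias^2={complex_bias2:.3f}, variance={complex_var:.3f}")
```

In this setup the constant model should show the larger squared bias and the 1-nearest-neighbour model the larger variance, which is the pattern described above.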

The trade-off: minimizing total error requires balancing bias against variance. Too simple → underfit (training error and test error both high). Too complex → overfit (training error low, test error high).
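
The underfit and overfit signatures can be seen by comparing training and test error across model complexity. The sketch below uses k-nearest-neighbour regression as a stand-in, where small k is the complex end and k equal to the training-set size is the simple end. The data-generating function, noise level, sample sizes, and the values of k are all assumptions made for illustration.

```python
import math
import random

def true_f(x):
    # Hypothetical ground-truth function, chosen only for illustration.
    return math.sin(2 * math.pi * x)

def make_data(rng, n, noise_sd=0.3):
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

def knn_predict(train_x, train_y, x, k):
    # Average the targets of the k training points closest to x.
    nearest = sorted(range(len(train_x)), key=lambda j: abs(train_x[j] - x))[:k]
    return sum(train_y[j] for j in nearest) / k

def mse(train_x, train_y, xs, ys, k):
    errs = [(y - knn_predict(train_x, train_y, x, k)) ** 2 for x, y in zip(xs, ys)]
    return sum(errs) / len(errs)

rng = random.Random(0)
train_x, train_y = make_data(rng, 200)
test_x, test_y = make_data(rng, 500)

results = {}
for k in (1, 5, 200):  # complex, intermediate, simple
    train_err = mse(train_x, train_y, train_x, train_y, k)
    test_err = mse(train_x, train_y, test_x, test_y, k)
    results[k] = (train_err, test_err)
    print(f"k={k:3d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

With k=1 each training point is its own nearest neighbour, so training error is zero while test error is not (overfit). With k=200 the model predicts the training mean everywhere, so both errors are high (underfit). An intermediate k should sit between the two on test error.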

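Regularization, ensembling, and cross-validation are the main tools for managing the trade-off: regularization and ensembling (e.g., bagging) mainly reduce variance, while cross-validation estimates out-of-sample error so that model complexity can be chosen. In quant prediction the noise component is enormous, so total error is mostly irreducible. With so little signal, the variance added by a flexible model usually outweighs the bias it removes, and the optimum tends toward simple, high-bias models that resist overfitting.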
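
A toy calculation shows why a low signal-to-noise ratio favours deliberately biased estimators. The sketch below is an illustration under stated assumptions, not a model of any real strategy: it estimates a small mean mu from n noisy observations using the shrunk sample mean lam * x_bar, for which squared bias is ((1 - lam) * mu)^2 and variance is lam^2 * sigma^2 / n. The values of mu, sigma, and n are hypothetical. The error here is estimation error, so it excludes the irreducible noise term.

```python
def shrinkage_mse(lam, mu, sigma, n):
    # Estimator: lam * (sample mean of n observations with mean mu, sd sigma).
    bias2 = ((1.0 - lam) * mu) ** 2          # shrinking toward zero adds bias
    variance = lam ** 2 * sigma ** 2 / n     # and reduces variance
    return bias2 + variance, bias2, variance

# Hypothetical low signal-to-noise setting: small mean, large noise.
mu, sigma, n = 0.1, 1.0, 50

lambdas = [i / 100 for i in range(101)]
best_lam = min(lambdas, key=lambda lam: shrinkage_mse(lam, mu, sigma, n)[0])

# Closed-form minimizer of bias^2 + variance for this estimator.
closed_form = mu ** 2 / (mu ** 2 + sigma ** 2 / n)

unbiased_mse = shrinkage_mse(1.0, mu, sigma, n)[0]
best_mse, best_bias2, best_var = shrinkage_mse(best_lam, mu, sigma, n)
print(f"best lam on grid={best_lam:.2f}, closed form={closed_form:.3f}")
print(f"MSE unbiased={unbiased_mse:.4f}, MSE shrunk={best_mse:.4f}")
```

In this setting the error-minimizing estimator is biased (lam below 1): accepting some bias buys a larger reduction in variance. As sigma^2 / n grows relative to mu^2, the closed-form lam moves further toward zero.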
