Central Limit Theorem — Section 1: Probability Foundations

The Central Limit Theorem (CLT) says: for i.i.d. random variables $X_1, X_2, \dots, X_n$ with finite mean $\mu$ and finite variance $\sigma^2 > 0$, and sample mean $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$, the standardized sample mean

$$Z_n = \frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}}$$

converges in distribution to a standard normal $N(0, 1)$ as $n \to \infty$. This holds regardless of the original distribution of the $X_i$, as long as the mean is finite and the variance is finite and nonzero.
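
A quick simulation can make the statement concrete. The sketch below is an illustration under assumptions that are not part of the text above: it draws from Exponential(1), for which $\mu = \sigma = 1$, uses only Python's standard library, and the helper names `standardized_means` and `max_cdf_gap` are hypothetical. It measures the largest gap between the empirical CDF of $Z_n$ and the standard normal CDF on a small grid of points.

```python
import math
import random
from statistics import NormalDist


def standardized_means(n, reps, rng):
    """Draw `reps` samples of size n from Exponential(1); return Z_n for each."""
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(rate=1)
    zs = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        zs.append((xbar - mu) / (sigma / math.sqrt(n)))
    return zs


def max_cdf_gap(zs, grid):
    """Largest |empirical CDF of zs - standard normal CDF| over the grid points."""
    phi = NormalDist().cdf
    m = len(zs)
    return max(abs(sum(z <= g for z in zs) / m - phi(g)) for g in grid)


rng = random.Random(0)  # fixed seed so the run is reproducible
grid = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
for n in (1, 5, 30, 200):
    gap = max_cdf_gap(standardized_means(n, 4000, rng), grid)
    print(f"n={n:4d}  max CDF gap on grid = {gap:.3f}")
```

The printed gap should shrink as $n$ grows. The exact values depend on the seed and the number of replications, and the grid only samples the gap at a few points rather than taking a true supremum.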

Why it matters

Many phenomena are sums of many small independent effects: measurement errors, biological traits, financial returns. The CLT says that such sums tend toward a normal distribution even if each individual effect is wildly non-normal, provided the effects have finite variance. This is why "assume Gaussian noise" is so often a defensible default.

How fast does it converge

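The Berry-Esseen theorem gives a bound: if the third absolute central moment $\rho = E[|X_1 - \mu|^3]$ is finite, then $\sup_z |F_{Z_n}(z) - \Phi(z)| \leq C \cdot \rho / (\sigma^3 \sqrt{n})$, where $F_{Z_n}$ is the CDF of $Z_n$, $\Phi$ is the standard normal CDF, and $C$ is an absolute constant that does not depend on the distribution. The error therefore shrinks at rate $1/\sqrt{n}$ in the worst case. Convergence is faster for more-symmetric, lighter-tailed underlying distributions; slower for heavy tails or strong skew. As a rule of thumb, $n = 30$ is often "enough" for the sample mean to look approximately normal near the center, but tail probabilities take longer to converge.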
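
For Bernoulli($p$) variables the bound can be checked exactly, because the sum is binomial and its CDF is a step function. The sketch below is illustrative only: the function names are hypothetical, and the value `C_UPPER = 0.4748` is an assumption, taken as a published upper bound on the constant for the i.i.d. case rather than anything stated in the text. It uses the third absolute central moment $\rho = E[|X - p|^3] = p(1-p)^3 + (1-p)p^3$.

```python
import math
from statistics import NormalDist

C_UPPER = 0.4748  # assumed: a published upper bound on the Berry-Esseen constant


def berry_esseen_bound(p, n):
    """Right-hand side C * rho / (sigma^3 * sqrt(n)) for Bernoulli(p)."""
    sigma = math.sqrt(p * (1 - p))
    rho = p * (1 - p) ** 3 + (1 - p) * p ** 3  # E|X - p|^3
    return C_UPPER * rho / (sigma ** 3 * math.sqrt(n))


def exact_sup_gap(p, n):
    """Exact sup_z |F_{Z_n}(z) - Phi(z)| when the sum is Binomial(n, p).

    F_{Z_n} is constant between jumps and Phi is increasing, so the
    supremum is attained just before or at one of the jump points.
    """
    phi = NormalDist().cdf
    sd = math.sqrt(n * p * (1 - p))
    cdf_left, gap = 0.0, 0.0
    for k in range(n + 1):
        # Binomial pmf computed in log space to avoid overflow for large n.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log(1 - p))
        cdf_right = cdf_left + math.exp(log_pmf)
        target = phi((k - n * p) / sd)  # Phi at the k-th jump point
        gap = max(gap, abs(cdf_left - target), abs(cdf_right - target))
        cdf_left = cdf_right
    return gap


for p in (0.5, 0.1):
    for n in (10, 100, 1000):
        print(f"p={p} n={n:5d} exact={exact_sup_gap(p, n):.4f} "
              f"bound={berry_esseen_bound(p, n):.4f}")
```

The exact gap should sit below the bound in every row, and both should fall roughly like $1/\sqrt{n}$. The skewed case ($p = 0.1$) has a larger ratio $\rho / \sigma^3$ and hence a looser bound, matching the remark that skew slows convergence.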

When it fails

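The CLT requires finite variance. Distributions with infinite variance (Cauchy, Pareto with shape $\leq 2$) violate the assumption, so their sample means do not behave as the theorem predicts. For the Cauchy, the sample mean does not concentrate at all: it has the same distribution as a single observation for every $n$. For a Pareto with shape strictly between 1 and 2, the sample mean still converges to $\mu$, but more slowly than $1/\sqrt{n}$ and with a non-normal limit. Heavy-tailed distributions with finite variance do satisfy the CLT, yet in moderate samples the sample mean can look "almost normal" near the center while the normal approximation fails badly in the tails.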
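
The Cauchy case is easy to see numerically. The following sketch is an illustration under stated assumptions: it uses only the standard library, generates Cauchy draws by the inverse-CDF method, uses Uniform(0, 1) as an arbitrary finite-variance comparison, and all function names are hypothetical. It compares the interquartile range (IQR) of the sample mean across sample sizes.

```python
import math
import random
from statistics import quantiles


def cauchy(rng):
    """One standard Cauchy draw via the inverse CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def uniform(rng):
    """One Uniform(0, 1) draw, used as a finite-variance comparison."""
    return rng.random()


def iqr_of_sample_means(draw, n, reps, rng):
    """IQR of the sample mean of n draws, estimated from `reps` replications."""
    means = [sum(draw(rng) for _ in range(n)) / n for _ in range(reps)]
    q1, _, q3 = quantiles(means, n=4)
    return q3 - q1


rng = random.Random(0)  # fixed seed so the run is reproducible
for n in (1, 10, 100):
    c = iqr_of_sample_means(cauchy, n, 2000, rng)
    u = iqr_of_sample_means(uniform, n, 2000, rng)
    print(f"n={n:3d}  IQR of mean: Cauchy={c:.3f}  Uniform={u:.3f}")
```

The uniform column should shrink roughly like $1/\sqrt{n}$, while the Cauchy column should stay near 2, the IQR of a single standard Cauchy observation, no matter how large $n$ becomes.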

Practical implications

Hypothesis tests built on the assumption "the sample mean is approximately normal" (z-test, t-test) are approximately valid for any underlying distribution with finite variance, provided $n$ is large.
Confidence intervals based on the normal approximation are usually reliable for $n \geq 30$ when the data isn't too skewed or heavy-tailed; this is a rule of thumb, not a guarantee.
For small samples from roughly normal data, use the t-distribution instead of the normal; for skewed or heavy-tailed data, consider bootstrap methods.
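
The effect of sample size on a normal-approximation interval can be estimated by simulation. The sketch below is an assumption-laden illustration, not a recommended procedure: it uses skewed Exponential(1) data with true mean 1, a nominal 95% interval $\bar{X}_n \pm z \cdot s / \sqrt{n}$ with the sample standard deviation $s$, and hypothetical function names `z_interval` and `coverage`.

```python
import math
import random
from statistics import NormalDist


def z_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the mean."""
    n = len(sample)
    m = sum(sample) / n
    s = math.sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))  # sample std dev
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    half = z * s / math.sqrt(n)
    return m - half, m + half


def coverage(n, reps, rng, true_mean=1.0):
    """Fraction of intervals, over `reps` Exponential(1) samples, covering the mean."""
    hits = 0
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        lo, hi = z_interval(sample)
        hits += lo <= true_mean <= hi
    return hits / reps


rng = random.Random(0)  # fixed seed so the run is reproducible
for n in (5, 30, 200):
    print(f"n={n:4d}  estimated coverage of nominal 95% interval: "
          f"{coverage(n, 2000, rng):.3f}")
```

For this skewed distribution the estimated coverage should fall noticeably short of 95% at $n = 5$ and move toward the nominal level as $n$ grows. How close it gets at $n = 30$ depends on the distribution, which is why the threshold is only a rule of thumb.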