Prior, Likelihood, Posterior — Section 10: Bayesian Inference

Bayes' rule applied to a parameter $\theta$ and observed data $x$ reads

P(\theta \mid x) = \frac{P(x \mid \theta)\, P(\theta)}{P(x)}

In words, posterior is proportional to likelihood times prior:

\text{posterior} \propto \text{likelihood} \times \text{prior}

The proportionality drops the normalizing constant $P(x) = \int P(x \mid \theta)\, P(\theta)\, d\theta$, also called the evidence or marginal likelihood. It does not depend on $\theta$, so it is unnecessary if you only care about relative posterior values, and it is often hard to compute.
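To make the role of the normalizing constant concrete, here is a minimal sketch that approximates $P(x)$ numerically and checks that ratios of posterior values do not depend on it. The Binomial likelihood, the flat prior, the data (3 successes in 5 trials), and all function names are assumptions chosen for illustration, not part of the text above.

```python
from math import comb


def likelihood(theta, k, n):
    """Binomial likelihood P(x | theta) for k successes in n trials."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def prior(theta):
    """Flat prior density on [0, 1] (an assumption for this sketch)."""
    return 1.0


def unnormalized_posterior(theta, k, n):
    """Likelihood times prior, i.e. the posterior up to the factor 1 / P(x)."""
    return likelihood(theta, k, n) * prior(theta)


def evidence(k, n, steps=10_000):
    """Midpoint-rule approximation of P(x), the integral of likelihood * prior."""
    width = 1.0 / steps
    total = sum(
        unnormalized_posterior((i + 0.5) * width, k, n) for i in range(steps)
    )
    return total * width


k, n = 3, 5  # illustrative data, not taken from the text
z = evidence(k, n)

# Relative posterior values: the constant z cancels in the ratio.
ratio_unnormalized = unnormalized_posterior(0.6, k, n) / unnormalized_posterior(0.3, k, n)
ratio_normalized = (unnormalized_posterior(0.6, k, n) / z) / (
    unnormalized_posterior(0.3, k, n) / z
)
```

A one-dimensional integral like this one is cheap to approximate. The difficulty mentioned above arises mainly when $\theta$ has many dimensions, where grid-based integration stops being practical.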

A worked example: you want to estimate the bias $\theta$ of a coin, meaning its probability of landing heads. Pick a uniform prior $\theta \sim \mathrm{Beta}(1, 1)$. Observe $7$ heads in $10$ flips. The likelihood is Binomial, $P(x \mid \theta) \propto \theta^{7} (1 - \theta)^{3}$. Because Beta is conjugate to Bernoulli/Binomial, multiplying this likelihood by the prior gives another Beta density: the posterior is $\mathrm{Beta}(1 + 7, 1 + 3) = \mathrm{Beta}(8, 4)$, with mean $8/12 \approx 0.67$. With a conjugate prior, updating Bayesian beliefs is often this simple: add the observed counts to the prior parameters.
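The coin example can be sketched in a few lines of Python. The function names `beta_binomial_update` and `beta_mean` are hypothetical helpers introduced here for illustration. The numerical cross-check uses a simple midpoint rule and is only a sanity check, not a recommended way to compute the posterior.

```python
def beta_binomial_update(alpha, beta, heads, tails):
    """Conjugate update: add observed counts to the Beta prior parameters."""
    return alpha + heads, beta + tails


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Uniform prior Beta(1, 1), then 7 heads and 3 tails in 10 flips.
alpha_post, beta_post = beta_binomial_update(1, 1, heads=7, tails=3)
posterior_mean = beta_mean(alpha_post, beta_post)

# Sanity check: compute the posterior mean directly from
# likelihood * prior, which is proportional to theta^7 * (1 - theta)^3.
steps = 10_000
width = 1.0 / steps
grid = [(i + 0.5) * width for i in range(steps)]
weights = [t**7 * (1 - t) ** 3 for t in grid]
numeric_mean = sum(t * w for t, w in zip(grid, weights)) / sum(weights)

print(alpha_post, beta_post, round(posterior_mean, 2))  # prints: 8 4 0.67
```

The grid-based mean agrees with $8/12$ to several decimal places, which is what conjugacy predicts: the normalizing constant never has to be computed explicitly.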
