Shannon entropy quantifies the average uncertainty of a discrete random variable $X$ with probability mass function $p(x)$:
$$H(X) = -\sum_{x} p(x) \log p(x)$$
Units depend on the log base: bits for $\log_2$, nats for $\ln$. A fair coin has $H(X) = 1$ bit; a biased coin has less.
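As a worked illustration of the coin comparison, the arithmetic can be written out with base-2 logarithms. The heads probability of 0.9 for the biased coin is an assumed example value, not one given in the text:

$$H_{\text{fair}} = -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2} = 1 \text{ bit}$$
$$H_{\text{biased}} = -0.9\log_2 0.9 - 0.1\log_2 0.1 \approx 0.469 \text{ bits}$$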
Properties: $H(X) \ge 0$, with equality iff $X$ is constant. For a uniform distribution on $n$ outcomes, $H(X) = \log n$; entropy is maximized at maximum uncertainty.
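The definition and the properties above can be checked numerically. The following is a minimal sketch, assuming the distribution is supplied as a list of probabilities; the function name `entropy` and its `base` parameter are illustrative choices, not part of the original text.

```python
import math

def entropy(probs, base=2.0):
    """Shannon entropy of a discrete distribution given as a list of probabilities."""
    if any(p < 0 for p in probs) or not math.isclose(sum(probs), 1.0):
        raise ValueError("probs must be non-negative and sum to 1")
    # Outcomes with p == 0 are skipped (the usual convention 0 * log 0 = 0).
    return -sum(p * math.log(p) / math.log(base) for p in probs if p > 0)

fair = entropy([0.5, 0.5])            # fair coin, in bits
biased = entropy([0.9, 0.1])          # biased coin (0.9 is an assumed example value)
constant = entropy([1.0])             # a constant variable has zero entropy
uniform8 = entropy([1 / 8] * 8)       # uniform on 8 outcomes: log2(8) bits
skewed8 = entropy([0.65] + [0.05] * 7)  # any non-uniform choice on 8 outcomes is lower
fair_nats = entropy([0.5, 0.5], base=math.e)  # same coin, measured in nats

print(fair, biased, constant, uniform8, skewed8, fair_nats)
```

Switching `base` only rescales the result by a constant factor, which is why the choice of unit does not affect any of the comparisons.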
Entropy is the lower bound on the average number of bits per symbol needed to losslessly encode samples from $X$ (Shannon's source-coding theorem), which makes it the foundation of lossless compression. It also appears in machine learning loss functions (cross-entropy), in feature-selection scores, and in statistical mechanics.
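One way to see the source-coding bound in action is to compare entropy with the average codeword length of a concrete prefix code. The sketch below uses a binary Huffman code, which is a technique chosen here for illustration and not mentioned in the text; the helper names and the two example distributions are likewise assumptions. For a Huffman code the average length L satisfies H(X) <= L < H(X) + 1 when entropy is measured in bits.

```python
import heapq
import math

def entropy_bits(probs):
    """Shannon entropy in bits, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def huffman_lengths(probs):
    """Codeword length of each symbol under a binary Huffman code."""
    if len(probs) == 1:
        return [1]  # a lone symbol still needs one bit per sample
    # Heap entries: (subtree probability, unique tie-breaker, symbols in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol beneath them.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie, s1 + s2))
        tie += 1
    return lengths

def average_length(probs):
    return sum(p * n for p, n in zip(probs, huffman_lengths(probs)))

dyadic = [0.5, 0.25, 0.125, 0.125]   # powers of 1/2: the bound is met exactly
general = [0.4, 0.3, 0.2, 0.1]       # otherwise the code is slightly longer

for probs in (dyadic, general):
    print(entropy_bits(probs), average_length(probs))
```

The dyadic case shows the bound can be tight, while the second distribution shows the typical gap of less than one bit per symbol.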