A random variable assigns a numeric value to each outcome of a random experiment. We don't care about the mechanism producing the value — we care about the distribution over its possible values.
Discrete vs continuous
A discrete random variable takes on values from a countable set (the result of a die roll, the number of customers in a queue). A continuous random variable can take any value in a range (height, temperature). The distribution is described by a probability mass function (PMF) in the discrete case and a probability density function (PDF) in the continuous case.
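The difference between a PMF and a PDF can be made concrete with a minimal sketch. The fair die, the interval endpoints, and the helper names `uniform_pdf` and `integrate` are illustrative assumptions, not part of the text above. The point is that a PMF value is itself a probability, while a PDF value only becomes a probability after integrating over an interval.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face carries probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# PMF values are probabilities, and together they sum to 1.
total_mass = sum(die_pmf.values())

# The probability of an event is a sum of PMF values: P(roll is even).
p_even = sum(p for face, p in die_pmf.items() if face % 2 == 0)


def uniform_pdf(x, a=0.0, b=2.0):
    """Density of a variable spread evenly over [a, b] (illustrative choice)."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def integrate(f, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


# A PDF yields probabilities only through integration: P(0.5 <= X <= 1.0).
p_interval = integrate(uniform_pdf, 0.5, 1.0)

# A density value is not a probability: on [0, 0.5] it equals 2 everywhere.
density_narrow = uniform_pdf(0.25, a=0.0, b=0.5)
```

Here `p_even` is exactly 1/2 and `p_interval` is approximately 0.25, while `density_narrow` is 2.0, which is a valid density even though it exceeds 1.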
Common discrete distributions
- Bernoulli($p$): 1 with probability $p$, 0 with probability $1 - p$. A single coin flip.
- Binomial($n, p$): number of successes in $n$ independent Bernoulli trials.
- Poisson($\lambda$): count of events in a fixed interval when events happen at average rate $\lambda$.
- Geometric($p$): number of trials until the first success.
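The four discrete distributions above can be written down as PMFs in a few lines. This is a sketch under stated assumptions: `p` is the success probability, `n` the number of trials, `lam` the expected count per interval, and the geometric variable counts trials up to and including the first success (some texts count only the failures instead). The function names are hypothetical.

```python
import math


def bernoulli_pmf(k, p):
    """P(X = k) for a single trial with success probability p."""
    if k == 1:
        return p
    if k == 0:
        return 1.0 - p
    return 0.0


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    if not 0 <= k <= n:
        return 0.0
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(exactly k events) when the expected count per interval is lam."""
    if k < 0:
        return 0.0
    return math.exp(-lam) * lam**k / math.factorial(k)


def geometric_pmf(k, p):
    """P(first success on trial k), k = 1, 2, ... (assumed convention)."""
    if k < 1:
        return 0.0
    return (1.0 - p) ** (k - 1) * p


# Sanity checks: each PMF sums to (about) 1 and has the textbook mean.
binomial_total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
binomial_mean = sum(k * binomial_pmf(k, 10, 0.3) for k in range(11))  # n * p
poisson_mean = sum(k * poisson_pmf(k, 4.0) for k in range(60))  # lam
geometric_mean = sum(k * geometric_pmf(k, 0.25) for k in range(1, 1000))  # 1 / p
```

The infinite sums for the Poisson and geometric cases are truncated, so the computed means are approximations to `lam` and `1 / p`. A Binomial with `n = 1` reduces to the Bernoulli, which is one way to check the two functions against each other.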
Common continuous distributions
- Uniform($a, b$): all values in $[a, b]$ equally likely.
- Normal($\mu, \sigma^2$): the Gaussian "bell curve"; the central limit theorem makes it appear everywhere.
- Exponential($\lambda$): time between events in a Poisson process; memoryless.
- Beta($\alpha, \beta$): distribution over probabilities; the conjugate prior for the Bernoulli.
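Two of the properties named in this list, memorylessness and conjugacy, can be checked numerically. The sketch below assumes the exponential is parameterized by its rate `lam` and the Beta by shape parameters `alpha` and `beta`. The helper names and the sample data are made up for illustration.

```python
import math


def exponential_survival(t, lam):
    """P(X > t) for X ~ Exponential with rate lam."""
    return math.exp(-lam * t) if t >= 0 else 1.0


lam, s, t = 0.5, 3.0, 2.0

# Memorylessness: having already waited s, the chance of waiting t more
# is the same as the chance of waiting t from scratch.
conditional = exponential_survival(s + t, lam) / exponential_survival(s, lam)
unconditional = exponential_survival(t, lam)


def beta_update(alpha, beta, observations):
    """Posterior Beta parameters after observing 0/1 Bernoulli outcomes."""
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Beta(1, 1) is the uniform distribution on [0, 1]: no preference for any p.
prior = (1.0, 1.0)
flips = [1, 0, 1, 1, 0, 1, 1, 1]  # illustrative data: 6 successes, 2 failures
posterior = beta_update(*prior, flips)
posterior_mean = beta_mean(*posterior)
```

With these numbers the posterior is Beta(7, 3), with mean 0.7. Conjugacy is what makes the update a matter of adding counts: the posterior stays in the Beta family, so no integration is needed.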
CDF
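The cumulative distribution function $F(x) = P(X \le x)$ is defined for both discrete and continuous random variables. It is always non-decreasing, starts at 0 ($F(x) \to 0$ as $x \to -\infty$), and ends at 1 ($F(x) \to 1$ as $x \to \infty$). $F$ uniquely determines the distribution.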
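As a small illustration of these properties, the sketch below builds the CDF of a fair die by hand and uses `statistics.NormalDist` from the Python standard library for a continuous case. The die, the standard normal, and the evaluation grid are illustrative choices.

```python
from statistics import NormalDist


def die_cdf(x):
    """P(X <= x) for a fair six-sided die: a step function."""
    faces_at_or_below = sum(1 for face in range(1, 7) if face <= x)
    return faces_at_or_below / 6


standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Interval probabilities are differences of CDF values:
# P(a < X <= b) = cdf(b) - cdf(a).
p_die_3_to_5 = die_cdf(5) - die_cdf(2)  # faces 3, 4, 5
p_within_one_sigma = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)

# Check monotonicity on a grid from -6 to 6 for both CDFs.
grid = [x / 10 for x in range(-60, 61)]
normal_values = [standard_normal.cdf(x) for x in grid]
die_values = [die_cdf(x) for x in grid]
is_non_decreasing = all(
    earlier <= later
    for values in (normal_values, die_values)
    for earlier, later in zip(values, values[1:])
)
```

The die CDF jumps by 1/6 at each face and is flat in between, while the normal CDF rises smoothly. Both go from 0 to 1, and the same difference-of-CDF formula gives interval probabilities in either case: 0.5 for the die and about 0.683 for one standard deviation around the normal mean.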
Why this matters
Nearly every model in statistics and machine learning assumes some distribution, usually as a prior, a likelihood, or a noise term. Knowing the standard distributions and when they apply lets you read papers, debug models, and reason about which assumption you're violating.