Logistic regression models the probability of a binary outcome as a logistic function of a linear combination of inputs. Despite the name, it is a classification model, and a standard baseline for tabular classification problems.
The model
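$$P(y = 1 \mid x) = \sigma(\beta_0 + \beta^\top x), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}$$
where $\sigma$ is the logistic sigmoid. The linear combination $\beta_0 + \beta^\top x$ is squashed into $(0, 1)$ so that the output is a valid probability.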
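As a minimal sketch of the forward computation, the snippet below evaluates the sigmoid of a linear combination for one input vector. The function names and the coefficient values are hypothetical, chosen only for illustration; the sigmoid is written in a branch form that avoids overflow for large negative inputs.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic sigmoid 1 / (1 + exp(-z)), branched to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(x, beta0, beta):
    """P(y = 1 | x) for a single feature vector x."""
    z = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return sigmoid(z)

# Hypothetical coefficients and input, for illustration only.
beta0, beta = -1.0, [0.8, -0.5]
p = predict_proba([2.0, 1.0], beta0, beta)  # linear part: -1 + 1.6 - 0.5 = 0.1
```

With a linear part of 0.1 the predicted probability sits slightly above 0.5, since the sigmoid maps 0 to exactly 0.5.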
Log-odds
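The model is linear in the log-odds: $\log\frac{p}{1 - p} = \beta_0 + \beta^\top x$, where $p = P(y = 1 \mid x)$. Each coefficient $\beta_j$ is the change in log-odds per unit change in $x_j$, holding the other inputs fixed. Exponentiating gives an odds ratio: $e^{\beta_j}$ is the multiplicative change in odds per unit of $x_j$.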
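The odds-ratio reading can be checked numerically. The sketch below uses a hypothetical one-feature model (intercept -2.0, slope 0.7, both made up) and compares the ratio of odds at two inputs one unit apart with the exponentiated slope.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Convert a probability to odds p / (1 - p)."""
    return p / (1.0 - p)

# Hypothetical coefficients, for illustration only.
beta0, beta1 = -2.0, 0.7

p_at_3 = sigmoid(beta0 + beta1 * 3.0)
p_at_4 = sigmoid(beta0 + beta1 * 4.0)

ratio = odds(p_at_4) / odds(p_at_3)  # observed change in odds for +1 unit
odds_ratio = math.exp(beta1)         # predicted by the coefficient, about 2.01
```

The two numbers agree at any starting point, whereas the change in probability for the same one-unit step depends on where along the curve the step is taken.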
Estimation
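Maximize the log-likelihood: $\ell(\beta_0, \beta) = \sum_{i=1}^{n} \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]$, where $p_i = \sigma(\beta_0 + \beta^\top x_i)$. No closed form; solved iteratively (Newton-Raphson, gradient descent). The negative log-likelihood is convex in the coefficients, so any local optimum is global. The optimum is unique when the inputs are not collinear, and it is finite only when the classes are not perfectly separable.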
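A minimal sketch of iterative estimation, using plain gradient ascent on the mean log-likelihood for a single feature plus intercept. The data, step size, and iteration count are made up for illustration; a production fit would normally use Newton-Raphson or a library solver.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_likelihood(xs, ys, b0, b1):
    """Sum over observations of y*log(p) + (1 - y)*log(1 - p)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def gradient(xs, ys, b0, b1):
    """Gradient of the mean log-likelihood with respect to (b0, b1)."""
    n = len(xs)
    residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
    g0 = sum(residuals) / n
    g1 = sum(r * x for r, x in zip(residuals, xs)) / n
    return g0, g1

def fit(xs, ys, lr=0.3, steps=20000):
    """Gradient ascent from zero; lr and steps are illustrative choices."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0, g1 = gradient(xs, ys, b0, b1)
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

# Made-up, non-separable data: the classes overlap between x = 1.5 and x = 3.0.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

b0_hat, b1_hat = fit(xs, ys)
```

The gradient has the simple form "residual times input", which is why the update needs nothing beyond the predicted probabilities. At the fitted coefficients the gradient should be close to zero, which is a convenient convergence check.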
Decision threshold
For a binary decision, classify $\hat{y} = 1$ if the predicted probability satisfies $\hat{p} \ge 0.5$ (or any other threshold). Different thresholds trade off precision and recall: for imbalanced classes or asymmetric costs, the optimal threshold is generally not 0.5.
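The trade-off can be seen by sweeping the threshold over a fixed set of scores. The probabilities and labels below are made up for illustration, and `precision_recall` is a hypothetical helper, not a library function.

```python
def precision_recall(probs, labels, threshold):
    """Precision and recall when predicting 1 for probs >= threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for yhat, y in zip(preds, labels) if yhat == 1 and y == 1)
    fp = sum(1 for yhat, y in zip(preds, labels) if yhat == 1 and y == 0)
    fn = sum(1 for yhat, y in zip(preds, labels) if yhat == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall

# Made-up predicted probabilities and true labels (4 positives out of 10).
probs = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.45, 0.55, 0.60, 0.90]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 0, 1]

low = precision_recall(probs, labels, 0.25)   # precision 0.50, recall 0.75
mid = precision_recall(probs, labels, 0.50)   # precision 2/3,  recall 0.50
high = precision_recall(probs, labels, 0.80)  # precision 1.00, recall 0.25
```

Raising the threshold here buys precision at the cost of recall; which point is best depends on the relative cost of false positives and false negatives.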
Multinomial logistic
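Extension to $K$ classes. One linear combination per class, softmaxed: $P(y = k \mid x) = \frac{\exp(\beta_{k0} + \beta_k^\top x)}{\sum_{j=1}^{K} \exp(\beta_{j0} + \beta_j^\top x)}$. This is also known as softmax regression, and it has the same form as a softmax output layer of a neural network.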
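A short sketch of the softmax step, assuming per-class intercepts and weight vectors (the values below are hypothetical). Subtracting the maximum score before exponentiating is a standard stability trick and does not change the result.

```python
import math

def softmax(scores):
    """Map a list of real scores to probabilities that sum to 1."""
    m = max(scores)  # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba_multiclass(x, intercepts, weights):
    """One linear score per class, then softmax across classes."""
    scores = [
        b0 + sum(w * xi for w, xi in zip(ws, x))
        for b0, ws in zip(intercepts, weights)
    ]
    return softmax(scores)

# Hypothetical 3-class model with 2 features, for illustration only.
intercepts = [0.0, 0.5, -0.5]
weights = [[1.0, -1.0], [0.2, 0.3], [-0.7, 0.4]]
probs = predict_proba_multiclass([1.0, 2.0], intercepts, weights)

# With two classes and one score fixed at 0, softmax reduces to the sigmoid.
two_class = softmax([0.8, 0.0])[0]
binary = 1.0 / (1.0 + math.exp(-0.8))
```

The two-class check at the end shows why the binary model is a special case: only the difference between class scores matters.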
Why logistic, not linear
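You could fit a linear model directly to 0/1 labels (the linear probability model, LPM). It gives unbiased coefficients, but predicted probabilities can fall outside $[0, 1]$, and the constant-variance assumption is violated by construction: a binary outcome has variance $p(1 - p)$, which changes with $x$. Logistic regression is the more appropriate model when the outcome is genuinely binary.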
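To illustrate the range problem, the sketch below fits ordinary least squares to 0/1 labels on a single feature. The data are made up, and `fit_ols` is a hypothetical helper using the closed-form simple-regression formulas.

```python
def fit_ols(xs, ys):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up binary outcome that becomes more likely as x grows.
xs = list(range(10))
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

intercept, slope = fit_ols(xs, ys)
fitted = [intercept + slope * x for x in xs]
# fitted[0] is slightly below 0 and fitted[-1] slightly above 1,
# so the LPM's "probabilities" already leave [0, 1] inside the sample.
```

Extrapolating beyond the observed range makes the violation larger, because the fitted line keeps growing while a probability cannot.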