Covariance measures how two variables move together. Correlation rescales it to $[-1, 1]$ for comparability across units.
Covariance
$\operatorname{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$. Positive when $X$ and $Y$ tend to be above (or below) their means together. Negative when one is above and the other below. Sign matters; magnitude is in units of $X$ times units of $Y$, which makes it hard to interpret.
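To make the unit dependence concrete, here is a minimal sketch of the sample covariance (using the $n - 1$ denominator, which is an assumption; the population version divides by $n$). The height and weight values are hypothetical illustration data, and `covariance` is a helper written for this example rather than a library function.

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the respective means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (len(xs) - 1)


# Hypothetical data: heights in metres and weights in kilograms.
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weight_kg = [50.0, 58.0, 66.0, 74.0, 82.0]

cov_m = covariance(height_m, weight_kg)        # in metre * kilogram

# Same people, heights re-expressed in centimetres.
height_cm = [h * 100 for h in height_m]
cov_cm = covariance(height_cm, weight_kg)      # in centimetre * kilogram

print(cov_m, cov_cm)  # the second value is about 100 times the first
```

The relationship between the variables has not changed, yet the covariance grows by a factor of 100 purely because of the unit change. Only the sign carries over unchanged.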
Pearson correlation
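$\rho_{X,Y} = \dfrac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}$. Dimensionless, in $[-1, 1]$. $\rho = 1$ means perfect positive linear relationship, $\rho = -1$ perfect negative, $\rho = 0$ no LINEAR relationship, but $X$ and $Y$ could still be perfectly dependent in a nonlinear way (think $Y = X^2$ on symmetric data: $\rho = 0$).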
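The zero-correlation-but-fully-dependent case can be checked numerically. The sketch below assumes a hand-written `pearson` helper (not a library call) and a small symmetric grid of x values chosen for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either input is constant.
    return sxy / math.sqrt(sxx * syy)


# Illustrative x values, symmetric around zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # exact linear function of x
squared = [x * x for x in xs]      # exact but nonlinear function of x

r_linear = pearson(xs, linear)     # close to 1
r_squared = pearson(xs, squared)   # close to 0 despite perfect dependence

print(r_linear, r_squared)
```

Both `linear` and `squared` are fully determined by `xs`, but only the linear one is visible to Pearson correlation. On the symmetric grid the positive and negative halves of the parabola cancel in the covariance sum.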
Spearman correlation
Pearson correlation applied to the ranks rather than the raw values. Measures monotonic (not necessarily linear) relationships. More robust to outliers than Pearson, because an extreme value affects only its rank. Use when the relationship is monotonic but possibly curved.
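As a sketch of "Pearson on the ranks", the code below ranks each variable (giving tied values their average rank, a common convention assumed here) and then reuses a plain Pearson helper. The helper names `average_ranks`, `pearson`, and `spearman` and the exponential example data are illustrative choices.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Monotonic but strongly curved relationship (illustrative data).
xs = [1, 2, 3, 4, 5]
ys = [math.exp(x) for x in xs]

r_pearson = pearson(xs, ys)     # noticeably below 1: the curve is not a line
r_spearman = spearman(xs, ys)   # close to 1: the ordering is perfectly preserved

print(r_pearson, r_spearman)
```

Because ranking discards the spacing between values, any strictly increasing transformation of `ys` leaves the Spearman value unchanged.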
Correlation is not causation
The most-quoted mantra in statistics. High correlation could mean: $X$ causes $Y$, $Y$ causes $X$, both are caused by a third variable $Z$, or pure coincidence in a small sample. Causal claims require either an intervention (randomized experiment) or strong identifying assumptions (instrumental variables, regression discontinuity, difference-in-differences).
Multicollinearity
When predictors in a regression are highly correlated with each other, the model's coefficient estimates become unstable: small changes in the data shift them wildly, and their standard errors inflate. Detect with variance inflation factors (VIF); fix by dropping variables, combining them (PCA), or regularizing.
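The VIF of predictor $j$ is $1 / (1 - R_j^2)$, where $R_j^2$ comes from regressing predictor $j$ on the other predictors. An equivalent route, used in the sketch below, reads the VIFs off the diagonal of the inverse of the predictors' correlation matrix. The helpers (`pearson`, `invert`, `vif`) and the three predictor columns are hypothetical; a real analysis would more likely lean on a numerical library.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment with the identity; the right half becomes the inverse.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Variance inflation factor per predictor: diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical predictors: x2 is roughly 2 * x1, x3 is unrelated to x1.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
x3 = [4, 7, 1, 6, 3, 8, 2, 5]

vifs = vif([x1, x2, x3])
print(vifs)  # very large for x1 and x2, close to 1 for x3
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic collinearity, though the threshold is a convention rather than a hard rule. Here dropping either `x1` or `x2` would bring the remaining VIFs back near 1.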
Partial correlation
The correlation between $X$ and $Y$ controlling for one or more other variables $Z$. Useful for asking "is the apparent relationship between $X$ and $Y$ explained by $Z$?"
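For a single control variable, the partial correlation can be written in terms of the three pairwise Pearson correlations:

$r_{XY \cdot Z} = \dfrac{r_{XY} - r_{XZ}\, r_{YZ}}{\sqrt{(1 - r_{XZ}^2)(1 - r_{YZ}^2)}}$

The sketch below applies that formula to hypothetical data in which a third variable `z` drives both `x` and `y`. The helper names and the numbers are illustrative assumptions, and with several control variables one would instead correlate the residuals from regressing each variable on the controls.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(xs, ys, zs):
    """First-order partial correlation of xs and ys, controlling for zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical confounder z; x and y are each z plus a small unrelated wiggle.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5, 7.5, 7.5]
y = [1.5, 2.5, 2.5, 3.5, 5.5, 6.5, 6.5, 7.5]

r_raw = pearson(x, y)                      # high: both track z
r_partial = partial_correlation(x, y, z)   # near zero once z is accounted for

print(r_raw, r_partial)
```

The raw correlation is strong only because both variables follow `z`. After controlling for `z`, almost nothing is left, which is the pattern expected when a third variable explains the apparent relationship.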