Principal Component Analysis (PCA) decomposes a high-dimensional dataset into orthogonal directions ordered by how much variance each captures.
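Given a centered data matrix $X \in \mathbb{R}^{n \times p}$ (rows are observations, columns are variables), the principal components are the eigenvectors of the covariance matrix $C = \frac{1}{n-1} X^\top X$, with eigenvalues equal to the variances along those directions:

$$C v_k = \lambda_k v_k, \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0.$$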
The first principal component points in the direction of maximum variance. The second is orthogonal to the first and captures the next-most variance, and so on.
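To make the eigenvector description concrete, here is a minimal NumPy sketch. It assumes the sample covariance is normalized by n - 1, and the helper name `pca_eig` and the synthetic data are illustrative choices, not part of any particular library.

```python
import numpy as np

def pca_eig(X):
    """Eigen-decompose the sample covariance of X (rows = observations).

    Returns eigenvalues (variances) in descending order and the matching
    eigenvectors as columns. Hypothetical helper for illustration only.
    """
    Xc = X - X.mean(axis=0)                  # center each column
    C = Xc.T @ Xc / (Xc.shape[0] - 1)        # sample covariance, shape (p, p)
    eigvals, eigvecs = np.linalg.eigh(C)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending variance
    return eigvals[order], eigvecs[:, order]

# Synthetic data: 500 observations of 3 correlated variables.
rng = np.random.default_rng(0)
A = np.array([[3.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.5, 0.2, 0.3]])
X = rng.standard_normal((500, 3)) @ A.T

eigvals, eigvecs = pca_eig(X)

# Project the centered data onto the principal directions.
scores = (X - X.mean(axis=0)) @ eigvecs
```

The variance of each column of `scores` matches the corresponding entry of `eigvals`, and the columns of `eigvecs` are mutually orthogonal, which is the ordering-by-variance property described above.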
PCA is widely used in finance. The first few components of a matrix of global stock returns are commonly read as the systematic factors driving the market: the first usually looks like "the overall market," the second is often interpreted as a style contrast such as "growth vs value," and so on. PCA-based factor models are a standard tool in risk management.
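As a rough illustration of the "overall market" component, the sketch below runs PCA on synthetic returns generated from a hypothetical one-factor model. None of this is real market data; the factor, the exposures (`betas`), and the volatilities are assumptions chosen only to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(42)
n_days, n_stocks = 1000, 8

# Hypothetical one-factor model: return = beta * market + stock-specific noise.
market = 0.01 * rng.standard_normal(n_days)
betas = rng.uniform(0.5, 1.5, size=n_stocks)
noise = 0.005 * rng.standard_normal((n_days, n_stocks))
returns = market[:, None] * betas + noise

# PCA via the sample covariance of the centered returns.
Rc = returns - returns.mean(axis=0)
C = Rc.T @ Rc / (n_days - 1)
eigvals, eigvecs = np.linalg.eigh(C)               # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # flip to descending

# Eigenvector signs are arbitrary; flip PC1 so its loadings sum to a positive number.
pc1 = eigvecs[:, 0] * np.sign(eigvecs[:, 0].sum())

explained = eigvals / eigvals.sum()                 # share of variance per component
pc1_scores = Rc @ pc1                               # time series of the first component
corr_with_market = np.corrcoef(pc1_scores, market)[0, 1]
```

In this synthetic setup every stock loads on `pc1` with the same sign, `explained[0]` accounts for most of the variance, and `pc1_scores` tracks the simulated market factor closely. Real return data is noisier, and later components are usually harder to label.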
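Watch out: PCA is a linear method, so it cannot capture nonlinear structure. It is also sensitive to the scale of the variables, and the components carry no inherent interpretation. If variables are measured in very different units, standardize them first (for example, to zero mean and unit variance), which is equivalent to running PCA on the correlation matrix.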
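The scale sensitivity is easy to see on a small synthetic example. The sketch below assumes two correlated variables whose units differ by several orders of magnitude; the variable names and the helper `first_component` are hypothetical.

```python
import numpy as np

def first_component(X):
    """Return the unit eigenvector of the sample covariance with the largest eigenvalue."""
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / (Xc.shape[0] - 1)
    _, eigvecs = np.linalg.eigh(C)   # ascending order, so the last column is PC1
    return eigvecs[:, -1]

# Two correlated synthetic variables on very different scales.
rng = np.random.default_rng(1)
shared = rng.standard_normal(1000)
small_scale = 1.7 + 0.1 * (shared + 0.5 * rng.standard_normal(1000))
large_scale = 50_000 + 15_000 * (shared + 0.5 * rng.standard_normal(1000))
X = np.column_stack([small_scale, large_scale])

# Raw data: PC1 is dominated by whichever variable has the larger variance.
raw_pc1 = first_component(X)

# Standardized data: each variable has unit variance, so both contribute.
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
std_pc1 = first_component(X_std)
```

On the raw data, `raw_pc1` points almost entirely along the large-scale variable. After standardization the two loadings in `std_pc1` have equal magnitude, because PCA is then operating on the 2 x 2 correlation matrix.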