Regression Diagnostics — Section 11: Regression Analysis

Residuals are more informative than the headline $R^2$. A few key diagnostics:

Linearity: residuals vs fitted values should look like a featureless cloud, not a curve. A systematic pattern suggests the model is missing a nonlinear term.
Homoscedasticity: residual variance shouldn't depend on $\hat{y}$. If it does, use weighted least squares or robust ("sandwich") standard errors.
Normality of residuals: check via a Q-Q plot. Matters mostly for $t$-test inference, not for the point estimates themselves.
Independence: for time series, plot residuals over time and check the Durbin-Watson statistic for serial correlation.
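
Two of the checks above reduce to short formulas, sketched here with NumPy only. This is an illustrative sketch, not a reference implementation: the helper names (`ols_fit`, `durbin_watson`, `hc0_standard_errors`) and the synthetic data are assumptions made for the example, and HC0 (White) is just one of several sandwich estimators.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares; X is assumed to already include an intercept column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return beta, fitted, y - fitted

def durbin_watson(resid):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    diff = np.diff(resid)
    return float(diff @ diff / (resid @ resid))

def hc0_standard_errors(X, resid):
    """White (HC0) sandwich standard errors for the OLS coefficients."""
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * resid[:, None] ** 2)  # X' diag(e^2) X
    cov = bread @ meat @ bread
    return np.sqrt(np.diag(cov))

# Hypothetical synthetic data whose noise grows with x (heteroscedastic by construction).
rng = np.random.default_rng(0)
n = 200
x = np.linspace(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 0.5 * x + rng.normal(scale=0.2 * x)

beta, fitted, resid = ols_fit(X, y)
dw = durbin_watson(resid)
robust_se = hc0_standard_errors(X, resid)
print("coefficients:", beta)
print("Durbin-Watson:", dw)
print("HC0 standard errors:", robust_se)
```

The Durbin-Watson statistic lies between 0 and 4; values near 2 are consistent with no first-order serial correlation, values well below 2 point to positive autocorrelation, and values well above 2 to negative autocorrelation. Plotting `resid` against `fitted` for this data would show the fan shape that the homoscedasticity check is meant to catch.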

Influential points are another quiet hazard:

Leverage: how unusual is $x_i$ in predictor space? Points whose $x$ lies far from the bulk of the data pull the fit toward themselves.
Cook's distance combines leverage and residual size to quantify a single point's influence on the fitted coefficients.
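
Both quantities can be computed directly from the design matrix. The sketch below assumes the standard definitions: leverage $h_{ii}$ is the $i$-th diagonal entry of the hat matrix $X(X^\top X)^{-1}X^\top$, and Cook's distance is $D_i = \frac{e_i^2}{p\,s^2}\cdot\frac{h_{ii}}{(1-h_{ii})^2}$ with $s^2 = \mathrm{RSS}/(n-p)$. The function name `leverage_and_cooks` and the toy data are hypothetical, chosen only to illustrate the idea.

```python
import numpy as np

def leverage_and_cooks(X, y):
    """Return (leverage, Cook's distance) for an OLS fit of y on X.

    X is assumed to include an intercept column and to have full column rank.
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    # Diagonal of the hat matrix: h_ii = x_i' (X'X)^{-1} x_i
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)  # residual variance estimate
    cooks = resid**2 / (p * s2) * h / (1.0 - h) ** 2
    return h, cooks

# Toy data: five points close to a line, plus one point far out in x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 8.0])
X = np.column_stack([np.ones_like(x), x])

h, cooks = leverage_and_cooks(X, y)
most_influential = int(np.argmax(cooks))
print("leverage:", np.round(h, 3))
print("Cook's distance:", np.round(cooks, 3))
print("most influential index:", most_influential)
```

In this toy data the distant point has by far the largest leverage and Cook's distance even though its raw residual is not the largest, because the fitted line has already been pulled toward it. That is why a residual plot alone can miss influential points.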

A handful of high-leverage points can drive an entire regression. Always plot the diagnostics, especially before trusting a regression on financial data with outliers.
