When you want to compare means between groups, a small set of standard tests covers nearly every case. Knowing which test fits which scenario is half the battle.
One-sample t-test
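Does the population mean differ from some hypothesized value $\mu_0$? The test statistic is $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$, with $n - 1$ degrees of freedom, where $\bar{x}$ is the sample mean, $s$ the sample standard deviation, and $n$ the sample size. Use when you have one sample and a reference value.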
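As a minimal sketch, the statistic $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$ can be computed by hand and checked against SciPy's `ttest_1samp`. The measurements and the reference value `mu0` below are made up for illustration.

```python
import math
from scipy import stats

# Hypothetical measurements and reference value (illustrative only).
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.7]
mu0 = 5.0

n = len(sample)
mean = sum(sample) / n
# Sample standard deviation with the n - 1 denominator.
s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))

t_manual = (mean - mu0) / (s / math.sqrt(n))
df = n - 1  # degrees of freedom

# The same test via SciPy (two-sided by default).
result = stats.ttest_1samp(sample, popmean=mu0)
t_scipy, p_value = result.statistic, result.pvalue

print(f"t (manual) = {t_manual:.4f}, t (scipy) = {t_scipy:.4f}, df = {df}, p = {p_value:.4f}")
```

The hand-computed and library statistics should agree up to floating-point error; the p-value comes from a t distribution with `df` degrees of freedom.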
Two-sample t-test
Compare means of two independent groups. Variants: pooled (assume equal variances) or Welch's (don't). Welch's is the safer default; use it unless you have a specific reason to assume equal variance.
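A short sketch of both variants using SciPy's `ttest_ind`, where the `equal_var` flag switches between the pooled test and Welch's test. The two groups are invented, with deliberately different spreads and sample sizes so that the two variants give visibly different answers.

```python
from scipy import stats

# Hypothetical data: group_b is much more variable than group_a.
group_a = [12.1, 11.8, 12.4, 12.0, 11.9, 12.2]
group_b = [13.5, 10.2, 15.1, 11.0, 14.8, 9.7, 16.0, 12.3, 13.9, 10.6]

# Welch's t-test: does not assume equal variances.
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

# Pooled (Student's) t-test: assumes equal variances.
pooled = stats.ttest_ind(group_a, group_b, equal_var=True)

print(f"Welch:  t = {welch.statistic:.4f}, p = {welch.pvalue:.4f}")
print(f"Pooled: t = {pooled.statistic:.4f}, p = {pooled.pvalue:.4f}")
```

When the group sizes are equal, the two statistics coincide and only the degrees of freedom differ; the gap opens up when both the sizes and the variances are unequal, as here.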
Paired t-test
Two measurements on the SAME subjects (e.g., before/after). Compute differences, run a one-sample t-test on them. Much more powerful than a two-sample t-test when the within-subject correlation is high.
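The equivalence between a paired test and a one-sample test on the differences can be sketched with SciPy as follows. The before/after scores are hypothetical, chosen so that subjects differ a lot from each other but each changes by a similar amount.

```python
from scipy import stats

# Hypothetical before/after scores for the same eight subjects.
before = [72, 85, 90, 66, 78, 95, 81, 70]
after = [75, 88, 92, 70, 80, 99, 85, 72]

# Paired t-test on the two measurement columns.
paired = stats.ttest_rel(after, before)

# Equivalent: one-sample t-test on the per-subject differences against 0.
diffs = [a - b for a, b in zip(after, before)]
one_sample = stats.ttest_1samp(diffs, popmean=0.0)

# For contrast: treating the columns as independent groups ignores the pairing.
unpaired = stats.ttest_ind(after, before, equal_var=False)

print(f"paired:     t = {paired.statistic:.3f}, p = {paired.pvalue:.5f}")
print(f"one-sample: t = {one_sample.statistic:.3f}, p = {one_sample.pvalue:.5f}")
print(f"unpaired:   t = {unpaired.statistic:.3f}, p = {unpaired.pvalue:.5f}")
```

With this data the paired test detects the shift easily, while the unpaired test does not, because between-subject variation swamps the small, consistent within-subject change.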
ANOVA
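Compare means across three or more groups. ANOVA tests whether AT LEAST ONE group's mean differs from the others. If the result is significant, follow up with post-hoc comparisons (Tukey HSD, or pairwise tests with a Bonferroni correction) to identify WHICH groups differ. Running many uncorrected pairwise t-tests instead of ANOVA inflates the overall Type I error rate, because every extra test is another chance of a false positive.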
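A sketch of the ANOVA-then-post-hoc workflow using SciPy's `f_oneway` and `tukey_hsd`. This assumes SciPy 1.8 or later, where `tukey_hsd` is available; the three groups are made up. The last lines illustrate the error inflation with the simplifying assumption that the pairwise tests are independent, which they are not exactly, so treat the figure as a rough guide.

```python
from scipy import stats

# Hypothetical measurements: g3 is shifted upward, g1 and g2 are similar.
g1 = [20.1, 21.3, 19.8, 20.7, 20.4]
g2 = [20.5, 21.0, 20.2, 19.9, 20.8]
g3 = [24.9, 25.6, 24.2, 25.1, 25.8]

# One-way ANOVA: is at least one group mean different?
anova = stats.f_oneway(g1, g2, g3)
print(f"ANOVA: F = {anova.statistic:.2f}, p = {anova.pvalue:.2e}")

# Post-hoc Tukey HSD: which pairs differ? (assumes SciPy >= 1.8)
tukey = stats.tukey_hsd(g1, g2, g3)
# tukey.pvalue[i, j] is the adjusted p-value for group i vs group j.
print(f"g1 vs g2: p = {tukey.pvalue[0, 1]:.4f}")
print(f"g1 vs g3: p = {tukey.pvalue[0, 2]:.2e}")
print(f"g2 vs g3: p = {tukey.pvalue[1, 2]:.2e}")

# Rough familywise error rate for k uncorrected tests at level alpha,
# under the simplifying assumption of independent tests.
alpha, k = 0.05, 3
familywise = 1 - (1 - alpha) ** k
print(f"familywise error for {k} tests: {familywise:.6f}")  # 0.142625
```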
Non-parametric variants
When data is non-normal or has outliers, use rank-based tests: Mann-Whitney U (two independent groups), Wilcoxon signed-rank (paired), Kruskal-Wallis (three or more groups, the counterpart of one-way ANOVA). They are somewhat less powerful when the data IS normal but more robust when it isn't.
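The three rank-based tests map onto SciPy's `mannwhitneyu`, `wilcoxon`, and `kruskal`. The sketch below uses invented data; the first comparison includes one extreme outlier to show how it can mask a difference in a t-test but not in a rank test.

```python
from scipy import stats

# Two independent groups; every value in b exceeds every value in a,
# but b contains one extreme outlier (40.0). Data is hypothetical.
a = [1.1, 1.3, 1.2, 1.4, 1.0, 1.5]
b = [2.1, 2.4, 2.2, 2.6, 2.3, 40.0]

t_test = stats.ttest_ind(a, b, equal_var=False)
mwu = stats.mannwhitneyu(a, b, alternative="two-sided")
print(f"Welch t-test:   p = {t_test.pvalue:.4f}")
print(f"Mann-Whitney U: p = {mwu.pvalue:.4f}")

# Paired data: Wilcoxon signed-rank on before/after measurements.
before = [10.2, 11.5, 9.8, 12.1, 10.9, 11.0, 9.5, 10.4]
after = [11.0, 12.8, 10.1, 13.5, 11.5, 12.1, 9.9, 11.3]
wil = stats.wilcoxon(after, before)
print(f"Wilcoxon signed-rank: p = {wil.pvalue:.4f}")

# Three groups: Kruskal-Wallis as the rank-based counterpart of one-way ANOVA.
g1 = [20.1, 21.3, 19.8, 20.7, 20.4]
g2 = [20.5, 21.0, 20.2, 19.9, 20.8]
g3 = [24.9, 25.6, 24.2, 25.1, 25.8]
kw = stats.kruskal(g1, g2, g3)
print(f"Kruskal-Wallis: H = {kw.statistic:.2f}, p = {kw.pvalue:.4f}")
```

In the first comparison the outlier inflates the variance of `b` so much that the t-test loses the signal, while the rank test only sees that every `b` value outranks every `a` value.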
Chi-squared
For categorical data: does the observed frequency table differ from expected? Two flavors: goodness-of-fit (one categorical variable, hypothesized distribution) and independence (two categorical variables, are they associated?).
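Both flavors are available in SciPy: `chisquare` for goodness-of-fit and `chi2_contingency` for independence. The counts below are hypothetical (120 rolls of a die, and a made-up 2x2 table).

```python
from scipy import stats

# Goodness-of-fit: are 120 die rolls consistent with a fair die?
observed = [18, 22, 16, 25, 19, 20]   # hypothetical counts per face
expected = [20] * 6                    # fair die: 120 / 6 per face
gof = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"goodness-of-fit: chi2 = {gof.statistic:.2f}, p = {gof.pvalue:.4f}")

# Independence: is the row variable associated with the column variable?
# Rows and columns are two hypothetical categorical variables.
table = [[30, 10],
         [15, 25]]
# For 2x2 tables SciPy applies Yates' continuity correction by default.
chi2, p, dof, expected_counts = stats.chi2_contingency(table)
print(f"independence: chi2 = {chi2:.2f}, p = {p:.4f}, dof = {dof}")
print("expected counts under independence:")
print(expected_counts)
```

The expected counts returned by `chi2_contingency` are the row total times the column total divided by the grand total, which is what the table would look like if the two variables were unrelated.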
Assumption checks
Before running a test, check the assumptions: normality (Q-Q plot or Shapiro-Wilk), equal variances if applicable (Levene's), independence (a matter of study design, not something a test can confirm). With large samples (a common rule of thumb is $n \ge 30$), the CLT covers most normality concerns; with smaller samples, robust or non-parametric alternatives are often safer.
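The checks named above can be sketched with SciPy's `shapiro`, `probplot`, and `levene`. The data is hypothetical, and the Q-Q plot is summarized here by its correlation coefficient instead of being drawn, to keep the example free of plotting dependencies.

```python
from scipy import stats

# Hypothetical groups with clearly different spreads.
group_a = [12.1, 11.8, 12.4, 12.0, 11.9, 12.2]
group_b = [13.5, 10.2, 15.1, 11.0, 14.8, 9.7, 16.0, 12.3, 13.9, 10.6]

# Normality: Shapiro-Wilk (a small p-value suggests non-normality).
shapiro_b = stats.shapiro(group_b)
print(f"Shapiro-Wilk: W = {shapiro_b.statistic:.4f}, p = {shapiro_b.pvalue:.4f}")

# Normality: Q-Q data against a normal distribution. probplot returns the
# theoretical and ordered sample quantiles plus a least-squares fit; r close
# to 1 means the points lie near a straight line.
(theoretical_q, sample_q), (slope, intercept, r) = stats.probplot(group_b, dist="norm")
print(f"Q-Q correlation r = {r:.4f}")

# Equal variances: Levene's test (median-centered by default in SciPy).
levene = stats.levene(group_a, group_b)
print(f"Levene: W = {levene.statistic:.2f}, p = {levene.pvalue:.4f}")
```

With samples this small, Shapiro-Wilk has little power, so a non-significant result is weak evidence of normality; looking at the Q-Q points directly is usually more informative.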