Every hypothesis test can be wrong in two distinct ways, and the framework names both:
A Type I error rejects $H_0$ when it's actually true: a false alarm. Its rate is $\alpha$, the significance level you chose.
A Type II error fails to reject $H_0$ when $H_1$ is actually true: a missed detection. Its rate is $\beta$.
Power = $1 - \beta$, the probability of correctly rejecting a false $H_0$. Higher power means you'll catch real effects more often.
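To make these three quantities concrete, here is a minimal sketch that computes them for one specific setup. The setup is an assumption, not something fixed by the definitions above: a one-sided z-test of a zero mean against a positive mean, with known standard deviation. The function name `error_rates` and the example numbers are hypothetical.

```python
from statistics import NormalDist


def error_rates(alpha, effect, sigma, n):
    """Return (alpha, beta, power) for a one-sided z-test.

    Assumed setup (illustrative only): H0 says the mean is 0, H1 says the
    mean equals `effect` > 0, observations are normal with known `sigma`,
    and `n` is the sample size.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Reject H0 when the z-statistic exceeds this critical value.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under H1 the z-statistic is normal with this mean and unit variance.
    shift = effect * n ** 0.5 / sigma
    # Type II rate: probability the statistic stays below the critical value under H1.
    beta = std_normal.cdf(z_crit - shift)
    return alpha, beta, 1 - beta


# Hypothetical inputs chosen only to produce readable numbers.
alpha, beta, power = error_rates(alpha=0.05, effect=0.5, sigma=1.0, n=25)
print(f"alpha={alpha:.3f}  beta={beta:.3f}  power={power:.3f}")
```

Under these assumptions the Type I rate is whatever you pass in, while the Type II rate and power depend on the effect size, the noise, and the sample size.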
There's a fundamental trade-off: at a fixed sample size, lowering $\alpha$ (fewer false alarms) raises $\beta$ (more misses), and vice versa. The only ways to reduce both at once are to collect more data or improve the signal-to-noise ratio of your measurement.
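The trade-off can be checked numerically. The sketch below assumes a one-sided z-test with known standard deviation and a hypothetical true effect of 0.2 standard deviations; the function name `type_ii_rate` and the sample sizes are illustrative choices, not part of the text above.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def type_ii_rate(alpha, n, effect=0.2, sigma=1.0):
    """Type II error rate of a one-sided z-test (assumed setup, known sigma)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)  # rejection threshold
    return STD_NORMAL.cdf(z_crit - effect * n ** 0.5 / sigma)


alphas = [0.10, 0.05, 0.01]  # progressively stricter significance levels
betas_small_n = [type_ii_rate(a, n=50) for a in alphas]
betas_large_n = [type_ii_rate(a, n=200) for a in alphas]

for a, b_small, b_large in zip(alphas, betas_small_n, betas_large_n):
    print(f"alpha={a:.2f}  beta(n=50)={b_small:.3f}  beta(n=200)={b_large:.3f}")
```

Reading down a column, a stricter significance level gives a larger miss rate at the same sample size. Reading across a row, quadrupling the sample size lowers the miss rate without touching the significance level.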
In trading, low Type II error often matters more than low Type I. Missing a real alpha signal (alpha here meaning excess return, not the significance level) is usually costlier than a few false alarms, especially when each flagged signal is cheap to validate further.
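One way to see how this cost asymmetry moves the preferred significance level is a small expected-cost calculation. Everything in this sketch is a hypothetical assumption made for illustration: the share of candidate signals that are real (`p_real`), the relative costs of a false alarm and a miss, the one-sided z-test setup, and the standardized effect (`shift`). None of these numbers come from real trading data.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def expected_cost(alpha, cost_false_alarm, cost_miss, p_real=0.1, shift=2.0):
    """Expected cost per test under hypothetical costs and a one-sided z-test.

    `shift` is the assumed mean of the z-statistic when the signal is real.
    """
    beta = STD_NORMAL.cdf(STD_NORMAL.inv_cdf(1 - alpha) - shift)
    false_alarm_cost = (1 - p_real) * alpha * cost_false_alarm
    miss_cost = p_real * beta * cost_miss
    return false_alarm_cost + miss_cost


candidate_alphas = [0.01, 0.05, 0.10, 0.20]

# Case 1: a miss is assumed to cost 20x a false alarm (cheap follow-up checks).
best_when_misses_costly = min(
    candidate_alphas, key=lambda a: expected_cost(a, cost_false_alarm=1, cost_miss=20)
)
# Case 2: the costs are reversed, so false alarms dominate.
best_when_alarms_costly = min(
    candidate_alphas, key=lambda a: expected_cost(a, cost_false_alarm=20, cost_miss=1)
)

print("misses costly ->", best_when_misses_costly)
print("false alarms costly ->", best_when_alarms_costly)
```

Under these made-up costs, the looser threshold wins when misses are expensive and the strictest one wins when false alarms are expensive, which is the intuition behind tolerating more false alarms when follow-up validation is cheap.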