The Law of Large Numbers in one line: as the number of independent repeated trials grows, the sample mean converges to the expected value. In other words, short-run results are full of randomness; long-run results trend toward certainty. It's one of the most foundational — and most profound — theorems in probability.
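A quick simulation can make the convergence concrete. The sketch below assumes a fair six-sided die, whose expected value is 3.5. The function name `sample_mean_of_die_rolls` and the fixed seed are illustrative choices, not part of the law itself.

```python
import random


def sample_mean_of_die_rolls(n_rolls: int, seed: int = 0) -> float:
    """Average of n_rolls fair six-sided die rolls (expected value 3.5)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = sum(rng.randint(1, 6) for _ in range(n_rolls))
    return total / n_rolls


if __name__ == "__main__":
    for n in (10, 100, 10_000, 100_000):
        avg = sample_mean_of_die_rolls(n)
        print(f"n={n:>7,}  sample mean={avg:.4f}  gap from 3.5={abs(avg - 3.5):.4f}")
```

With only ten rolls the average can land well away from 3.5. With a hundred thousand rolls it should sit very close to it.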
The law's real power lies in its flip side: small samples are deeply unreliable. Many everyday cognitive errors come from over-generalizing from small samples: reading three articles and drawing a conclusion, trying something twice and declaring that it doesn't work, watching one quarter and forecasting the whole year. The law's advice is to stay humble until your sample is large enough. It also implies a practical strategy: iteration count is itself an advantage. People who can run many trials quickly and cheaply get closer to the truth.
One subtle point: the Law of Large Numbers is not the gambler's fallacy. It doesn't guarantee that you'll "recover" or "balance out" in the short run — it only guarantees mean convergence over a long enough horizon. Conflating the two is one of the most dangerous misreadings in everyday decisions.
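The distinction can be checked numerically. This sketch assumes a fair coin and averages over many repeated experiments. `average_gaps` is a hypothetical helper name. It measures two things: how far the head count sits from an even split, and how far the head proportion sits from 0.5.

```python
import random


def average_gaps(n_flips: int, n_runs: int = 500, seed: int = 0):
    """Return (mean |heads - n/2|, mean |heads/n - 0.5|) over n_runs experiments."""
    rng = random.Random(seed)
    total_count_gap = 0.0
    for _ in range(n_runs):
        # Each of the n_flips random bits stands in for one fair coin flip.
        heads = bin(rng.getrandbits(n_flips)).count("1")
        total_count_gap += abs(heads - n_flips / 2)
    mean_count_gap = total_count_gap / n_runs
    # |heads/n - 0.5| equals |heads - n/2| / n, so the second value follows directly.
    return mean_count_gap, mean_count_gap / n_flips


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        count_gap, proportion_gap = average_gaps(n)
        print(f"n={n:>9,}  count gap={count_gap:9.1f}  proportion gap={proportion_gap:.5f}")
```

Under these assumptions the count gap tends to grow as flips accumulate, while the proportion gap shrinks. Early imbalances are diluted by the growing denominator, not paid back.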
The Law of Large Numbers states that as trial count grows, the sample mean converges to the expected value. Small samples are unreliable and prone to misleading conclusions. The practical implication: increase your iteration count before drawing judgments, and never confuse short-run randomness with long-run certainty.
Regression to the mean is one of the most counterintuitive phenomena in statistics: when a variable shows an extreme measurement (very high or very low), the next measurement tends to land closer to average. Not because some "corrective force" is at work — but because extreme outcomes typically contain a large random component, and randomness rarely repeats itself exactly.
Francis Galton discovered this while studying the heights of parents and children: very tall parents tended to have somewhat-less-tall children, and very short parents had somewhat-taller children. He called it "regression toward mediocrity." The most dangerous trap with regression is false causal attribution: an employee performs badly, you criticize them, they improve next time — you think the criticism worked. They perform well, you praise them, they regress — you think praise made them lazy. In reality, both might be pure statistical regression, with no causal link to your intervention.
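The praise-and-criticism trap can be reproduced with no intervention at all. The sketch below assumes each observed score is a stable skill plus independent luck, both drawn from a standard normal distribution. This is a modeling assumption for illustration, and the names `simulate_two_rounds` and `group_means` are hypothetical.

```python
import random
from statistics import mean


def simulate_two_rounds(n_people: int = 10_000, seed: int = 0):
    """Two observed scores per person: same skill, fresh luck each round."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n_people)]
    round1 = [s + rng.gauss(0, 1) for s in skill]
    round2 = [s + rng.gauss(0, 1) for s in skill]
    return round1, round2


def group_means(round1, round2, fraction: float = 0.1):
    """Average round-1 and round-2 scores of the round-1 top and bottom groups."""
    order = sorted(range(len(round1)), key=lambda i: round1[i])
    k = max(1, int(len(order) * fraction))
    bottom, top = order[:k], order[-k:]
    return {
        "top": (mean(round1[i] for i in top), mean(round2[i] for i in top)),
        "bottom": (mean(round1[i] for i in bottom), mean(round2[i] for i in bottom)),
    }


if __name__ == "__main__":
    r1, r2 = simulate_two_rounds()
    for name, (first, second) in group_means(r1, r2).items():
        print(f"{name:>6}: round 1 = {first:+.2f}, round 2 = {second:+.2f}")
```

Nobody was praised or criticized between rounds, yet the top group should score lower the second time and the bottom group higher. Both stay on their own side of the population average, because the skill component persists while the luck component does not.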
Understanding regression keeps you from overreacting to random fluctuations — in investing, in management, in parenting. The "regression" after extreme performance is nearly inevitable and needs no causal explanation.
Regression to the mean explains why extreme performances — good or bad — tend to be followed by more average outcomes. The trap: we invent causal stories for what is merely statistical inevitability. Recognizing this prevents overreacting to outliers and misattributing results to interventions that had no real effect.
The power law (closely associated with the Pareto distribution and with long-tail distributions) describes radically uneven distributions: a tiny minority of nodes or events accounts for the vast majority of influence or resources. Under the normal distribution (the bell curve), most values cluster around the mean. In a power-law world the mean says little about a typical case, because the influence of extremes vastly exceeds the "average."
Power laws are everywhere. Illustrative figures run along the lines of 1% of papers drawing 50% of citations or 0.1% of videos drawing 90% of views; likewise, a handful of cities hold most of the population, and a few earthquakes cause most of the damage. This isn't an accident. It is typically driven by "rich-get-richer" positive feedback (known in network science as "preferential attachment"). The practical implication is profound: in a power-law world, your strategy should be to concentrate resources on a few high-leverage bets, not to spread them evenly. A normal-distribution world rewards "don't make mistakes"; a power-law world rewards "find the 10x winner."
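The rich-get-richer mechanism can be sketched with a toy growing network. The model here is an assumption chosen for illustration: each new node adds one link, either to an existing node picked in proportion to the links it already holds (preferential) or to one picked uniformly at random. The names `grow_network` and `top_share` are hypothetical.

```python
import random


def grow_network(n_nodes: int, preferential: bool, seed: int = 0):
    """Return the link count of every node after growing to n_nodes nodes."""
    rng = random.Random(seed)
    links = [1, 1]    # two starting nodes joined by one link
    tickets = [0, 1]  # node i appears once per link it holds
    for new_node in range(2, n_nodes):
        if preferential:
            target = rng.choice(tickets)      # chance proportional to links held
        else:
            target = rng.randrange(new_node)  # every existing node equally likely
        links[target] += 1
        tickets.append(target)
        links.append(1)                       # the newcomer holds its own new link
        tickets.append(new_node)
    return links


def top_share(values, fraction: float = 0.01) -> float:
    """Share of the total held by the top `fraction` of entries."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


if __name__ == "__main__":
    for label, flag in (("preferential", True), ("uniform", False)):
        links = grow_network(20_000, preferential=flag)
        print(f"{label:>12}: max links = {max(links)}, top 1% share = {top_share(links):.1%}")
```

The only difference between the two runs is the feedback rule. With preferential attachment the best-connected node and the top 1% should end up holding noticeably more than under uniform attachment.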
Power-law thinking also reveals a counterintuitive fact: in power-law systems, the "average" is a misleading metric. The "average return" of a VC fund is meaningless, because a few super-winners contribute nearly all the profit.
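To see why the average misleads, one can draw synthetic "returns" from a heavy-tailed distribution. The sketch below uses Python's `random.paretovariate` with a shape parameter of 1.2. That value is an arbitrary assumption for illustration, not an estimate of real venture returns, and `simulate_returns` and `summarize` are hypothetical names.

```python
import random
from statistics import median


def simulate_returns(n_bets: int = 100_000, alpha: float = 1.2, seed: int = 0):
    """Draw n_bets synthetic return multiples from a Pareto distribution."""
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n_bets)]


def summarize(returns, top_fraction: float = 0.01):
    """Compare the mean with the median and measure how concentrated the total is."""
    ranked = sorted(returns, reverse=True)
    k = max(1, int(len(ranked) * top_fraction))
    avg = sum(returns) / len(returns)
    return {
        "mean": avg,
        "median": median(returns),
        "share_below_mean": sum(r < avg for r in returns) / len(returns),
        "top_share_of_total": sum(ranked[:k]) / sum(ranked),
    }


if __name__ == "__main__":
    for key, value in summarize(simulate_returns()).items():
        print(f"{key:>20}: {value:.3f}")
```

In a sample like this the mean sits well above the median, most individual bets fall below the mean, and a small top slice carries a large part of the total. The "average" describes almost none of the individual outcomes.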
Power-law distributions mean a tiny minority captures the vast majority of outcomes: wealth, citations, returns, impact. Unlike in a bell curve, the average says little about a typical case. The strategic implication: concentrate resources on finding and maximizing the few high-leverage opportunities rather than spreading effort evenly.
The human brain's default mode is linear thinking: double the input, double the output; bigger cause, bigger effect. But most real-world systems are nonlinear — they have thresholds, exponential growth, S-curves, phase transitions, chaos. Linear extrapolation is one of the most common thinking errors we make.
Core features of nonlinear systems: (1) thresholds: heating water from 99°C to 100°C at sea-level pressure produces a qualitative change, not just a quantitative one; (2) exponential growth: compounding, viral spread, and network effects are imperceptible early and explosive once the curve steepens; (3) sensitive dependence: the butterfly effect, where tiny differences in initial conditions produce wildly different outcomes; (4) emergence: the whole displays new properties that can't be predicted from the sum of the parts.
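Two of these features are easy to check numerically. The sketch below compares compounding with a straight-line extrapolation, then iterates the logistic map x -> r * x * (1 - x) from two nearly identical starting points. The 7% rate, the starting value 0.2, and the setting r = 4 (a standard chaotic regime) are assumptions picked for illustration, and the function names are hypothetical.

```python
def compound_vs_linear(rate: float, periods: int):
    """Value of 1 unit after `periods` steps: (compounded, linear extrapolation)."""
    compounded = (1 + rate) ** periods
    linear = 1 + rate * periods
    return compounded, linear


def logistic_trajectory(x0: float, steps: int, r: float = 4.0):
    """Iterate x -> r * x * (1 - x) and return every visited value."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs


if __name__ == "__main__":
    # Exponential growth: the gap to the linear guess widens with time.
    for years in (5, 20, 50):
        compounded, linear = compound_vs_linear(0.07, years)
        print(f"{years:>2} years: compounded = {compounded:6.2f}, linear = {linear:5.2f}")

    # Sensitive dependence: starting points that differ by 1e-10.
    a = logistic_trajectory(0.2, 60)
    b = logistic_trajectory(0.2 + 1e-10, 60)
    for step in range(0, 61, 10):
        print(f"step {step:>2}: gap = {abs(a[step] - b[step]):.3e}")
```

Over the first few periods the linear guess looks adequate, which is exactly why linear extrapolation feels safe. In the logistic map, the two trajectories start out indistinguishable and end up unrelated within a few dozen steps.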
Nonlinear thinking demands: stop using simple linear ratios to predict complex systems. When you see "input and output aren't proportional," don't be confused — that's the normal state of nonlinear systems. Look for thresholds and leverage points where a small force tips a big change.
Nonlinear thinking recognizes that most real-world systems don't follow proportional cause-and-effect. They exhibit thresholds, exponential growth, phase transitions, and emergent properties. Linear extrapolation — our brain's default — fails catastrophically in these systems. Look for tipping points and leverage points where small inputs produce outsized effects.