Meta Knowledge: Game Theory

May 25, 2026 · Meta Knowledge
DAY 11
Game Theory · Mathematical Economics · Evolutionary Biology · Mechanism & Institution Design

Nash Equilibrium

The stable point no one is tempted to break
CORE INSIGHT

"Rational" isn't "do what's best for me" — it's "given that everyone acts as currently predicted, I have no incentive to unilaterally change my strategy." An equilibrium isn't necessarily optimal — it's simply where no one has a reason to move first. Many social traps (arms races, price wars, chronic overtime) are stable Nash equilibria: everyone wants out, yet no one dares leave first.

BACKGROUND & MECHANISM

In 1950 Nash proved that every finite game (finitely many players, each with finitely many strategies) has at least one such "stable point," provided players may randomize over their strategies: a state where no one can make themselves better off by changing strategy alone. Before him, the general solution covered only two-player zero-sum "winner-take-all" games (von Neumann's minimax theorem); Nash extended the idea to all finite non-cooperative games, laying much of the micro-foundation of modern economics. The key point: an equilibrium doesn't predict "who wins"; it describes a self-consistent state where everyone's expectations of each other are confirmed.

▸ Prisoner's dilemma: the only equilibrium ≠ the best outcome for both
                B cooperates    B defects
A cooperates    −1, −1          −10, 0
A defects       0, −10          −5, −5
Each cell lists A's payoff, then B's. Both defect (−5, −5) is the only Nash equilibrium; both cooperate (−1, −1) is the outcome that's better for both. Each side's rationality pushes them to the worse result.
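
The "no one gains by deviating alone" test can be checked mechanically. The sketch below is a minimal illustration, assuming the prisoner's dilemma payoffs above; the names `PAYOFFS`, `MOVES`, and `pure_nash_equilibria` are made up for this example, and it only looks for equilibria in pure (non-randomized) strategies.

```python
from itertools import product

# Payoffs from the prisoner's dilemma above, as (A's payoff, B's payoff).
# Index 0 = cooperate, 1 = defect; rows are A's move, columns are B's move.
PAYOFFS = [
    [(-1, -1), (-10, 0)],
    [(0, -10), (-5, -5)],
]
MOVES = ["cooperate", "defect"]


def pure_nash_equilibria(payoffs):
    """Return every cell where neither player gains by deviating alone."""
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        a_now, b_now = payoffs[r][c]
        # A may only change the row; B may only change the column.
        a_can_improve = any(payoffs[r2][c][0] > a_now for r2 in range(n_rows))
        b_can_improve = any(payoffs[r][c2][1] > b_now for c2 in range(n_cols))
        if not a_can_improve and not b_can_improve:
            equilibria.append((r, c))
    return equilibria


for r, c in pure_nash_equilibria(PAYOFFS):
    print(f"A: {MOVES[r]}, B: {MOVES[c]}, payoffs {PAYOFFS[r][c]}")
# A: defect, B: defect, payoffs (-5, -5)
```

Both cooperating fails the test because either player can jump from −1 to 0 by defecting alone, which is exactly why the better cell is not stable.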
COUNTER-INTUITIVE EXAMPLE

Braess's paradox: adding a new shortcut to a congested road network can make every driver's trip longer. Each driver picks the route that's "fastest for me," and together they settle into a new, worse equilibrium. The reverse has been reported in practice: after certain roads were closed in New York and Seoul, traffic flowed better rather than worse. Removing an option pushed the whole system toward a better equilibrium.
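
A small worked example makes the paradox concrete. The network below is a common textbook illustration, not something taken from this card: the 4000 drivers, the delay formulas, and every name in the code are assumptions chosen so the arithmetic is easy to follow.

```python
# Hypothetical textbook network, not taken from the text above.
# 4000 drivers travel from Start to End through junction A or junction B.
#   Start -> A : cars / 100 minutes (congestible)
#   A -> End   : 45 minutes (fixed)
#   Start -> B : 45 minutes (fixed)
#   B -> End   : cars / 100 minutes (congestible)
#   A -> B     : 0 minutes (the new shortcut)
DRIVERS = 4000
FIXED_MINUTES = 45.0


def congestible(cars):
    """Minutes needed on a link whose delay grows with traffic."""
    return cars / 100.0


def route_times(via_a, via_b, via_shortcut):
    """Minutes per route, given how many drivers use each route."""
    cars_start_a = via_a + via_shortcut   # both routes share Start -> A
    cars_b_end = via_b + via_shortcut     # both routes share B -> End
    return {
        "via_a": congestible(cars_start_a) + FIXED_MINUTES,
        "via_b": FIXED_MINUTES + congestible(cars_b_end),
        "via_shortcut": congestible(cars_start_a) + congestible(cars_b_end),
    }


def is_equilibrium(flows, open_routes):
    """True if no route in use is slower than some open alternative."""
    times = route_times(**flows)
    fastest = min(times[r] for r in open_routes)
    return all(times[r] <= fastest for r in open_routes if flows[r] > 0)


without_shortcut = {"via_a": DRIVERS // 2, "via_b": DRIVERS // 2, "via_shortcut": 0}
with_shortcut = {"via_a": 0, "via_b": 0, "via_shortcut": DRIVERS}
all_routes = ["via_a", "via_b", "via_shortcut"]

print(route_times(**without_shortcut)["via_a"])              # 65.0
print(is_equilibrium(without_shortcut, ["via_a", "via_b"]))  # True
print(is_equilibrium(without_shortcut, all_routes))          # False
print(route_times(**with_shortcut)["via_shortcut"])          # 80.0
print(is_equilibrium(with_shortcut, all_routes))             # True
```

Under these assumed numbers, the even split takes 65 minutes but stops being stable once the shortcut opens, and the new stable state costs everyone 80 minutes.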

CROSS-DOMAIN TRANSFER

In biology it shows up as the "evolutionarily stable strategy" (see card 3); in AI, training a GAN amounts to a generator and a discriminator searching for an equilibrium by competing, and AlphaGo's self-play can also be read as an equilibrium search; in the Cold War, "mutually assured destruction" was a stable equilibrium no one dared break first; in ad auctions, the bids advertisers settle on in a search engine's auction are themselves an equilibrium.

BIGCAT APPLICATION + REFLECTION

"Over-promise → slip → over-promise again" on a team is a classic Nash equilibrium: everyone knows the estimates are too optimistic, but whoever gives a conservative estimate first gets sidelined first. To break it, don't exhort people to "be more honest" — change the payoffs: publicly track "estimate vs. actual" so accuracy becomes a visible, scarce skill. Same in parenting: a child and parent stuck in a "nag → stall" equilibrium won't be moved by reasoning — only changing the rules (natural consequences instead of nagging) shifts the equilibrium.

▸ Reflection: the most exhausting "recurring tug-of-war" in your team or family — which Nash equilibrium is it really? To break it, which cell's payoff must change?

Repeated Games & Evolution of Cooperation

How cooperation emerges among the self-interested
CORE INSIGHT

In a one-shot game, "defect" is rational; but if the game repeats indefinitely and the future matters enough, cooperation itself can become the equilibrium — with no morality, contract, or third-party enforcement required. Civilization, trust, reputation, and long-term relationships are all, at bottom, short-term games embedded into a long-term structure.
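
"The future matters enough" can be given a number. The sketch below assumes two things that are not stated in this card: players discount each future round by a factor `d` between 0 and 1, and both follow a "grim trigger" rule (cooperate until betrayed once, then defect forever). The payoffs are reused from the prisoner's dilemma card; all names are illustrative.

```python
from fractions import Fraction

# Per-round payoffs for one player, reused from the prisoner's dilemma card.
T = Fraction(0)    # temptation: I defect while you cooperate
R = Fraction(-1)   # reward: we both cooperate
P = Fraction(-5)   # punishment: we both defect


def discounted_value(first, rest, d):
    """Value of getting `first` now and `rest` in every later round."""
    return first + d * rest / (1 - d)


def cooperation_threshold(t, r, p):
    """Smallest discount factor at which grim trigger sustains cooperation.

    Cooperating forever is worth r / (1 - d). Defecting once earns t now and
    p in every later round. Cooperation holds when the first is at least the
    second, which rearranges to d >= (t - r) / (t - p).
    """
    return (t - r) / (t - p)


print(cooperation_threshold(T, R, P))  # 1/5

patient, impatient = Fraction(1, 2), Fraction(1, 10)
print(discounted_value(R, R, patient) >= discounted_value(T, P, patient))      # True
print(discounted_value(R, R, impatient) >= discounted_value(T, P, impatient))  # False
```

With these payoffs, cooperation is self-enforcing under the assumed rule once each player weighs the next round at 20% or more of the current one.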

BACKGROUND & MECHANISM

In theory (the "folk theorem"), as long as a game repeats indefinitely and the players are patient enough, almost any outcome can stabilize: cooperation, half-cooperation, taking turns exploiting each other. The most famous experiment was Robert Axelrod's "repeated prisoner's dilemma" computer tournament, won by a 4-line strategy: "Tit-for-Tat." Its winning recipe: open with goodwill, retaliate immediately when betrayed, forgive the moment the other cooperates, and stay transparent and easy to read.
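
The recipe is short enough to write out. This is a minimal sketch, not the tournament's actual code: the payoffs are reused from the prisoner's dilemma card, and the function names and the 10-round length are assumptions made for illustration.

```python
# Per-round payoffs from the prisoner's dilemma card: (first player, second player).
PAYOFF = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-10, 0),
    ("D", "C"): (0, -10),
    ("D", "D"): (-5, -5),
}


def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return their_history[-1] if their_history else "C"


def always_defect(my_history, their_history):
    """The one-shot 'rational' move, repeated every round."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Run a repeated game and return the two total payoffs."""
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        total_a += pay_a
        total_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total_a, total_b


print(play(tit_for_tat, tit_for_tat, 10))      # (-10, -10)
print(play(tit_for_tat, always_defect, 10))    # (-55, -45)
print(play(always_defect, always_defect, 10))  # (-50, -50)
```

Note that Tit-for-Tat never outscores the opponent in front of it: it loses narrowly to a pure defector. It does well overall because it collects the cooperative payoff with every partner willing to cooperate, which pure defectors never do.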

COUNTER-INTUITIVE EXAMPLE

In WWI trenches a strange "live and let live" truce emerged: opposing soldiers would fire into the air at fixed times, and even celebrated Christmas together. The reason is simple — the same units facing each other for a long time turned the war into a repeated game, and cooperation arose on its own. The military's countermeasure was precisely to rotate units constantly and force night raids — artificially turning the "long game" back into a "one-shot game," and cooperation collapsed. Structure drives behavior, far more than slogans do.

CROSS-DOMAIN TRANSFER

In business, a long-term supply-chain relationship is completely different from one-off haggling; in blockchain, staking-and-slashing makes "cheating = permanently losing your deposit," forcing participants into a repeated game; in diplomacy, sustained unilateral goodwill can elicit reciprocity; in everyday life, reputation systems (ratings, reviews) embed every transaction into a long game.

BIGCAT APPLICATION + REFLECTION

To judge "will this person screw me over," a more reliable question than gut instinct is: is this a one-shot game or a repeated one? A stranger driving a rideshare has little incentive to cheat, because the rating system folds every trip into long-term reputation; an employee's final performance review before quitting is a classic one-shot game, so adjust your expectations accordingly. On a team, turning "ad-hoc, freshly-assembled" projects into "standing teams with a continuous track record" often boosts cooperation directly, without any new policy: changing the structure usually does far more than preaching.

▸ Reflection: which important relationship of yours is structurally a one-shot game, yet you're investing in it as if it were long-term? And conversely, which one should you treat as one-shot, but you're trapped by the illusion of a "long-term relationship"?

Evolutionarily Stable Strategy (ESS)

No "rationality" needed — just copy the winners
CORE INSIGHT

No one needs to understand "rationality" — as long as "successful players get copied," a population still converges to the game's equilibrium. Genes, cultures, and corporate strategies are all playing games, yet none of them understands game theory. An "equilibrium" is a product of selection pressure, not of deliberate reasoning.

BACKGROUND & MECHANISM

A strategy is "evolutionarily stable" when, once nearly everyone adopts it, any new "invading" strategy can't gain an edge and dies out. It's stricter than an ordinary Nash equilibrium: it requires not only "no one wants to deviate" but also "robustness to small disturbances," meaning a small group of mutants cannot spread. Dynamically (the "replicator dynamics"): whichever strategy is doing better right now grows faster as a share of the population, and in many games the population converges on this stable point.
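
The two requirements can be written as a short check. The sketch below uses Maynard Smith's standard conditions and, as a simplifying assumption, tests only pure-strategy invaders. The payoff numbers are reused from the prisoner's dilemma card; the dictionary layout and function name are illustrative.

```python
# E[s][t] is the payoff to a player using s against an opponent using t.
# Values reuse the prisoner's dilemma card; the layout is illustrative.
E = {
    "cooperate": {"cooperate": -1, "defect": -10},
    "defect": {"cooperate": 0, "defect": -5},
}


def is_ess(resident, payoff):
    """Check the two ESS conditions against every pure-strategy mutant."""
    for mutant in payoff:
        if mutant == resident:
            continue
        res_vs_res = payoff[resident][resident]
        mut_vs_res = payoff[mutant][resident]
        if res_vs_res > mut_vs_res:
            continue  # mutants do strictly worse among residents
        if res_vs_res == mut_vs_res and payoff[resident][mutant] > payoff[mutant][mutant]:
            continue  # tie among residents, but residents beat mutants head-on
        return False  # this mutant can invade
    return True


print(is_ess("defect", E))     # True
print(is_ess("cooperate", E))  # False
```

A population of cooperators fails because a lone defector earns 0 against them while they earn −1 against each other, so the defector's share grows.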

COUNTER-INTUITIVE EXAMPLE

The "hawk-dove game" assumes fighting costs more than the food is worth: the population ends up neither all aggressive "hawks" (mutual injury is too costly) nor all yielding "doves" (a single hawk among doves wins every contest), but stable at a mixed ratio. In the standard model the hawk share equals the food's value divided by the cost of injury. This explains why most animal conflict is bluffing ritual (roaring, posturing) rather than real fighting. The same logic (Fisher's principle) explains why sex ratios trend toward 1:1: whichever sex is rarer is more "profitable" to produce, so the ratio is automatically pulled back.
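
The pull toward a mixed ratio can be simulated in a few lines. This is a sketch under stated assumptions: the values `V = 2` and `C = 8`, the step size, and the standard hawk-dove payoffs (hawks split the prize and the injury cost when they meet, doves share peacefully) are textbook conventions, not figures from this card.

```python
# Illustrative numbers only: the food is worth V, an injury costs C, and C > V.
V = 2.0
C = 8.0


def payoff_gap(hawk_share):
    """Average hawk payoff minus average dove payoff at this population mix."""
    p = hawk_share
    hawk = p * (V - C) / 2 + (1 - p) * V   # hawks fight hawks, take from doves
    dove = p * 0.0 + (1 - p) * V / 2       # doves yield to hawks, share with doves
    return hawk - dove


def evolve(hawk_share, steps=2000, dt=0.1):
    """Discrete-time replicator dynamics: the better strategy gains share."""
    p = hawk_share
    for _ in range(steps):
        p += dt * p * (1 - p) * payoff_gap(p)
    return p


print(round(evolve(0.90), 4))  # 0.25
print(round(evolve(0.01), 4))  # 0.25
print(V / C)                   # 0.25
```

Whether the population starts nearly all hawk or nearly all dove, it settles at a hawk share of V / C under these assumptions. No individual computes anything; the ratio emerges from differential copying alone.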

CROSS-DOMAIN TRANSFER

In culture it explains why "follow the crowd" and "imitate the successful" can coexist long-term; in AI, multi-agent self-play can also converge to such stable strategies; in business, disruptive breakthroughs often come from non-mainstream players, because a big company's "explore vs. defend" ratio is itself a stable equilibrium; in medicine, it explains why drug-resistant bacteria don't vanish immediately after you stop the drug.

BIGCAT APPLICATION + REFLECTION

Personal career choices are often misled by this intuition: when everyone floods into the "seemingly optimal" track, crowding drives its payoff down, and it's the few on the "suboptimal" path who reap the high returns. Especially true in the AI era — "everyone pivoting to one hot skill" can never be a stable strategy; a skill that's quickly copied loses value fast. For the "AI super-individual," the real question isn't "which skill is hottest now," but "which combination stays solid even after everyone follows" — usually a hard-to-imitate cross-domain mix: deep tech × industry insight × communication/teaching ability.

▸ Reflection: your most sought-after ability right now — does it still hold an edge in a population where everyone around you has it too? What combination is a stable strategy that belongs only to you?

Mechanism Design

Reverse game theory · turning social engineering into design
CORE INSIGHT

Ordinary game theory asks "given the rules, what will people do"; mechanism design flips it: "I want a certain outcome — what rules should I design?" It's "reverse game theory" — knowing full well that everyone will game the system (and that everyone knows everyone is gaming it), yet still designing rules where telling the truth is the most profitable move and overall efficiency is highest. This is the holy grail of institution engineering.

BACKGROUND & MECHANISM

Two cornerstones: first, the "revelation principle": any outcome that some set of rules can achieve can also be achieved by rules under which "honestly stating your true preference pays best"; second, "incentive compatibility": making each participant's optimal strategy be to tell the truth. The classic example is the "second-price auction" (Vickrey auction): the highest bidder wins, but pays only the second-highest price. Your bid decides only whether you win, not what you pay, so neither inflating nor lowballing it helps, and "bid your true value" becomes the optimal strategy. Many online ad auctions have been built on variants of this principle.
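
The claim that truthful bidding is optimal can be brute-forced on a small grid. The sketch below is illustrative only: the private value of 70, the 0 to 100 integer bid range, and the rule that ties lose are all assumptions made to keep the example short.

```python
def second_price_payoff(my_bid, my_value, top_rival_bid):
    """Payoff to one bidder in a sealed-bid second-price auction.

    The bidder wins only with a strictly higher bid (ties lose, a simplifying
    assumption) and then pays the top rival bid rather than their own.
    """
    if my_bid > top_rival_bid:
        return my_value - top_rival_bid
    return 0


MY_VALUE = 70  # hypothetical private value

# Overbidding can win the item at a loss; underbidding can forfeit a profit.
print(second_price_payoff(90, MY_VALUE, 80))  # -10
print(second_price_payoff(70, MY_VALUE, 80))  # 0
print(second_price_payoff(50, MY_VALUE, 60))  # 0
print(second_price_payoff(70, MY_VALUE, 60))  # 10

# Brute-force check on an integer grid: no bid ever beats the truthful one.
truthful_is_weakly_best = all(
    second_price_payoff(MY_VALUE, MY_VALUE, rival)
    >= second_price_payoff(bid, MY_VALUE, rival)
    for rival in range(101)
    for bid in range(101)
)
print(truthful_is_weakly_best)  # True
```

The check holds whatever the rivals bid, which is what makes truth-telling a dominant strategy here rather than merely an equilibrium.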

COUNTER-INTUITIVE EXAMPLE

A famous theorem (Gibbard-Satterthwaite) proves that with three or more candidates, no voting rule short of a dictatorship can fully eliminate "strategic voting" (abandoning a favorite, holding your nose to vote for a second choice). In other words, a voting rule that is both fair and immune to manipulation simply doesn't exist mathematically, so we can only pick the "least bad" among many imperfect rules. This reduces a political dilemma to a fundamental limit of mechanism design itself.
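
A tiny election shows what "abandoning a favorite" looks like. The sketch below illustrates one rule only (plurality, where just the top choice counts); it does not prove the theorem, and the nine voters and their rankings are invented for the example.

```python
from collections import Counter


def plurality_winner(ballots):
    """Each ballot is a ranking; only the top choice is counted."""
    tally = Counter(ballot[0] for ballot in ballots)
    return tally.most_common(1)[0][0]


# Hypothetical electorate: 4 voters like A best, 3 like B best, 2 like C best.
a_fans = [("A", "B", "C")] * 4
b_fans = [("B", "C", "A")] * 3
c_fans_true = ("C", "B", "A")  # C's supporters rank A last

sincere = a_fans + b_fans + [c_fans_true] * 2
print(plurality_winner(sincere))  # A

# C's supporters hold their noses and put their second choice on top.
strategic = a_fans + b_fans + [("B", "C", "A")] * 2
print(plurality_winner(strategic))  # B

# By their true ranking, they prefer the new winner to the old one.
print(c_fans_true.index("B") < c_fans_true.index("A"))  # True
```

Misreporting their preferences gets C's supporters an outcome they genuinely like better, so honest voting is not their best move under this rule.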

CROSS-DOMAIN TRANSFER

In resource allocation, algorithms can perform stable "matching" (used to pair medical students with hospitals, students with schools); in blockchain, reward-and-penalty rules are designed so "cheating doesn't pay"; in organizations, "ambitious goals not hard-tied to bonuses" is designed precisely to stop people from deliberately lowballing targets; in climate, carbon pricing and emissions trading are mechanism design put into practice.
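
The best-known stable matching procedure is Gale-Shapley deferred acceptance, sketched below. It is offered as an illustration of the idea, not as the exact algorithm any real placement program runs. It assumes one seat per hospital, equal numbers on both sides, and complete preference lists; the student and hospital names are invented.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Gale-Shapley with one seat per receiver. Returns {proposer: receiver}."""
    # rank[r][p] = position of proposer p on receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next receiver each will try
    held = {}                                     # receiver -> tentative proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = held.get(r)
        if current is None:
            held[r] = p                 # empty seat: accept tentatively
        elif rank[r][p] < rank[r][current]:
            held[r] = p                 # receiver trades up
            free.append(current)
        else:
            free.append(p)              # rejected: try the next choice later
    return {p: r for r, p in held.items()}


def is_stable(match, proposer_prefs, receiver_prefs):
    """True if no proposer and receiver both prefer each other to their match."""
    holder = {r: p for p, r in match.items()}
    for p, prefs in proposer_prefs.items():
        for r in prefs[:prefs.index(match[p])]:   # receivers p likes more
            r_list = receiver_prefs[r]
            if r_list.index(p) < r_list.index(holder[r]):
                return False
    return True


students = {
    "ana": ["city", "general", "county"],
    "ben": ["city", "county", "general"],
    "cat": ["general", "city", "county"],
}
hospitals = {
    "city": ["ben", "ana", "cat"],
    "general": ["ana", "cat", "ben"],
    "county": ["ana", "ben", "cat"],
}

match = deferred_acceptance(students, hospitals)
print(sorted(match.items()))
# [('ana', 'general'), ('ben', 'city'), ('cat', 'county')]
print(is_stable(match, students, hospitals))  # True
```

"Stable" here means no student and hospital would both rather be paired with each other than with their assigned match, so nobody has a reason to go around the system.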

BIGCAT APPLICATION + REFLECTION

Leading a team, building a product, setting rewards — all of it is mechanism design. The most common mistake is equating "KPI design" with "mechanism design" — thinking only about "what I want" and never asking "how will participants game this KPI?" Measure support staff by call volume → they'll hang up fast; measure engineers by lines of code → the code bloats. For every incentive, ask in reverse: if I were smart and purely self-interested, what's the laziest way to max out this metric? Does it match what I actually want? Same with kids: rewarding "number of problems done" breeds going-through-the-motions drilling; rewarding "explaining a hard problem to someone else" makes real learning the optimal strategy.

▸ Reflection: an incentive you currently use (a KPI, points, a bonus, a house rule) — if you handed it to a brilliant, purely self-interested person to maximize, what "perfect but absurd" outcome would pop out?