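激励相容 · Incentive Compatibility

"Good rules make everyone's selfish calculation lead exactly to the outcome you want."

Mechanism design is "reverse game theory": game theory takes the rules as given and works out how people will play, while mechanism design fixes the outcome you want first and then works backward to rules that produce it. Its central goal is incentive compatibility: design the rules so that the action each participant takes in pursuit of their own interest is exactly the action you want from them. Telling the truth, putting in effort, and not gaming the system become a best response, ideally a dominant strategy, rather than a moral demand.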

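Non-obvious points: ① It replaces "rely on people's conscience" with "rely on structure": you don't have to assume participants are virtuous, only make defection unprofitable. ② There is a powerful result, the revelation principle: any outcome that some mechanism can achieve in equilibrium can be reproduced by a direct mechanism in which reporting truthfully is optimal. The designer can therefore concentrate on a single question: how to make truth-telling pay. ③ Its flip side is Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure." Once incentives are not compatible, people optimize the metric itself rather than the real purpose behind it, which is the classic symptom of a failed mechanism.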

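Classic example

The "I cut, you choose" rule for dividing a cake: the cutter knows the other person picks first, so to avoid ending up with the smaller piece he cuts as evenly as he can. Nobody preaches fairness, yet fairness emerges on its own. The rule channels self-interest into a fair split, which is incentive compatibility in its plainest form.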

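Scenarios · BigCat

① When designing KPIs/OKRs, first ask: "If the team optimized only this number and ignored my real intent entirely, what would happen?" If the answer is absurd, the incentive is not compatible and will be gamed (test-case counts go up while quality collapses). ② The reward model in RLHF is at heart a mechanism design problem: if the reward can be hacked ("reward hacking"), the model learns to please the scorer instead of being genuinely useful. Half of what makes alignment hard is designing an incentive-compatible reward that cannot be gamed.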
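
The stress test in ① can be sketched as a toy model. This is only an illustration under made-up assumptions: a hypothetical agent splits one unit of effort between real quality work and padding the test-case count, and padding is assumed to move the KPI three times as fast as real work. The function names and numbers are not from the text above.

```python
def metric(real, padding):
    """Measured KPI (test-case count). Padding inflates it cheaply (assumed 3x)."""
    return 1.0 * real + 3.0 * padding

def true_value(real, padding):
    """What the designer actually wants: only real work counts."""
    return real

def best_split(objective, steps=10):
    """The agent picks the effort split (real, padding) that maximizes `objective`."""
    splits = [(i / steps, 1 - i / steps) for i in range(steps + 1)]
    return max(splits, key=lambda s: objective(*s))

gamed = best_split(metric)        # agent rewarded on the KPI
aligned = best_split(true_value)  # agent rewarded on the real goal

print("paid on KPI: ", gamed, "-> true value", true_value(*gamed))
print("paid on goal:", aligned, "-> true value", true_value(*aligned))
```

In this toy setup the KPI-paid agent puts all effort into padding and delivers zero true value, which is the "absurd answer" the stress test is meant to surface before the metric ships.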


Incentive Compatibility — mechanism design is "reverse game theory": instead of taking the rules and predicting play, you fix the outcome you want and engineer rules that force it. A mechanism is incentive-compatible when each participant's self-interested action is exactly the action you want — telling the truth, exerting effort, not gaming the system becomes their dominant strategy, not a moral demand. It replaces "trust people to be good" with "make defection unprofitable." The revelation principle says any achievable outcome can be implemented by a simple truth-telling mechanism, so the designer can focus on one question: how to make honesty pay. Its failure mode is Goodhart's Law — when a measure becomes a target, it ceases to be a good measure; once incentives misalign, people optimize the metric instead of the goal behind it (e.g., reward hacking in RLHF).

中文提示词
我要为 [团队 / 产品 / 协作 场景] 设计一套激励或考核规则,目标是 [你真正想要的结果],候选指标是 [现有指标]。请用「激励相容」帮我压力测试: ① 假设参与者只会冷酷地优化这个指标、无视我的真实意图,他会怎么钻空子?(古德哈特定律) ② 怎么改造规则,让"说真话 / 真出力"成为他的占优策略,而不是靠自觉? ③ 给我 1 个更激励相容的替代设计,并指出它牺牲了什么。
English Prompt
I'm designing incentives/metrics for [team / product / collaboration], aiming for [the outcome I actually want], with candidate metrics [current metrics]. Use Incentive Compatibility to stress-test: 1. If participants coldly optimize this metric while ignoring my true intent, how would they game it? (Goodhart's Law) 2. How can I redesign the rules so that telling the truth / genuinely exerting effort becomes their dominant strategy, not a matter of goodwill? 3. Give me one more incentive-compatible alternative, and name what it trades off.

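委托代理 · Principal-Agent Problem

"You hire someone to act for you, but what they can see and what they want both differ from yours."

The principal-agent problem: a principal asks an agent to act on their behalf, but two cracks run between them. One is misaligned goals (the agent has an agenda of their own); the other is asymmetric information (the principal cannot tell whether the agent is really making an effort). These structural cracks give rise to two classic risks: moral hazard after the contract (shirking or self-dealing once signed, since you can't see it anyway) and adverse selection before the contract (agents hide their true "type", and those hoping to coast are the most eager to apply).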

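Non-obvious points: ① The root of the problem is not "bad people" but "not being able to see": if information were fully transparent, you could simply verify the work against the result you wanted, and the agency problem would disappear. ② The mainstream fix ties the agent's pay to observable outcomes (commission, stock options, performance-contingent deals), but this shifts risk onto the agent, so there is always a trade-off between strong incentives and loading the agent with too much risk. ③ Delegation chains nest layer upon layer: shareholders → board → CEO → middle management → employees. Every link leaks some incentive, and the longer the chain, the further the goals drift.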
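
The trade-off in ② can be made concrete with the textbook linear-contract model (often called the LEN or Holmström-Milgrom model). It is brought in here purely as an illustration and is not part of the text above. Assumptions: output equals effort plus noise with variance `sigma2`; the agent's effort cost is `c * e**2 / 2`; the agent has constant absolute risk aversion `r`; pay is a fixed wage plus a share `beta` of output. Under those assumptions the surplus-maximizing share is `1 / (1 + r * c * sigma2)`.

```python
def agent_effort(beta, c):
    """The agent maximizes beta*e - c*e^2/2, so chooses e = beta / c."""
    return beta / c

def total_surplus(beta, c, r, sigma2):
    """Expected output minus effort cost minus the agent's risk premium."""
    e = agent_effort(beta, c)
    risk_premium = 0.5 * r * beta**2 * sigma2
    return e - 0.5 * c * e**2 - risk_premium

def optimal_share_closed_form(c, r, sigma2):
    return 1.0 / (1.0 + r * c * sigma2)

def optimal_share_grid(c, r, sigma2, steps=10_000):
    """Brute-force check of the closed form over beta in [0, 1]."""
    betas = [i / steps for i in range(steps + 1)]
    return max(betas, key=lambda b: total_surplus(b, c, r, sigma2))

for sigma2 in (0.0, 1.0, 4.0):
    closed = optimal_share_closed_form(c=1.0, r=2.0, sigma2=sigma2)
    grid = optimal_share_grid(c=1.0, r=2.0, sigma2=sigma2)
    print(f"noise variance {sigma2}: beta* = {closed:.3f} (grid search {grid:.3f})")
```

With no noise the agent should carry the full outcome (`beta = 1`); as the outcome gets noisier, the best incentive share falls, because more of the pay swing is luck that the agent must be compensated for bearing.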

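[Figure: the two cracks in the principal-agent relationship. The principal (wants results) delegates to and pays the agent (has their own agenda). Crack ①: misaligned goals. Crack ②: asymmetric information (effort is unobservable). Patch: tie pay to observable outcomes, plus monitoring.]
Caption: When effort can't be seen, pay has to be tied to results that can be seen. The cost is shifting risk onto the agent.
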
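Classic example

Shareholders and professional managers: shareholders want long-term value, while managers may care more about good-looking short-term numbers and their own bonuses and status. Shareholders can't watch every decision, so they use stock options to tie the manager's wallet to the company's share price, making "good for the company" also "good for me".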

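Scenarios · BigCat

① When you delegate a complex task to an AI agent, its implicit goals (finish quickly, burn fewer tokens) need not match yours (do the job thoroughly). This is a miniature principal-agent version of AI alignment. The patch is the same, "tie to outcomes and add monitoring": state verifiable success criteria, and require the agent to leave inspectable traces at key steps. ② Outsourcing, hiring, and remote collaboration work the same way: rather than supervising every detail (which has a very high information cost), design the pay structure so that the agent benefits most when doing the right thing.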


The Principal-Agent Problem — a principal hires an agent to act on their behalf, but two cracks sit between them: misaligned goals (the agent has private interests) and asymmetric information (the principal can't see whether the agent truly exerted effort). This breeds moral hazard (shirking or self-dealing after the contract, since it's unobservable) and adverse selection (agents hide their true "type" before the contract; the worst types are keenest to apply). The root isn't bad people but invisibility — with perfect information you'd just verify by outcome. The standard fix is tying the agent's payoff to observable results (commission, equity, pay-for-performance), but that pushes risk onto the agent, forcing a permanent trade-off between strong incentives and overloading the agent with risk. Agency chains nest (shareholders → board → CEO → staff), and the longer the chain, the further goals drift. AI alignment is a principal-agent problem in miniature.

中文提示词
我(委托人)要把 [任务 / 职责] 交给 [代理人:员工 / 外包 / AI agent / 合作方] 去做,我真正想要的是 [目标]。请用「委托代理」帮我设计: ① 这里的"目标不一致"和"信息不对称"分别长什么样?道德风险和逆向选择各可能出现在哪? ② 我能观测到哪些结果?怎么把报酬 / 验收绑到这些可观测信号上,让他做对事时自己最受益? ③ 强激励会把多少风险转给代理人?给我一个在"激励力度"和"风险承担"之间平衡的方案。
English Prompt
As the principal, I'm delegating [task / responsibility] to [agent: employee / contractor / AI agent / partner]; what I truly want is [goal]. Use the Principal-Agent model to design: 1. What do the goal misalignment and information asymmetry look like here? Where might moral hazard and adverse selection show up? 2. Which outcomes can I observe? How do I tie pay/acceptance to those observable signals so the agent benefits most by doing the right thing? 3. How much risk does strong incentive shift onto the agent? Give me a design balancing incentive strength against risk-bearing.

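拍卖理论 · Auction Theory

"A well-designed auction can draw out the true valuation that even the bidders themselves might not want to state."

An auction is far more than "highest bidder wins". It is a machine for price discovery and information revelation: when nobody wants to show their hand, the rules pull out the true valuations scattered across people's heads. The most counterintuitive design is the second-price auction (sealed bids, the highest bidder wins, but pays only the second-highest bid). Because what you pay is decoupled from what you bid, neither underbidding nor overbidding helps, and, when each bidder knows their own private value, bidding your true valuation becomes a dominant strategy. This is incentive compatibility applied to pricing.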
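
The dominant-strategy claim can be checked by brute force. The sketch below assumes a private-value setting and, for simplicity, that ties go against the bidder; the helper names are illustrative. Only the highest rival bid matters, since that is the price the winner pays.

```python
def second_price_payoff(my_bid, my_value, rival_bids):
    """Payoff in a sealed-bid second-price auction (ties lose, for simplicity)."""
    top_rival = max(rival_bids)
    if my_bid > top_rival:
        return my_value - top_rival  # the winner pays the second-highest bid
    return 0.0

def truthful_is_weakly_dominant(my_value, grid):
    """Check on a grid that no alternative bid ever beats bidding my_value."""
    for top_rival in grid:
        truthful = second_price_payoff(my_value, my_value, [top_rival])
        for alt_bid in grid:
            alt = second_price_payoff(alt_bid, my_value, [top_rival])
            if alt > truthful + 1e-12:
                return False
    return True

grid = [i / 20 for i in range(21)]  # bids and values from 0.00 to 1.00
print(all(truthful_is_weakly_dominant(v, grid) for v in grid))  # True
```

Overbidding only changes the outcome when it wins an item at a price above your value, and underbidding only changes it when it loses an item you would have profited from, so neither deviation can help.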

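Non-obvious points: ① The Revenue Equivalence Theorem: under ideal conditions (risk-neutral bidders whose private values are drawn independently from the same distribution), the English, Dutch, first-price, and second-price formats all give the seller the same expected revenue. The debate over format often matters less than people think; what really matters is attracting enough serious bidders. ② The Winner's Curse: when nobody knows the item's true value (common value, as with oil fields or M&A), the winner is often precisely the one who overestimated the most, so winning the auction is itself bad news that "you bid too much". The rational response is to shade your bid down in advance.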
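
Point ① can be checked with a small Monte Carlo run. The sketch assumes the standard textbook case: `n` risk-neutral bidders with values drawn independently and uniformly from [0, 1]. In that case the equilibrium first-price bid is `(n - 1) / n` times the bidder's value, and both formats have expected revenue `(n - 1) / (n + 1)`. The function name and parameters are illustrative.

```python
import random

def simulate_revenue(n_bidders=4, rounds=100_000, seed=0):
    rng = random.Random(seed)
    first_price_total = 0.0
    second_price_total = 0.0
    shade = (n_bidders - 1) / n_bidders  # equilibrium first-price bid = shade * value
    for _ in range(rounds):
        values = sorted(rng.random() for _ in range(n_bidders))
        first_price_total += shade * values[-1]  # winner pays their own shaded bid
        second_price_total += values[-2]         # truthful bids; winner pays 2nd-highest
    return first_price_total / rounds, second_price_total / rounds

first, second = simulate_revenue()
theory = (4 - 1) / (4 + 1)  # (n - 1) / (n + 1) for uniform values
print(f"first-price {first:.3f}, second-price {second:.3f}, theory {theory:.3f}")
```

Both averages should land close to the theoretical value; rerunning with a larger `n_bidders` raises revenue in both formats, which is the "more serious bidders matters more than the format" point.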

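Classic example

Real-time bidding in online advertising: each time you open a page, many advertisers bid for that impression within milliseconds. Ad exchanges long favored second-price style rules precisely so that advertisers could bid their true value without second-guessing rivals, although several large exchanges have since moved to first-price auctions, and the generalized second-price rule used for multi-slot search ads is not fully truthful. Either way, mechanism design has become the foundation of a business worth hundreds of billions.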

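Scenarios · BigCat

① Cloud spot instances began as a real-time auction: idle capacity is sold cheaply, but it can be reclaimed when the price rises or the provider needs the capacity back (some providers have since replaced explicit bidding with smoothly varying market prices). Understanding the mechanism lets you put fault-tolerant jobs on spot capacity to save cost and keep critical jobs on stable resources. ② When allocating scarce resources inside a team (GPU quota, expert hours, good projects), rather than relying on scrambling or a manager's call, design a lightweight auction (for example, bidding with virtual points) so that the people who need the resource most reveal themselves. ③ In M&A, competitive bids, and talent wars, always watch for the winner's curse: you may well have won only because you overvalued it more than everyone else.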
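
The winner's curse in ③ shows up even when every individual estimate is unbiased. The simulation below is a rough sketch under assumed numbers: a common value drawn uniformly from 50 to 150, each bidder seeing that value plus Gaussian noise, and naive bidders who bid their estimate (minus an optional `shade`) and pay their own bid. This is not an equilibrium model, only a demonstration of the selection effect.

```python
import random

def average_winner_profit(n_bidders, shade=0.0, rounds=20_000, noise_sd=20.0, seed=1):
    """Common-value auction: everyone bids (own estimate - shade); winner pays own bid."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        true_value = rng.uniform(50, 150)
        estimates = [true_value + rng.gauss(0, noise_sd) for _ in range(n_bidders)]
        winning_bid = max(estimates) - shade  # the most optimistic estimate wins
        total += true_value - winning_bid
    return total / rounds

for n in (2, 8):
    print(f"{n} naive bidders: average winner profit {average_winner_profit(n):.1f}")
```

The average profit of the winner is negative and gets worse as more bidders join, because the winning estimate is the maximum of more noisy draws. Passing a positive `shade` is the "correct your bid downward" remedy.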


Auction Theory — an auction isn't just "highest bidder wins"; it's a price-discovery and information-revelation machine that extracts truthful valuations no one wants to disclose. The most counterintuitive design is the second-price (Vickrey) auction: sealed bids, highest wins, but pays the second-highest price. Because what you pay is decoupled from what you bid, neither lowballing nor inflating helps — bidding your true value becomes the dominant strategy (incentive compatibility applied to pricing). The Revenue Equivalence Theorem shows that under ideal conditions English, Dutch, first-price, and second-price auctions yield the same expected revenue, so attracting enough serious bidders matters more than the format. The Winner's Curse warns that in common-value auctions (oil fields, M&A) the winner is often whoever overestimated most — winning is itself bad news, so rational bidders shade their bids down.

中文提示词
我面对一个"如何分配 / 定价稀缺资源"的问题:[描述资源、参与者、以及你想要的结果——效率最高 / 收益最大 / 最需要的人拿到]。请用「拍卖理论」帮我设计: ① 用哪种拍卖形式合适?为什么?次价式能不能让大家放心出真实估值? ② 这里有没有"赢家诅咒"风险(标的真实价值未知、靠抢)?我该把出价向下修正多少? ③ 比起纠结形式,我更该做什么来提升结果(如吸引更多认真的竞买者)?
English Prompt
I'm facing how to allocate/price a scarce resource: [describe the resource, the participants, and the outcome I want — max efficiency / max revenue / the one who needs it most gets it]. Use Auction Theory to design: 1. Which auction format fits, and why? Would a second-price design let people bid their true value safely? 2. Is there a Winner's Curse risk here (unknown common value, won by grabbing)? How much should I shade bids down? 3. Beyond agonizing over format, what matters more for the outcome (e.g., attracting more serious bidders)?

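信号传递 · Signaling

"A credible signal is always one that an impostor cannot afford to imitate. Its value lies in the cost of faking it."

When one side holds private information (I really am excellent / my product really is durable) and the other side cannot verify it directly, how can it be conveyed credibly? The answer lies not in what you say but in what you do, specifically something an impostor cannot afford to imitate. The key is cost asymmetry: a signal is hard to fake only when it is cheap for the genuine type and so expensive for the fake that imitation is not worth it. Only then does the market tell those who send the signal apart from those who stay silent (a separating equilibrium). Verbal promises are something anyone can make, so they are worth nothing; money-burning actions are something a fake cannot copy, so they carry value.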
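
The cost-asymmetry condition can be written as a one-line check. This is a minimal sketch with hypothetical numbers: the market pays a fixed `premium` to anyone who sends the signal, and each type sends it only if the premium covers that type's cost.

```python
def is_separating(premium, cost_genuine, cost_fake):
    """A signal separates types when the genuine type gains from sending it
    and the fake type does not: cost_genuine <= premium < cost_fake."""
    genuine_sends = premium - cost_genuine >= 0
    fake_sends = premium - cost_fake >= 0
    return genuine_sends and not fake_sends

# Hypothetical numbers: the market pays a premium of 10 to anyone who signals.
print(is_separating(premium=10, cost_genuine=4, cost_fake=15))   # True: costly only for fakes
print(is_separating(premium=10, cost_genuine=1, cost_fake=1))    # False: cheap talk, everyone sends
print(is_separating(premium=10, cost_genuine=12, cost_fake=15))  # False: too costly for everyone
```

The second case is the verbal promise: because it costs both types almost nothing, both send it and the signal carries no information.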

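Non-obvious points: ① The well-known education signaling theory: a good part of a degree's value lies not in what was learned but in the fact that "making it through" screens out people short on ability or perseverance. It is a hard-to-fake signal of ability even if the course content is long forgotten. ② This explains why many "seemingly wasteful" behaviors are rational: warranties and no-questions-asked returns signal quality; founders staking their own wealth signal shared risk. ③ A new meaning in the AI era: when text, code, and creative work can be mass-generated by large models at near-zero cost, "cheap signals" lose value across the board, while signals that are cost-asymmetric and hard to fake in bulk (real long-term investment, a verifiable track record) become more valuable.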

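Classic example

The peacock's big, heavy, predator-attracting tail: precisely because it is a burden, a weak male simply cannot afford to maintain one, so it becomes an honest signal of health, and a peahen choosing a mate by it will not be fooled. Biologists call this the handicap principle: a signal's credibility comes from a cost so high that fakes cannot bear it.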

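Scenarios · BigCat

① For engineers, solid open-source contributions and a checkable project record are a stronger signal of ability than any self-introduction. They are hard to fake because they are the residue of long-term real investment. ② In hiring and partnerships, rather than listening to what the other side says, look at what cost-asymmetric actions they are willing to take: will they do a small sample first, will they share the risk? ③ When you need to build credibility amid a flood of AI-generated content, stop piling up cheap output. Signal with what others cannot afford to imitate: original, deep insight and a public, verifiable long-term record.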


Signaling — when one party holds private information (I'm truly skilled / my product truly lasts) that the other can't verify, credibility comes not from what you say but from doing something a fake couldn't afford to imitate. The key is cost asymmetry: a signal is unfakeable only when it's cheap for the genuine type and prohibitively expensive for the fake, which lets the market separate signalers from the silent (a separating equilibrium). Talk is cheap, so it's worthless; costly action can't be mimicked, so it carries information. The classic education signaling view holds that a degree's value lies partly not in what was learned but in the fact that enduring it screens out those lacking ability or grit. This rationalizes "wasteful" behaviors: warranties signal quality; founders risking their own capital signal skin in the game; the peacock's costly tail (the handicap principle) signals fitness. In the AI era, as text, code, and work become near-free to mass-generate, cheap signals collapse in value while cost-asymmetric, hard-to-fake ones (real long-term investment, verifiable track records) become more valuable.

中文提示词
我想向 [对象:雇主 / 客户 / 合作方 / 市场] 可信地传递 [我的某种难以直接核验的素质:能力 / 质量 / 诚意 / 实力],但空口无凭。请用「信号传递」帮我设计: ① 哪些信号是"成本不对称"的——对真实的我便宜、对冒牌货昂贵到学不来? ② 我现在用的信号是不是"廉价信号"(谁都能说 / 能批量造),因而正在贬值? ③ 给我 2 个难以伪造的高可信信号(如风险共担、可验证的过往战绩、先做小样),并排出投入产出优先级。
English Prompt
I want to credibly signal [a hard-to-verify quality of mine: skill / quality / sincerity / capability] to [audience: employer / client / partner / market], but talk alone won't do. Use Signaling to design: 1. Which signals are cost-asymmetric — cheap for the real me, prohibitively expensive for a fake to imitate? 2. Are my current signals "cheap signals" (anyone can claim / mass-produce them) and therefore depreciating? 3. Give me 2 hard-to-fake high-credibility signals (e.g., skin in the game, a verifiable track record, doing a small sample first), and rank them by ROI.