Day 03 · 2026.06.03

The Craft of Perf Review: Compressing a Year Into Two Pages, and Two Pages Into One Conversation

Topic: Performance Review · 4 Principles
"Your output as a manager is the output of your organization plus the output of the neighboring organizations under your influence." — Andy Grove
This week's premise: Perf review is the two hours your report will remember longest from the entire year. Ten years from now they will have forgotten the projects you assigned, but they will remember whether you looked them in the eye when you said "Meets," and they will remember whether your answer to "why not Exceeds" actually addressed the question. Most managers wreck this in four places: (1) writing it like a status report instead of a judgment, (2) arriving at calibration without a story that survives the room, (3) not seeing the bias they're smuggling into their ratings, and (4) sprinting away from the "why not higher" conversation. This week unpacks all four — from Andy Grove's task-relevant maturity to the calibration practices at Lattice and Google to Amy Edmondson on how one botched review can destroy psychological safety overnight. By the end, you should be able to open one of your reports' review docs and start rewriting.
PRINCIPLE 01

The Anatomy of a Review: Judgment ≠ Summary
The Anatomy of a Performance Review

Andy Grove · Camille Fournier · Writing structure
A perf review is not a list of "what they did this half" — that's their brag doc. Your job is to answer one question: relative to the bar for their level, where are they above / at / below? Then bring evidence. Judgment first, evidence next, direction last. Reversed order turns it into a status meeting.
"Performance appraisal is, in many ways, the single most important form of task-relevant feedback we as supervisors can provide. It is how we assess our subordinates' level of performance and how we deliver that assessment to them. It is also how we allocate the rewards…" — Andy Grove, High Output Management, Ch. 13 "Performance Appraisal: Manager as Judge and Jury"
[OVERALL] A 3–4 sentence core judgment: Above / At / Below plus the main reason. Do NOT save this for the end; this is the thesis of the entire review.

[IMPACT] The 3 most important things this half. One paragraph each:
  • What they did (one line)
  • Why it matters (business / team impact)
  • What you (the manager) observed about their craft, not their project role

[STRENGTHS] 2–3 dimensions where they operate above the bar for their level. Each backed by one specific piece of evidence ("In project X, they did Y"), never "a great team player."

[GROWTH AREAS] 2–3 dimensions where they need to close a gap. Each one: observation (specific scene) + impact (why it's a gap) + what you want to see next half. Don't write "needs to improve communication." Write "In design reviews, they often raise directional objections only after a conclusion has been reached, which has caused the team to roll back twice. Next half, surface these at the RFC stage."

[NEXT HALF] 1–2 explicit development focuses. Not 5.
Same engineer, same half, two versions of the OVERALL paragraph:
✗ Weak version

"X contributed meaningfully this half. Shipped the payments refactor, contributed to onboarding, ran 3 tech talks. A core senior on the team. Continues to meet expectations of their level."

✓ Strong version

"X performed above the Senior bar this half, but not yet stably enough for Staff. They independently delivered the payments refactor — a project originally scoped at Staff — and Z (a calibrated Staff) reviewed the delivery and called it promo-ready. However, on the Staff dimensions of cross-team alignment and unblocking peers, the evidence is still thin. If next half they can act as technical lead on 1–2 cross-team projects, the promo case is made."

Self-check:
  • Can I state the thesis of this review in one sentence? If not, I haven't thought hard enough yet.
  • Does every strength / growth area come with a specific scene? Delete anything that doesn't — that's impression, not evaluation.
  • Did I use vague frequency adverbs ("sometimes / often / tends to")? Replace with concrete counts or scenes.
  • Are next-half focuses things they can control? Or do they depend on other people / project luck?
  • If they remember only one thing from this review, is it the thing I wanted them to remember?
Common pitfalls:
  • List-style writing. Copying the brag doc into the review template. That's their writing, not your judgment.
  • "Communication" as the universal growth area. It says nothing. Which scene, which mode of communication, with whom.
  • OVERALL written last and written limp. Reader reads three pages of strengths, then a single "meets expectations" — they'll be confused: so why not Exceeds?
  • "Next steps" smuggled in to replace "growth areas." Dressing up weaknesses as "opportunities" is the written form of Ruinous Empathy.
Andy Grove, High Output Management (Ch. 13 "Performance Appraisal: Manager as Judge and Jury") — names the manager explicitly as judge, not coach. Read this chapter to cure any hesitation about "should I really be judging them."
Camille Fournier, The Manager's Path ("The Process of Writing Performance Reviews") — operational guidance on collecting peer feedback and handling 360s.
Phrases for writing reviews

"X consistently operates above the Senior bar in Y, and is approaching the Staff bar in Z." — Anchor the rating to the level rubric, not to "how much they did."

"The pattern I'd like to see shift next half is…" — More precise and behavior-pointed than "they need to improve."

"This is a strong half, but not yet a promotion case, because…" — Say it plainly. No hedging.

PRINCIPLE 02

Calibration: You're Not Reporting, You're Negotiating
Calibration: Why Your Story Must Survive the Room

Calibration · Lara Hogan · Cross-team fairness
The point of calibration: take every manager's private "my people are great" narrative, put them all in one room, and force-rank them against each other. Whether your report gets the rating they deserve depends on whether you can tell their story in 90 seconds and hold up under questioning from five other managers. Preparing for calibration is 60% of the review-writing work.
No calibration: Manager A's Exceeds vs. Manager B's Exceeds
  → Promotion depends on your manager's lens
  → Underrepresented groups systematically downgraded
  → Team loses trust

With calibration: All Exceeds candidates at one table, compared at the same level
  → Shared ruler across teams
  → Managers must publicly defend the rating
  → Bias visible to peers
  → Accountability
"Calibration is where bias goes to die — but only if you walk in armed with specifics. 'She's just really strong' will get steamrolled by another manager who shows up with three concrete cross-team artifacts." — Lara Hogan, Resilient Management
Mid-calibration. The facilitator: "Next up, X — you're proposing Exceeds. Talk us through it." You have about 90 seconds.
✗ Weak version (the room will push back)

"X has been really solid this half. Hit all her OKRs, the team loves her. She's the most reliable senior on my team, I think she's clearly at Exceeds."

→ No anchor to the level rubric, no concrete cross-team evidence, "the team loves her" is unverifiable in this room. The next manager to speak will dismantle this in one sentence.

✓ Strong version

"X for Exceeds, anchored on three Senior rubric criteria. (1) Scope — she independently owned the payments refactor, which was scoped at Staff; I had Z (a calibrated Staff) review the delivery and they called it promo-ready. (2) Multiplier — both mids she mentored were promoted to Senior this half, and her onboarding doc is now being used by two other teams. (3) Cross-team — she has substantive comments adopted in RFCs from infra, billing, and growth. Anchoring her against the other three Exceeds proposals in this room, she's at least stronger on Multiplier."

Self-check:
  • Can I tell this person's story in 60–90 seconds without notes? If not, I'm not ready.
  • Have I read the level rubric? Can I map their evidence to each line of it? (Scope / Impact / Multiplier / Cross-team)
  • Can I name one cross-team peer as an anchor — "same level as Y, but stronger on dimension X"?
  • For each Exceeds I'm proposing, do I have an "I'll concede this weakness, but…" fallback?
  • For each Meets, am I ready to answer both "why not Below" and "why not Exceeds"?
Common pitfalls:
  • Treating calibration as a briefing. It's a negotiation plus collective arbitration. If you don't defend them, no one else will.
  • Over-nominating Exceeds. Everyone Exceeds = you lose credibility, and every future rating from you gets discounted. Grove: "Performance ratings are signals — inflated signals are noise."
  • Only preparing for the "stars" and walking in unprepared on your Meets. The Meets ratings are often the ones the room pushes down to Below. You have to hold the line.
  • Walking out without knowing what changed. First thing post-calibration: get the final rating sheet and do a diff against your proposals — who got pushed up, who got pushed down, why. That's your ammunition for next cycle.
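The post-calibration diff in the last pitfall can be as simple as a few lines of Python. This is a minimal sketch, not a prescribed tool: the names `rating_diff`, `proposed`, and `final`, the three-level scale, and the sample data are all hypothetical, and a real rating sheet would come from whatever your review system exports.

```python
# Hypothetical three-level scale, ordered from lowest to highest.
SCALE = ["Below", "Meets", "Exceeds"]


def rating_diff(proposed, final):
    """Return (name, proposed, final, direction) for every rating that changed."""
    changes = []
    for name, before in proposed.items():
        after = final.get(name)
        if after is None or after == before:
            continue  # missing from the final sheet, or unchanged
        direction = "up" if SCALE.index(after) > SCALE.index(before) else "down"
        changes.append((name, before, after, direction))
    return changes


# Illustrative data only.
proposed = {"X": "Exceeds", "Y": "Meets", "Z": "Meets"}
final = {"X": "Exceeds", "Y": "Below", "Z": "Exceeds"}

for name, before, after, direction in rating_diff(proposed, final):
    print(f"{name}: {before} -> {after} (pushed {direction}); why?")
```

The script only tells you who moved. The "why" still has to come from the room, so write it down next to each changed row while you remember it.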
Lara Hogan, Resilient Management — concrete scripts for defending your reports in calibration.
Will Larson, An Elegant Puzzle (around "Productivity in the age of hypergrowth") — how calibration prevents rating inflation, and how a manager's credibility compounds.
Calibration-room phrases

"I'd like to anchor her against Y, who's a calibrated Exceeds — on multiplier she's stronger; on scope they're equivalent."

"I hear the concern. Let me concede X, but the case still holds on Y and Z." — Give ground on the small point, hold the core.

"This is a Meets, not a Below — and here's why the room shouldn't push it down." — Proactive defense.

PRINCIPLE 03

Fairness: How Much Bias Are You Smuggling Into Your Ratings?
Fairness: The Biases You're Smuggling In

DEI · Bias mitigation · Evidence baseline
"Fairness" isn't an intention; it's a craft. The research is consistent: women, people of color, introverts, remote employees, and people who don't self-promote get systematically downgraded in perf reviews. Not because managers are bad people, but because the brain substitutes "impression" for "evidence." Fair reviews come from process, not from good will.
Two findings published in Harvard Business Review make this concrete. Paola Cecchi-Dimeglio (2017) found that women were 1.4 times more likely than men to receive critical subjective feedback. Shelley Correll and Caroline Simard (2016) found that men were more likely to receive feedback tied to specific business outcomes, which is feedback they can act on.
1. RECENCY BIAS: You only remember the last 6 weeks. Antidote: before writing, re-read the full 1:1 doc + project milestones. Camille Fournier suggests "3 lines of running notes per month, the whole cycle."
2. HALO / HORNS: One vivid impression colors every dimension. Antidote: rate each dimension independently with its own evidence. Not "she's strong overall → every dimension is Exceeds."
3. SIMILARITY BIAS: People who look/sound like you seem "more senior." Antidote: ask yourself whether you would give the same rating for the same output if this person had a different communication style / gender / background.
4. ATTRIBUTION BIAS: Men's failure = circumstance; women's failure = ability. Men's success = ability; women's success = luck / team. Antidote: read the review with the name redacted. If swapping the gender would make you want to change a sentence, that sentence was biased.
5. SELF-PROMOTION BIAS: People who write good brag docs look more impactful. Antidote: it's your job to see the quiet ones. "Impact they didn't bring up" is the manager's homework, not their failure.
You finished X's (a female senior) review and are about to submit. Before you hit submit —
✗ Original sentence (what you wrote)

"X is sometimes too assertive in design discussions, which can make junior engineers hesitant to push back."

✓ Rewrite (gendered language removed)

"In 3 design reviews this half (Q1 payments, Q2 onboarding redesign, Q2 search migration), X arrived with a strong proposed solution. In 2 of those, junior engineers told me afterward they had alternative ideas they didn't raise. Next half, I'd like to see X open these reviews with the problem framing only, and invite proposals before sharing her own."

→ Same behavioral observation, but (1) specific count and scenes, (2) feedback aimed at behavior, not "character," (3) a concrete actionable direction. The original "too assertive" is precisely the gendered language Snyder's research identifies.

Self-check:
  • For every adjective in this review: can I replace it with "specific scene + specific behavior"?
  • If I swap X's name for a male colleague's, would I change any sentence? If yes, why?
  • Frequency words like "sometimes / often / can be" — do I have specific counts to back them up?
  • Am I confusing "she doesn't self-promote" with "she has less impact"?
  • Is every critique tied to a business outcome rather than a personality trait?
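A first pass over the checklist above can be automated before you do the name-swap read. The sketch below is only an illustration: the function name `flag_phrases` and both word lists are hypothetical, seeded with phrases this lesson itself calls out, and a word list only catches surface wording. It does not replace re-reading the review with the name redacted.

```python
import re

# Hypothetical starter lists; extend them with the words you catch yourself using.
VAGUE_FREQUENCY = ["sometimes", "often", "tends to", "can be"]
PERSONALITY_WORDS = ["abrasive", "assertive", "intimidating", "too direct", "too quiet"]


def flag_phrases(text, phrases):
    """Return the phrases from `phrases` that appear in `text` as whole words."""
    found = []
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(phrase)
    return found


# The original sentence from the example above.
original = ("X is sometimes too assertive in design discussions, "
            "which can make junior engineers hesitant to push back.")

print("Needs a count or scene:", flag_phrases(original, VAGUE_FREQUENCY))
print("Needs a behavior instead:", flag_phrases(original, PERSONALITY_WORDS))
```

Each flagged phrase is a prompt to go find the specific scene, not an automatic verdict that the sentence is biased.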
Female Lens · Vigilance in two directions

As a woman leader, you have to watch perf review from two sides:

(1) The reviews you write: Research shows female managers sometimes rate female reports more harshly — sometimes called "Queen Bee," but more accurately: you unconsciously hold "people like you" to a higher bar. After writing any review of a female report, ask yourself: "If they were male, would I have used the same words?"

(2) The reviews you receive: If you see "abrasive / too direct / intimidating / needs to be more collaborative" in your own review, ask your manager for the specific business impact. Sheryl Sandberg in Lean In describes the first time she got this kind of feedback at Google: instead of accepting it as "well-meaning advice," she asked, "Can you tell me which decisions would have been better if I'd been less direct?" That single question changed how she received feedback afterward — and changed the quality of the feedback she got.

Kieran Snyder, "The abrasiveness trap" (Fortune, 2014) — analyzed 248 perf reviews for gendered language. The most-cited piece in this space.
Lara Hogan's blog series "Writing performance reviews" — operational bias-check lists.
Substitution library

• Don't write "abrasive" → write "In meeting X, she interrupted Y twice; this is the pattern I'd like to shift."

• Don't write "not a team player" → write "declined to take on 2 cross-team requests that the team needed."

• Don't write "too quiet" → write "in our design reviews, his strongest ideas surface in 1:1s afterward rather than in the room — I'd like to help him bring them into the room."

PRINCIPLE 04

"Why Not Exceeds": The Conversation You Most Want to Skip
The "Why Not Exceeds" Conversation

Hard conversation · Meets ≠ failure · Kim Scott
Most people are Meets. Most of them will ask "why not Exceeds?" If your answer is "you did great, but the quota was tight," you've lied, broken their trust in the system, and shoved them into a half-year of disengaged drift. The honest answer: Meets isn't failure, and Exceeds has a specific bar — here's what the bar is, and here's where you fell short.
"When you withhold honest feedback to be 'kind,' you're actually being unkind — you're robbing the person of the information they need to grow. The kindest thing is to tell them exactly what would have made the difference." — Kim Scott, Radical Candor
Scenario: you rated X Meets. Thirty minutes into the review conversation, they ask: "Can I ask why this isn't Exceeds? I feel like I did a lot this half."
✗ Trap 1: blame the system

"Honestly, you did great — the company caps Exceeds at 15% this year, and calibration was brutal. You were close, you'll get it next time."

→ Shifts responsibility to "the system." They conclude "it's not me, it's luck." They change nothing next half.

✗ Trap 2: vague encouragement

"You're solid at Senior. A little more visibility, a little more ownership, and you're there. Keep it up!"

→ "Visibility / ownership" is empty. They walk out with no idea what to actually do — but with a warm fuzzy feeling. The textbook version of Ruinous Empathy.

✓ Strong version: validate, then break down the bar

"Good question. First: Meets is not me being unhappy with the half. You delivered the payments refactor and two core onboarding modules; the quality was solid.

Exceeds at this level has two specific bars: (1) Scope expansion — independently owning an ambiguous problem originally scoped at Staff; (2) Multiplier — measurably raising the output of others on the team. On the first one, you got halfway: the payments refactor was Staff-sized, but the scoping was done by Y; you took over execution. On the second, I saw signals but not yet a stable pattern — you mentored Z, but your design doc didn't get adopted outside our team, so it didn't produce leverage.

So this isn't 'a hair short' — there are two concrete things that didn't fully happen. Next half, if you can own the search project end to end, starting from the problem framing, and write up your caching pattern so the platform team adopts it, the Exceeds case is made. I'll put both of those in your next-half focus."

Self-check:
  • Can I name in one sentence where they fell short? Not "almost there": which specific dimension?
  • Is that gap controllable by them next half? (If not, action is impossible and the conversation is pointless.)
  • If they didn't ask, would I bring it up unprompted? (If not, my answer doesn't actually hold up.)
  • Am I prepared for them to cry, be upset, or push back?
  • If they say "well, why did Y get Exceeds?" — am I ready? (Don't discuss someone else's rating, but you can discuss the rubric.)
Common pitfalls:
  • Treating ratings as prizes. Meets is not a consolation prize. In a healthy system it's where most high-performing people sit. If your tone makes Meets sound like failure, you're feeding rating inflation.
  • Spending too little time on the rating itself. Many managers spend 5 minutes on the rating in a 60-minute review. The report walks out confused. Rule: rating rationale + next-half direction should take at least 25 minutes.
  • Not restating the main points. At the end of the conversation: "The three things I want you to leave with today are X / Y / Z." Otherwise, a week later they remember the emotion, not the content.
  • No written follow-up. A verbal conversation plus a 2-paragraph recap email within 24 hours means they retain the content, not just the emotion.
Female Lens · When you receive Meets

As a woman leader, when you get a Meets and expected an Exceeds, research suggests you're more likely than male peers to: (a) accept it immediately, blame yourself, "I should have worked harder"; (b) swallow the "why not Exceeds" question because asking feels "ungrateful"; (c) quietly burn out instead of asking for more scope or resources.

The right move is Sandberg's: "Can you walk me through what would have made this an Exceeds — specifically, which scope or which artifact?" That isn't ungrateful, it's professional. If you don't ask, next half you'll get the same fuzzy feedback and the same Meets — and by year three you'll think you've hit your ceiling.

Conversely, when you're the manager giving a female report Meets: proactively open the "why not Exceeds" door. Don't wait for them to ask — many of them won't.

Kim Scott, Radical Candor ("Get Stuff Done Wheel" chapter) — why vague encouragement is the least kind version.
Ben Horowitz, The Hard Thing About Hard Things (the feedback philosophy around "Lead Bullets") — vague positive feedback is the manager being comfortable at the report's expense.
Useful phrases for this conversation

"Meets is not a consolation prize — let me first say what made this a strong half." — Open. Set the tone.

"The two specific things that would have made this Exceeds are…" — Cut to the core. No hedging.

"This is concretely actionable. Let's put it in your next-half focus." — Close. Pivot to the future.

• Avoid "You were close" / "Next time" / "It was competitive" — these three are the throwaway phrases of perf review.

This Week's Practice · Your Day 3 Action

If you're writing or just finished perf reviews this week — do this before you submit:

(1) Pick the next review you're about to submit and pull out the OVERALL paragraph. Can you state your thesis on this person in one sentence (above / at / below + main reason)? If not, you haven't thought it through — stop and finish thinking.

(2) The name-and-gender redaction test: replace the name and pronouns with the opposite gender, then re-read. If you'd want to change any sentence — that sentence was biased. Fix it, then submit.

(3) Rehearse "why not Exceeds" in advance: for every report you rated Meets, write out your 60-second answer (specific bar + specific gap + concrete next-half action). If you can't write it, your Meets rating doesn't have the reasoning to back it up. Go back and shore it up.

(4) The day before calibration: speak the 90-second case for every Exceeds / Below proposal out loud — to a mirror or a voice memo. If you can speak it, you can defend it. If you can't, the rating won't survive the room either.

Honestly: you won't have time to do all of these for every report this week. But even if you do this process for the 1–2 most important reports, their trust in you will change within the next 6 months. They'll learn you're "the kind of manager who really evaluates me" rather than "the kind of manager who treats review as a process."