Mental Models: Writing & Thinking on the Page

Writing Is Thinking

Writing isn't transcribing finished thoughts; it is how thoughts become clear in the first place

In Depth

The deepest misconception is treating writing as the "output of thought," as if a complete idea already exists in your head and writing merely copies it down. The reverse is true. Before you write, you feel that you understand, but that feeling is the illusion of explanatory depth: the idea seems clear until you put pen to paper and find holes everywhere. Writing breaks the illusion because it forces serialization. Thought in the head is parallel, associative, and graph-shaped, while a sentence is a single line. Flattening a graph into a line forces you to answer the questions your mind glosses over: what comes first, what causes what, and where does this step break?

Non-trivial: (1) The page is an external working memory that doesn't decay. Your head holds only three or four chunks at once, so any complex argument overflows; writing externalizes the intermediate results and lets you handle far more complexity than you can hold in your head. (2) Writing is a compiler for thought: wherever intuition is vague, the sentence won't compile, and you have to supply the missing definition on the spot. (3) Writing is structurally similar to Buddhist contemplation. Insight meditation (vipassanā) makes automatic thoughts observable; writing puts the fuzzy thought on paper and turns it into an object you can examine, which opens a gap between you and your thought. (4) In distributed-systems terms, thinking only in your head is "eventually consistent, with no log": you can't replay it, even for yourself. Writing is an append-only log that you can re-read, audit, and use to locate the step that went wrong.

Practice: write to discover, not to record. The first draft is "thinking on paper" and is supposed to be ugly — split "generate" from "edit," and don't let your inner editor strangle exploration before the thought has formed.

Flattening graph-shaped thought into a line exposes the gaps you thought you'd understood

Classic example

The historian Charles Weiner once described the physicist Richard Feynman's notes as a "record" of his day-to-day work, an exchange recounted in James Gleick's biography Genius. Feynman corrected him on the spot: the notes were not a record of the thinking; the thinking happened on the paper. "I actually did the work on the paper," he said. That line captures the essence: the equations and crossings-out aren't traces left afterward, they're the scene where cognition happens. The same idea sits behind the maxim "if you can't write it clearly, you don't yet understand it": being unable to explain something is rarely a problem of expression; usually the understanding hasn't formed yet.

BigCat scenario

(1) Engineering: write the design doc before building the AI agent system — the doc isn't a record of a "finished design," it's where the design reveals itself to be broken as you write it out: when you're forced to write the call sequence across multiple agents step by step, the race condition your mind kept skipping finally appears. (2) Parenting: before explaining a concept to a school-age child, force yourself to write it in three sentences — you'll find precisely the shakiest part of your own understanding, which is usually exactly where the child will get stuck. If you can't write it, you don't really know it.

AI Prompt

English Prompt

Here are my rough, unsorted thoughts on [topic/decision]: [paste your draft/notes]. Treat this as me "thinking on paper" and use writing to break my illusion of understanding:
1. Point out 2–3 places I think I explained but where the logic actually breaks or a term is undefined.
2. For each, pose one sharp question I must answer before I can think it through.
3. Don't polish it into clean prose; the goal is to expose what I haven't actually figured out.

One Idea at a Time

One idea per sentence, one point per paragraph — the reader's working memory is the bottleneck

In Depth

The core constraint lives not in the writer but in the reader: human working memory holds only three or four chunks at once. A sentence stuffed with three ideas forces the reader to parse, buffer, and relate three things simultaneously — overload, then reread. One reread and reading throughput collapses. So "one idea at a time" isn't a matter of rhetorical taste; it's the physical discipline of respecting the reader's cognitive bandwidth.

Non-trivial: (1) Complexity is conserved. The tangle you didn't untie while writing doesn't vanish; it is handed intact to every reader, so its cost is multiplied by the readership. Ten extra minutes spent untangling spares each of a thousand readers the decoding effort. This is a trade-off engineers know: compress at write-time, or push the decompression cost onto every read. (2) The idea boundary is the unit of revision. A sentence that resists every attempt to split it usually signals that the underlying thought is still tangled (see Writing Is Thinking: a muddy sentence is the shadow of muddy thinking). (3) The rule mirrors a single-threaded event loop: throughput comes from slicing tasks small and letting them flow through, not from one giant blocking call. One idea per sentence feeds the reader's brain small tasks. (4) Sentences carry an order contract too: start each with old information (an anchor) and end with the new, so that new knowledge hooks onto what was just established and the chain never breaks.

Practice: one sentence = one subject doing one thing. When you find yourself stacking "and... which... because...," stop and split. Test: can the reader restate the sentence's point in one short clause? If not, you haven't split it enough.
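The stacking test above can be roughed out in code. The following is a minimal Python sketch, not a real grammar checker: the connective list, the threshold of three, and the function names are all assumptions chosen for illustration, and the sentence splitter is deliberately naive.

```python
import re

# Hypothetical list of connectives that often signal stacked ideas.
STACKING_WORDS = {"and", "which", "because", "but", "while", "although"}


def split_sentences(text):
    """Naive splitter: break after ., !, or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def count_connectives(sentence):
    """Count how many stacking connectives appear in one sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(1 for w in words if w in STACKING_WORDS)


def flag_stacked(text, threshold=3):
    """Return the sentences that contain at least `threshold` connectives."""
    return [s for s in split_sentences(text)
            if count_connectives(s) >= threshold]


draft = (
    "We shipped the fix. The cache failed and the retry loop, which had "
    "no backoff, kept firing because the timeout was too short."
)
for sentence in flag_stacked(draft):
    print("Consider splitting:", sentence)
```

A flagged sentence is only a candidate. The real test is still whether a reader can restate its point in one short clause.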

Classic example

The scientific abstract is this discipline's ultimate training ground: within a strict word count, each sentence advances exactly one step — one for background, one for the gap, one for method, one for results, one for significance. A reader (the reviewer) can reconstruct the whole logic in thirty seconds, not because the sentences are ornate but because each lays one brick, and each brick rests on the one before. Delete any sentence and the chain loses a link — that's the combined effect of "one idea per sentence" plus "open with old information."

BigCat scenario

(1) Engineering: in code review, one logical change per commit, one claim per sentence. A 200-line commit that does three things is the prose equivalent of a long sentence crammed with three ideas — the reviewer is forced to parse in parallel and stalls. Splitting it into three small commits is splitting the long sentence into short ones. (2) Parenting: give a school-age child "put away the toys, then wash your hands, then come eat" all at once and their working memory can't hold three steps — they'll likely do only the last. One thing at a time, finish it, then the next — exactly the same logic as writing. What you don't slice small for the reader/listener, they pay for in overload.

AI Prompt

English Prompt

Here is a passage I wrote: [paste paragraph]. Audit it for reader cognitive bandwidth, using "one idea at a time":
1. Flag every sentence carrying 2+ ideas that would force a reread.
2. Split each into "one idea per sentence" clauses, and where possible start each with old info and end with new.
3. Name the single point this paragraph is meant to prove; if it's actually proving two, suggest how to split it into two paragraphs.

Active Voice

"Who did what" — active voice forces you to name names; passive voice is best at hiding them

In Depth

Active voice has the structure "actor → action → object," which aligns with the causal structure of reality and with the way the mind natively models events as "who did what." Passive voice lets the actor disappear. "Mistakes were made" is grammatically complete, yet the "who" that should be responsible has vanished. Writers sometimes choose the passive precisely to hide the agent, and that is exactly why it deserves suspicion.

Non-trivial: (1) voice is not just style but an ontological choice: active forces you to fill in both the "who" and the "what," so hidden assumptions and missing responsible parties get pushed into the open. (2) In a design doc, passive is where bugs hide — "the data is processed" omits which service, when, and with what guarantee; passive voice is a dangling edge in the causality graph, while active forces you to connect it to a concrete node. (3) Active is usually shorter and faster to read: passive adds scaffolding ("was ... by"), one more layer for the reader to decode each time. (4) But keep proportion: when the object is genuinely the topic, or the actor is unknown or irrelevant, passive is correct — "the sample was heated to 80°C," where attention belongs on the sample, not on whose hand moved. The rule is "default active, deliberate passive," not "ban passive."

Practice: scan the draft for every "was/were ... -ed" and "by," and at each ask: who is the actor of this action — am I hiding them on purpose, or just being lazy? If there's no reason to hide, switch to active and put the "who" back into the sentence.
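The scan described above can be partly automated. Below is a minimal Python sketch that assumes a simple regular expression is good enough for a first pass. The pattern, the short list of irregular participles, and the function name are illustrative assumptions. It will miss some passives and flag some adjectives (for example "was tired"), so each hit still needs the human question: who is the actor?

```python
import re

# Heuristic pattern, not a parser: a form of "to be", an optional -ly adverb,
# then a word ending in -ed or one of a few irregular participles.
# The irregular list is a hand-picked assumption; extend it as needed.
PASSIVE = re.compile(
    r"\b(?:am|is|are|was|were|be|been|being)\s+"
    r"(?:\w+ly\s+)?"
    r"(?:\w+ed|made|done|sent|built|lost|found|broken|taken|written|given)\b",
    re.IGNORECASE,
)


def find_passive_candidates(text):
    """Return (matched phrase, names_actor) pairs for likely passives.

    names_actor is True when the phrase is directly followed by "by",
    meaning the sentence at least names who acted.
    """
    candidates = []
    for match in PASSIVE.finditer(text):
        following = text[match.end():match.end() + 4].lower()
        candidates.append((match.group(0), following == " by "))
    return candidates


draft = (
    "Mistakes were made. The data is processed by the worker. "
    "The retry logic hammered the database."
)
for phrase, names_actor in find_passive_candidates(draft):
    note = "actor named" if names_actor else "actor missing: who did this?"
    print(f"{phrase!r}: {note}")
```

The sketch separates the two cases the practice cares about: a passive that still names its actor with "by," and one where the actor has dropped out of the sentence entirely.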

Classic example

"Mistakes were made" is a specimen of political language in the English-speaking world — it admits an error while leaving the person who erred mysteriously absent. George Orwell, in "Politics and the English Language," long ago named the move: vague passive voice plus abstract nouns is the standard instrument political language uses to obscure responsibility and numb judgment. Here voice isn't a question of flair but of honesty — active voice forces out the subject that was deliberately left off.

BigCat scenario

(1) Engineering: in a postmortem, "the database was overloaded" and "the retry logic hammered the database until it overloaded" are two different documents — the former has nothing to fix, the latter points straight at the actor, and therefore at the fix. A postmortem in passive voice reads as if no one was at fault, so no one will change anything. (2) Parenting: model honest agency. "The vase got broken" erases the child's (and your own) hand from the sentence; stating "who did what" plainly is how you teach responsibility. Active voice is "skin in the game" at the level of prose — keep the name in the sentence.

AI Prompt

English Prompt

Here is something I wrote (e.g., a design doc / postmortem / announcement): [paste text]. Run a "voice and accountability" audit:
1. Find every passive sentence and name the agent it hides.
2. For each, judge whether hiding the actor is deliberate (the object is genuinely the topic) or just lazy, and rewrite the lazy ones as active.
3. Flag sentences that leave no one responsible, and rewrite them as active sentences that name the actor and point to a fixable target.

Subtraction Beats Addition

"Omit needless words" — good writing improves mostly by cutting, not by adding

In Depth

The default instinct for improving anything is to add: add a paragraph of explanation, add a qualifier, add a transition. Experiments bear this out: in a 2021 Nature study by Adams and colleagues, "People systematically overlook subtractive changes," participants asked to improve objects, ideas, and situations overwhelmingly added and rarely thought to subtract. Yet writing improves mainly by cutting. Every word spends the reader's attention budget, so a word that doesn't earn its keep is a net tax levied on every reader.

Non-trivial: (1) Subtraction is harder than addition for two reasons. First, cutting is invisible labor: adding produces a new artifact to show, while deleting leaves no visible sign that you did anything. Second, loss aversion: you are killing words you wrote yourself (hence "kill your darlings"). (2) The deepest cut isn't words but ideas. The bravest stroke deletes the entire paragraph you're proudest of because it doesn't serve the reader's main line. The criterion is always the reader's path, never your reluctance. (3) Subtraction compounds: removing the weakest link both raises the average and removes a failure mode. Via negativa, YAGNI, and Occam's razor share this logic: getting stronger by removing is steadier than getting stronger by adding. (4) In the AI era this mental model matters more. Generation is now nearly free, so the scarce skill shifts from "being able to write it" to "daring and knowing how to cut": trimming the model's sprawl down to the bone.

Practice: write long, then cut 30%. For each sentence ask: if I delete this, does the reader lose anything they genuinely need? No — cut it. Schedule "deletion" as a separate pass, and read your own draft with a subtractive eye.
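To check whether a deletion pass actually reached the 30% mark, a few lines of arithmetic are enough. This is a minimal Python sketch; the function names and the whitespace-based word count are assumptions, and the target is a parameter, not a rule.

```python
def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


def cut_report(draft, revised, target=0.30):
    """Compare a draft with its revision and report how much was cut."""
    before = word_count(draft)
    after = word_count(revised)
    if before == 0:
        raise ValueError("draft is empty")
    fraction = (before - after) / before
    return {
        "before": before,
        "after": after,
        "dropped": before - after,
        "fraction_cut": fraction,
        "met_target": fraction >= target,
    }


draft = "In order to be able to ship, we basically need to first fix the bug."
revised = "To ship, we must fix the bug."
report = cut_report(draft, revised)
print(f"{report['dropped']} words dropped "
      f"({report['fraction_cut']:.0%} of {report['before']})")
```

The number only measures how much was removed. Whether the reader lost anything they genuinely needed is still a judgment made sentence by sentence.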

Classic example

The Elements of Style, the classic writing handbook, states the rule in three words: "Omit needless words." Vigorous writing is concise; a sentence should contain no unnecessary words, just as a drawing should have no unnecessary lines. Then there is the much-quoted line, usually traced to Blaise Pascal: "I would have written a shorter letter, but I didn't have the time." Writing short costs more effort than writing long, because short means you've already done the hard cutting on the reader's behalf. Deleting is never laziness; it's higher-order labor.

BigCat scenario

(1) Engineering: the best PRs are often net-negative lines — deleting a feature or a config option removes, in one stroke, its maintenance cost and a whole surface of bugs. "Today I made the codebase shorter" is often worth more than "I added a feature." (2) AI workflow: you have the model draft, but what really decides quality is the cutting blade you apply afterward — trimming three pages to half a page, where the surviving density is your judgment. (3) Communication: say less, and the unsaid carries its own weight. For prose, for code, for speech alike, subtraction is the badly underrated half of the craft.

AI Prompt

English Prompt

Here is something I wrote: [paste text]. Put on a "deletion mindset" and cut it by ~30% without losing anything the reader genuinely needs:
1. Mark deletable filler, clichés, and empty transitions sentence by sentence, and show the trimmed version.
2. Identify 1–2 passages I'm probably proudest of that don't serve the main line; recommend cutting them and explain why the piece is stronger without them.
3. Return the tightened full text and tell me how many words it dropped.