Mental Models: Engineering Philosophy & Software Design

Levels of Abstraction

"Computer science is the science of indirection." — Butler Lampson

In Depth

Abstraction is not "hiding details." It's picking the set of invariants sufficient for the current problem and encapsulating everything else one layer down. Good abstraction: the upper layer can reason correctly using only the interface, and the implementation below can be swapped independently. Bad abstraction: the upper layer must know the lower-layer details to use it correctly ("leaky abstraction").

Non-trivial: (1) there's a correct number of layers — too few and the upper layer repeats work, too many and a single line of business logic must pierce 7 layers of indirection (an "abstraction tax"). (2) Every new layer compresses information, and the compressed-away part eventually returns as a bug — the cost of an abstraction is "when does the lost information bite back?" (3) Structurally isomorphic to Buddhist prajñapti (useful designation): an abstraction is a useful fiction, not the thing itself. Believing TCP/IP "really has layers" is a category error — layers are our design, not a physical fact of the network.

Practical test: a healthy abstraction lets the upper layer be debugged without opening the lower layer. If every debug session forces you to dive through 3 layers, you've drawn the boundary wrong. Good abstraction = linear reasoning; bad abstraction = tree-shaped reasoning every time.

Classic example

TCP gives the upper layer the illusion of a "reliable byte stream" — applications don't need to know IP packetization, retransmission, or congestion control. This is why the Internet scales: each new layer (IP → TCP → HTTP → REST → GraphQL) lets developers forget an entire class of problems. The price: when the abstraction leaks (e.g., head-of-line blocking in HTTP/2), the upper layer must temporarily become "half a TCP expert" to tune.

BigCat scenario

Designing an AI agent system: treating an LLM as a "natural-language function" is one abstraction. Benefit: orchestrate via prompts. Cost: when the LLM hallucinates or token cost explodes, you must drop down to "it's actually probabilistic sampling + context window" to debug. Choosing the wrong abstraction layer is the dominant source of design cost. Same with parenting — "learning" is one abstraction; when a child gets stuck, you must be able to drop to lower-level concepts (attention / working memory / motivation), otherwise no intervention lands.

AI Prompt

English Prompt

I'm designing [system/module]. Current abstraction boundaries: [describe]. Please: 1. Identify 2–3 likely leaky abstractions — interfaces where the upper layer must know lower-layer details to use them correctly. 2. Diagnose whether there are too few layers (upper-layer duplication) or too many (debug requires piercing many layers). 3. Propose one concrete re-drawing of boundaries, naming which compression-rebound it resolves.

Conway's Law

"Organizations design systems that mirror their communication structure." — Melvin Conway, 1967

In Depth

System architecture is an isomorphic projection of the organization's communication structure. High-friction team boundaries always evolve into "defensive interfaces" in the code they own — no matter how clean the original architecture diagram. This isn't a bug; it's the physics of socio-technical systems.

Non-trivial: (1) the Inverse Conway Maneuver — to get the architecture you want, reshape the org first. Want microservices? You first need small, autonomous teams with full ownership. Splitting services without splitting the org = you'll ship distributed monoliths. (2) Same family as emergence: architecture isn't designed, it emerges under bandwidth constraints. What an architect actually controls is boundary conditions (team splits, reporting lines, who shares meetings) — not the box diagram directly. (3) A non-trivial corollary for the AI era: a single AI-augmented developer = the team collapsed to 1 → Conway predicts their code will have far fewer defensive boundaries, will be more tightly integrated — and correspondingly harder for others to take over.

Practice: before designing architecture, draw the communication graph — who talks to whom daily, who reports to whom? That graph predicts what the code looks like in 6 months more accurately than any UML.

Conway's Law: architecture = the org's communication graph, projected into code

Classic example

Amazon's "two-pizza team" + "API-only mandate" (Bezos, 2002) is the canonical Inverse Conway. Bezos didn't say "let's build microservices." He said "all inter-team communication must go through service APIs." The communication structure was forcibly reshaped → six years later, AWS emerged as an externalizable product. Architecture isn't drawn — it's forced by org constraints.

BigCat scenario

(1) At work: two modules in a distributed system constantly fight at their boundary? Look at the people — do those two teams only talk in meetings? Tightening daily communication is more curative than refactoring the code. (2) At home: assigning the "piano practice" and "homework" modules to the same parent → boundaries blur automatically; splitting them across parents + syncing via a shared calendar → an "interface contract" emerges. Conway's Law doesn't care about company vs household. (3) AI era: you + a set of agents form a new "organization" — the bandwidth between you and your agents determines coupling in the artifacts you produce together. Narrow bandwidth = fragmented agent output.

AI Prompt

English Prompt

Our team org structure: [describe]. Target architecture: [describe]. Please: 1. Apply Conway's Law: under the current org, what architecture will actually emerge in 6 months? 2. Identify 2 team boundaries most likely to harden into defensive APIs. 3. Propose one minimal Inverse-Conway move (which communication line to redraw, who joins which meeting) that makes the target architecture feasible.

Technical Debt

Ward Cunningham, 1992 — not all debt is bad; what matters is the interest rate and the intent to repay

In Depth

The original metaphor is widely abused. "Hacky code = tech debt" is wrong. Cunningham's intent: ship with incomplete understanding, then refactor in what you learned post-release — that's the borrow. Interest = every new feature built on a flawed model compounds the future refactor cost.

Non-trivial: (1) conscious debt vs unconscious debt — the former is strategy (chase a deadline, validate PMF first); the latter is ignorance (the team didn't know a better way existed). Unconscious debt is the most dangerous: you don't know you're indebted. (2) Interest rates are not uniform — some debt sits dormant for 5 years (dead code, inconsistent style); some grinds velocity to zero in 6 months (wrong data model, wrong core abstraction, wrong deployment topology). Pay down high-interest debt, not "the ugliest-looking debt." (3) Every abstraction choice is a borrow — zero debt = zero output. A healthy team explicitly tracks what it borrowed, the rate, and when it plans to repay. (4) Structurally isomorphic to the Buddhist notion of karma: each commit plants a fruit the codebase must eventually face — but you can notice and actively repay, rather than passively suffer.

Practice: every quarter, maintain a "tech-debt ledger" — 3 columns: what you owe / estimated rate / planned repayment month. Debt not on the ledger = high-rate debt.

Classic example

Twitter's early Rails monolith was the correct debt choice — get PMF validated fast. But during the 2010 World Cup, when the site kept melting, the interest rate had become "exponential in user growth." It took three years to rewrite into Scala/JVM microservices (the "Fail Whale" era). Lesson: they borrowed right early (it kept Twitter alive); they paid back too late (waited until crisis forced it).

BigCat scenario

(1) Personal knowledge system debt: taking notes without categorizing = borrowing on "classification decisions." Low rate for 3 months (few notes, search works); high rate at 1 year (notes explode, nothing is findable). (2) "Parenting tech debt" is real — making decisions for a child to save time = borrowing against her future autonomy; the interest comes due in one lump sum at adolescence. (3) AI workflow: copy-pasting prompts = borrowing on "prompt systematization." Speedy upfront; six months later, none of your prompts are reusable, none version-controlled — you're paying high interest on last year's laziness. The point isn't to never borrow; it's to keep the books.

AI Prompt

English Prompt

My project/system: [describe]. Run a technical-debt audit: 1. List 5 current debts, ranked by estimated interest rate × current balance. 2. Tag each as "consciously borrowed" or "unconsciously incurred" — prioritize the latter. 3. Propose a 1-quarter repayment plan: which 2 to pay down, why these two, how much future velocity it buys back.

YAGNI & Premature Abstraction

"You Ain't Gonna Need It" — XP / "Duplication is far cheaper than the wrong abstraction." — Sandi Metz

In Depth

YAGNI is the discipline against "future-tense engineering": don't write code for imagined requirements. But the deep version isn't "write less code" — it's that abstraction needs data. You must see 3–5 real variants before the correct boundary is visible. Abstracting at variant #1 or #2 = fitting on insufficient samples; you'll inevitably overfit to the current two examples, and the abstraction will collapse the moment variant #3 arrives.

Non-trivial: (1) "Looks duplicated" ≠ "is conceptually duplicated." Two identical-looking blocks may be two instances of one concept (DRY them) or two distinct concepts that happen to look similar today (don't DRY). Premature DRY = locking two concepts into one abstraction; later, as they evolve independently, you'll painfully tear them apart. (2) Sandi Metz: "duplication is far cheaper than the wrong abstraction." Duplication is a linear problem (change in N places); a wrong abstraction is a coupling problem (every new requirement gets contorted by the wrong frame). (3) Reverse trap ("YAGNI overshoot"): claiming "I won't predict the future" about a wrong core data-model choice is a mistake. Architecture-level decisions (data model, deployment topology, core API contracts) must be predicted, because they carry the highest interest rate when wrong (see Card 3). YAGNI applies to the implementation layer, not the architecture layer. (4) "Wait for three before abstracting" is Martin Fowler's Rule of Three heuristic — tolerate duplication twice, abstract on the third occurrence when samples are sufficient.

Practice: as you write, ask "is the basis for this abstraction real observed variants or imagined variants?" The latter = YAGNI trigger. Stop.

Classic example

Knight Capital, 2012 — a dormant code path ("Power Peg") kept around for 8 years "in case it's needed." A deployment slip-up re-activated it; in 45 minutes the trading system fired runaway orders and lost $440M. The firm went under the next day. YAGNI isn't aesthetic hygiene: every line of code you "don't think will run" is a time bomb. Keeping "future-maybe" code in the repo = leaving an unexploded device for the future to trigger.

BigCat scenario

(1) Engineering: building an AI agent framework, you see 2 agents both have "retry + logging" — extract BaseAgent immediately? Stop. Wait for agent #3, then look at the shared structure of three. You'll find that the first two "retries" are semantically different (one is API rate-limit, the other is hallucination-resampling); forcing them together only entangles two different retries in one wrong interface. (2) Parenting: the first time your child says "I don't want to practice piano," constructing a whole "how to make her persist" system = premature abstraction (sample = 1). Observe 3–5 occurrences first to distinguish "tired today" / "teacher problem" / "genuinely dislikes it" — three root causes, three strategies; abstracting too early = one hammer, all nails. (3) Knowledge base: after reading 5 neuroscience papers, building "my neuroscience framework" = premature abstraction; the framework will almost certainly be torn down by paper #10. Draw the map at 30–50 papers. "Duplicate three times before abstracting" is the engineering counterpart of regularization against overfit.

AI Prompt

English Prompt

I'm about to add an abstraction / generic framework to [code/system/process]. Stress-test me: 1. How many real observed variants do I have? If < 3 → YAGNI trigger; argue me into duplicating first. 2. If I abstract now, which feature am I most likely overfitting to, and how does it break when variant #3 arrives? 3. Classify the decision as implementation-layer vs architecture-layer. If architecture-layer, name the minimal set of decisions that must be made now.