Day 8 · 2026.05.26

Writing: Technical Writing · Docs that Outlive their Authors

BigCat's Writing

A senior engineer's leverage lives less in code than in the writing that survives it — RFCs, ADRs, API docs, the occasional well-placed comment. None of these are written for today's reviewer; they're written for a stranger six months out (often a version of yourself who has lost the context). This week's four principles — design docs, Diátaxis, ADRs, and the philosophy of comments — turn "I once thought this through" into "I left a trace of the thinking." The investment most engineers skip.

Principle 01

Design Docs: Written for the You of Six Months From Now

Design Docs & RFCs — Writing as Thinking, Not Archiving
RFC · Design Doc
Principle + Master's Words

The first reader of a design doc isn't the reviewer — it's you in six months. The doc preserves your judgment at the time, the options you rejected, the "why not" — not a performance of how clever you were at submission.

"The single most important point of a design doc is to make the author reason about the design at the right level of abstraction." — Malte Ubl《Design Docs at Google》(industrialempathy.com, 2020)

The value is not in the artifact; the value is in forcing the author to reason at the right altitude — not at "what code will we write," not at "what shall we name the company strategy," but at "what tradeoff are we making, and why."

Why it works

Engineers often think "doc = record of decisions already made." Wrong. The real value of a design doc is in the writing itself: vague intuitions become sentences, and half the flaws fall out on the way to the keyboard. This is the engineering version of "writing is thinking."
And the future reader (you, or a new hire) won't ask "what is this" — they'll ask "why didn't we pick B?" That's why Alternatives Considered is the most long-tail-valuable section of the whole doc: it preserves the rejected options and the reasoning, so the next engineer doesn't re-walk a pothole you already mapped.

Context & Goals: Where the problem is; what counts as winning
Non-goals: Explicit "not solving", the shield against scope creep
Design: A paragraph or two + a diagram. Intent, not code
Alternatives Considered: ≥ 3 options, each with "why not"; the section with the longest tail of value
Risks & Open Questions: Known risks + what's still unresolved (named honestly)
Rollout & Rollback: How to ship; how to retreat if it goes wrong
Six-section skeleton. Full ≠ good — but every missing section is where future "strange coincidences" originate.
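A minimal sketch of how the skeleton could be checked mechanically, assuming the design doc is written in markdown with one heading per section. The function name `missing_sections` and the sample draft are illustrative, not part of any existing tool.

```python
import re

# The six names come from the skeleton above. Matching them against
# markdown headings is an assumption about how the doc is formatted.
REQUIRED_SECTIONS = [
    "Context & Goals",
    "Non-goals",
    "Design",
    "Alternatives Considered",
    "Risks & Open Questions",
    "Rollout & Rollback",
]


def missing_sections(doc_text: str) -> list[str]:
    """Return the required section names that have no matching heading."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^#+[ \t]+(.+)$", doc_text, flags=re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]


# Hypothetical draft that skips half the skeleton.
draft = """# Event pipeline
## Context & Goals
## Design
## Alternatives Considered
"""

print(missing_sections(draft))
```

A check like this only proves the headings exist. Whether each alternative carries its "why not" still needs a human reader.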
Revision examples
Before: We will use Kafka for the event pipeline because it is scalable and reliable. (Empty: "scalable / reliable" is true of every candidate, so it says nothing.)
After: Chose Kafka over Pulsar after benchmarking both at our 10K msg/s peak (3 days each). Pulsar's broker-bookie split adds two ops surfaces we don't have headcount for. Tradeoff accepted: weaker geo-replication, which is fine through Q3 because all consumers are us-east. Revisit when EU goes live.

Before: This design adopts a microservice architecture to improve system scalability. (Same empty pattern: "improves scalability" tells the reader nothing about why this option, this tradeoff.)
After: Between "monolith + modular packages" and "three microservices," we chose the latter. The deciding factor: paid-tier and free-tier release cadences have diverged 3x over the last 8 weeks (data: 12 vs 4 deploys); shared-repo merge conflicts are the bottleneck. Rejected third option, "event-sourced rewrite": no one on the team has shipped event sourcing in two years; the cost of a new paradigm outweighs the benefit.
When to use · Common errors
  • ✓ Any decision ≥ 2 weeks / 1 person-month · cross-team interface change · architectural baseline migration
  • ✗ Skip for: single-file refactors · product micro-copy tweaks — a PR description suffices
  • Error 1: Writing only after the decision is made — the doc should precede the call, not archive it
  • Error 2: Listing alternatives without the "why not" — losing all long-tail value
  • Error 3: Pseudocode or class diagrams in place of design — the reader cannot read intent or tradeoff
  • Error 4: No Non-goals — reviewers keep asking "what about X" that was never in scope
Key references

Malte Ubl《Design Docs at Google》industrialempathy.com 2020 · Will Larson《An Elegant Puzzle》Ch. 4: the leverage of decisions and memos in engineering orgs · Gergely Orosz "How Big Tech Runs Tech Projects": comparative RFC / TDD practices

This Week's Exercise + Reflection

Exercise: Pick your next RFC. Cut "Implementation Details" by half; expand "Alternatives Considered" to twice its current size. Have a cross-team peer read it first — every "why not X" they ask is an X that belongs in Alternatives.
Reflect: As AI authorship of RFCs rises, who is doing the "reasoning about the design"? Does AI-drafted prose let the author skip the thinking step? What human–AI split preserves the "writing is thinking" return?

Principle 02

API Docs: Four Functions, Four Pages

API Docs & Diátaxis — Four Modes That Cannot Share a Page
API · Diátaxis
Principle + Master's Words

One page cannot simultaneously teach beginners, serve as reference, walk through tasks, and explain design intent. Separate the four — tutorials, how-to guides, reference, explanation — and let each reader find their lane.

"Documentation needs to include and be structured around its four different functions: tutorials, how-to guides, technical reference and explanation. Each of them requires a distinct mode of writing." — Daniele Procida《Diátaxis》(diataxis.fr · PyCon AU 2017)

From "What nobody tells you about documentation" — the talk that became the framework most modern doc sites (Django, NumPy, GitLab, Cloudflare) eventually adopted.

Why it works

What happens without the split? Beginners stare at the reference; experts skim the tutorial for the field they need; SREs hunting how-to-guides drown in design philosophy. Everyone is annoyed. Diátaxis maps docs across two axes — vertical "study vs work," horizontal "practical vs cognitive." Four quadrants, four jobs.

Axes: rows are Study / Work; columns are Practical (Action) / Cognitive (Theory).
Study × Practical: Tutorials (learning-oriented). "Follow me." One happy path, 10 minutes to a working Hello World. Don't explain why.
Study × Cognitive: Explanation (understanding-oriented). "Why is it this way?" Background, tradeoffs, history. For the reader who's up and running and now curious.
Work × Practical: How-to guides (problem-oriented). "How do I handle webhook retries?" Recipes for known problems. Quick, specific.
Work × Cognitive: Reference (information-oriented). Full API tables, fields, types, errors. The kind of doc a machine could generate. Cold, complete, storyless.
Four quadrants. Stripe's docs are the gold standard because they enforce the split rigorously and link generously across the four.
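The two axes can be sketched as a small lookup, which also makes it easy to spot a mode with zero coverage in an existing doc. This is an illustration only: `Mode`, `QUADRANTS`, `mode_for`, `missing_modes`, and the sample section tags are hypothetical names, not part of Diátaxis itself.

```python
from collections import Counter
from enum import Enum


class Mode(Enum):
    TUTORIAL = "tutorial"
    HOW_TO = "how-to guide"
    REFERENCE = "reference"
    EXPLANATION = "explanation"


# (reader is studying?, content is practical?) -> mode
QUADRANTS = {
    (True, True): Mode.TUTORIAL,      # study + practical
    (True, False): Mode.EXPLANATION,  # study + cognitive
    (False, True): Mode.HOW_TO,       # work + practical
    (False, False): Mode.REFERENCE,   # work + cognitive
}


def mode_for(studying: bool, practical: bool) -> Mode:
    """Map a position on the two axes to one of the four modes."""
    return QUADRANTS[(studying, practical)]


def missing_modes(section_tags: dict[str, Mode]) -> list[Mode]:
    """Return the modes that no section of the doc has been tagged with."""
    counts = Counter(section_tags.values())
    return [mode for mode in Mode if counts[mode] == 0]


# Hypothetical tagging of a single "Getting Started" page.
tags = {
    "Install": Mode.TUTORIAL,
    "Quickstart": Mode.TUTORIAL,
    "API table": Mode.REFERENCE,
}

print(mode_for(studying=False, practical=True))
print(missing_modes(tags))
```

The tagging itself stays a human judgment. The sketch only counts what has been tagged.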
Revision examples
Before: Single "Getting Started" page: 300 lines mixing install, quickstart, full API table, and design philosophy. New users stall on the API table; experts dig through stories to find a field.
After: Split into four: ① Tutorial ("10 minutes to your first charge," single happy path) · ② How-to ("Handle webhook retries," "Test refunds in sandbox") · ③ Reference (full API + fields, auto-generated) · ④ Explanation ("Why idempotency keys look like this"). The landing page is now a router, not a textbook.

Before: A single markdown file blending function signature, usage demo, caveats, and design philosophy. The reader cannot tell where one ends and the next begins.
After: Signature & demo go to Reference. Caveats go to How-to ("Handling rate-limits"). Design philosophy goes to Explanation. The Reference page links to both, so scanners stay and "why" readers click through.
When to use · Common errors
  • ✓ Public APIs · SDKs · internal developer platforms · CLI tools
  • ✓ Doc-site refactors — Diátaxis is information architecture, not a styling decision
  • ✗ Skip for: single-file libraries (one README is enough) · one-off scripts
  • Error 1: Cramming all four into the README — the README is a foyer, not the full collection
  • Error 2: Tutorials that read like reference — stacking API tables instead of walking the happy path
  • Error 3: Reference that reads like a tutorial — story instead of field definitions
  • Error 4: Putting Explanation on the front page — most arriving readers want how, not why; Explanation is for those who are up and running
Key references

Daniele Procida《Diátaxis: A systematic framework for technical documentation》diataxis.fr · Stripe API Docs docs.stripe.com — textbook Diátaxis in production · Google《Technical Writing Courses》developers.google.com/tech-writing — free two-stage course

This Week's Exercise + Reflection

Exercise: Pick a doc you've written and tag each section against the four Diátaxis modes. If any mode is at zero, ask: is it truly unnecessary, or did you skip it by default? Missing tutorials or explanation is often why a doc "looks complete but feels stuck" in use.
Reflect: When LLMs become the "intermediate reader" for most APIs (developers querying your API through ChatGPT), should docs be written for humans or for LLMs? Which of the four quadrants do LLMs handle worst — tutorials? explanation?

Principle 03

ADRs: Time-Stamp the Decision

Architecture Decision Records — Immutable Memos for the Future
ADR · Decision Record
Principle + Master's Words

A decision = context + choice + consequences. Drop any one and six months later it reads as a "strange coincidence." An ADR time-stamps the decision — once written, never edited; to change your mind, write a new ADR marked "Supersedes ADR-042."

"We will keep a collection of records for 'architecturally significant' decisions: those that affect the structure, non-functional characteristics, dependencies, interfaces, or construction techniques." — Michael Nygard《Documenting Architecture Decisions》(Cognitect blog, 2011)

Three pages of blog that changed how a generation of engineering organizations track decisions. ThoughtWorks moved "Lightweight ADRs" to Adopt on its Tech Radar in 2018.

Why it works

Design docs are about design; ADRs are about decisions. They differ. A design doc may evaluate three options; an ADR is the gavel: "We picked A."
What matters in an ADR isn't the format (five fields, that's it) — it's the immutability stance: once written, never edited. This is git-commit thinking, not wiki thinking. If you change your mind, write a new ADR with "Supersedes ADR-042"; the original stays. History isn't rewritten, and learning compounds.

Title: ADR-042: Migrate to Postgres 15
Status: Accepted · 2024-05-10 · authors: bc, jh · superseded by ADR-067 (2025-03)
Context: PG12 reaches EOL in 2024-11, when security patches end. PG16 was released 2023-09 (8 months old, low adoption). PG15 has been GA since 2022-10 and is mature. Our org is conservative on database majors.
Decision: Migrate production and replicas to PG15 (not 16). Two-week window over a Q3 weekend.
Consequences: + JSONB performance gain · + EOL exit before EU regulatory review · − must rewrite 12 JSONpath queries to PG15-compatible form · − lose access to PG16's improved logical replication; revisit in 12 months.
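The "never edit, supersede instead" stance can be sketched as an append-only log of frozen records. Everything here is an assumption for illustration: the `ADR` and `ADRLog` names, the choice to derive status from the chain rather than store it, and the placeholder content of ADR-067 (the text above only says that it supersedes ADR-042).

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ADR:
    number: int
    title: str
    context: str
    decision: str
    consequences: str
    supersedes: Optional[int] = None


class ADRLog:
    """Append-only collection: records are added, never edited or removed."""

    def __init__(self) -> None:
        self._records: dict[int, ADR] = {}

    def accept(self, adr: ADR) -> None:
        if adr.number in self._records:
            raise ValueError(f"ADR-{adr.number:03d} exists; write a new ADR instead")
        if adr.supersedes is not None and adr.supersedes not in self._records:
            raise ValueError(f"ADR-{adr.supersedes:03d} not found")
        self._records[adr.number] = adr

    def status(self, number: int) -> str:
        """Derive status from the chain instead of editing the old record."""
        if number not in self._records:
            raise KeyError(number)
        for adr in self._records.values():
            if adr.supersedes == number:
                return f"Superseded by ADR-{adr.number:03d}"
        return "Accepted"


log = ADRLog()
log.accept(ADR(42, "Migrate to Postgres 15", "PG12 nears EOL",
               "Move to PG15", "Rewrite 12 queries"))
# Placeholder successor: its real content is not given in the text above.
log.accept(ADR(67, "Hypothetical successor", "New evidence",
               "New choice", "New tradeoffs", supersedes=42))

print(log.status(42))
print(log.status(67))
```

In a repository the same effect usually comes from one file per ADR plus git history. The sketch only makes the rule explicit: the old record is never touched, and its "superseded" status is a fact about the chain.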
Revision examples
Before: PR description: "Switch to Postgres 15." (Eighteen months later a new engineer reads the repo: "Why PG15 and not PG16?" No one can answer.)
After: ADR-042 with the five fields above. The doc lives at docs/adr/0042-postgres-15.md. Never edited. Eighteen months later the new engineer reads the ADR: decision and context obvious. "They weren't dumb; they were risk-averse."

Before: Wiki page "Database choice," edited 12 times by 5 people over six months. No one can tell why PG was chosen. Every new hire reopens the same argument.
After: Same decision migrated to ADR-042 plus immutable git history. The "why PG" question closes permanently: link to the ADR, 5 minutes to read. New hires who disagree write ADR-067 proposing an alternative with counter-evidence. Progress moves along the ADR chain; history isn't lost.
When to use · Common errors
  • ✓ Framework · database · protocol · regional-infra choices · company-wide build vs buy · critical-dependency up/down-grades
  • ✓ Any decision where the future will ask "why this?"
  • ✗ Skip for: day-to-day implementation calls (third-party package, variable name) — too frequent, ADRs become noise
  • Error 1: Treating ADRs as wiki — they get edited, no one remembers the original. ADRs are immutable.
  • Error 2: Decision but no Context — the reader cannot judge "does this still hold?"
  • Error 3: No Consequences — the next engineer doesn't see the tradeoff the author already understood, re-discovers it the hard way
  • Error 4: Written too late — backfilling an ADR after execution is "writing the exam after taking it"; the reasoning trace is gone
Key references

Michael Nygard《Documenting Architecture Decisions》Cognitect blog 2011 — the five-field source · ThoughtWorks《Technology Radar》— "Lightweight Architecture Decision Records" (Adopt, 2018) · adr.github.io — templates and community resources

This Week's Exercise + Reflection

Exercise: Take the most important technical decision you've made in the last six months (with or without an ADR) and backfill the five fields. Spend the most time on Context — back at that moment, what didn't you know that you know now? Write that delta.
Reflect: ADR immutability seems to fight organizational learning — people get smarter, knowledge updates. How do you design the relationship between ADRs and their successors so history isn't rewritten but learning still propagates?

Principle 04

Code Comments: WHY in Comments, WHAT in Names

Code Comments — Why & Why-not, Not What
Comments · Philosophy
Principle + Master's Words

Comments explain WHY and WHY-NOT; names explain WHAT. Overlap is noise. When you find yourself adding a line to explain what the code does, first ask: can I rename a variable or extract a function instead?

"The most important reason to write comments is abstraction: comments can describe things that can't be inferred from the code. Without comments, you can't hide complexity." — John Ousterhout《A Philosophy of Software Design》Ch. 12 (Stanford, 2018)

Pair with the epigraph of Robert C. Martin《Clean Code》Ch. 4, quoted there from Kernighan and Plauger: "Don't comment bad code — rewrite it." If you need a paragraph of prose to defend the function, the function is the problem.

Why it works

A three-layer division of labor:

Dimension | Carrier | Example
WHAT it does | Names | retryBudget · archiveUsersBefore()
HOW it does it | Code itself | Function body
WHY this way | Comments | # 3 because upstream SLA P95 is 800ms
WHY NOT another way | Comments | # not async: workers share a db conn pool
Invariants | Comments | # list must stay sorted: bisect downstream
Contracts | Docstrings | "Returns timestamp in UTC ms; raises TimeoutError after 5s"

One under-used role of comments: mark future invariants — e.g. # this list must stay sorted — binary search downstream. A documented constraint that warns the next engineer: "check before you change this."
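A minimal sketch of an invariant comment doing its job, using Python's standard `bisect` module. The `ScoreBoard` class and its method names are hypothetical and exist only to show where the comment sits.

```python
import bisect


class ScoreBoard:
    def __init__(self) -> None:
        # INVARIANT: self._scores must stay sorted ascending.
        # count_below() uses binary search and silently returns wrong
        # answers on unsorted data, so never append() to it directly.
        self._scores: list[int] = []

    def add(self, score: int) -> None:
        # insort inserts at the sorted position, preserving the invariant.
        bisect.insort(self._scores, score)

    def count_below(self, score: int) -> int:
        """Return how many stored scores are strictly less than `score`."""
        return bisect.bisect_left(self._scores, score)


board = ScoreBoard()
for value in (30, 10, 20):
    board.add(value)

print(board.count_below(25))
```

The comment sits at the declaration, where the next engineer will look before changing how the list is filled. It states the constraint and the failure mode rather than restating what `insort` does.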

Revision examples
Comment restates code — zero information:
# increment i
i += 1
Comment explains where "3" comes from:
# retry budget is 3: beyond this, upstream
# (payments-svc) gives up, so we should too
RETRY_BUDGET = 3
Comment is a vague fragment; magic numbers hardcoded:
# fix for legacy users created before 2018
if user.created_at < datetime(2018, 1, 1):
    user.role = "legacy"
Extract constants to carry the WHAT; the comment points to an ADR:
LEGACY_USER_CUTOFF = datetime(2018, 1, 1)
ROLE_LEGACY = "legacy"

# Pre-cutoff users predate the role-schema migration
# in ADR-019; default to LEGACY so they don't get
# write-locked by new RBAC checks.
if user.created_at < LEGACY_USER_CUTOFF:
    user.role = ROLE_LEGACY
Comment doing the work of a name:
// loop through users
for (const u of users) { ... }
Delete the comment — for (const u of users) is self-evident. A short form of Martin's rule: "Don't comment obvious code — delete the comment."
When to use · Common errors
  • ✓ Any non-obvious "why this number" / "why not that approach" / "future readers must be careful here"
  • ✓ Public APIs / SDKs / cross-team interfaces — contracts must live in docstrings (pre/post-conditions, exceptions, units)
  • ✗ Skip for: restating code · decorative banners · version numbers (git already tracks) · authorship (lives in LICENSE)
  • Error 1: Comments doing the WHAT job — rename or extract a function instead
  • Error 2: Lying comments — code changed, comment didn't. A stale comment is worse than no comment because it actively misleads
  • Error 3: JSDoc/docstrings as decoration — type-only, no semantics; the reader still doesn't know units or failure modes
  • Error 4: A paragraph of prose excusing a tangled function — if the function needs prose, the function is what to fix
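As a sketch of a docstring that carries a contract rather than decoration, here is one way the "UTC ms, TimeoutError after 5s" example could look. The function `wait_for_timestamp` and its `poll` and `interval_s` parameters are hypothetical; only the units and the 5-second timeout come from the example above.

```python
import time
from typing import Callable, Optional


def wait_for_timestamp(
    poll: Callable[[], Optional[int]],
    timeout_s: float = 5.0,
    interval_s: float = 0.01,
) -> int:
    """Poll until a timestamp is available.

    Args:
        poll: Returns a timestamp in UTC milliseconds, or None if not ready.
        timeout_s: Maximum time to wait, in seconds.
        interval_s: Sleep between polls, in seconds.

    Returns:
        Timestamp in UTC milliseconds since the Unix epoch.

    Raises:
        TimeoutError: If no timestamp arrives within `timeout_s` seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        value = poll()
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no timestamp after {timeout_s}s")
        time.sleep(interval_s)


print(wait_for_timestamp(lambda: 1_700_000_000_000))
```

The type hints already say "int in, int out". The docstring adds what types cannot: units, the epoch, and the failure mode.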
Key references

John Ousterhout《A Philosophy of Software Design》(2018, 2nd ed. 2021) Ch. 12–15 — the core text on comments and abstraction · Robert C. Martin《Clean Code》Ch. 4 — the 17 rules of comments, controversial but useful · Steve McConnell《Code Complete》Ch. 32 — self-documenting code

This Week's Exercise + Reflection

Exercise: In your most recent PR, apply a three-way test to every comment: (a) restates code → delete; (b) explains WHY → keep; (c) names could replace it → rename and delete. Bonus: find a comment from three months ago and check whether it still matches the code — update or delete. This is the most under-rated piece of comment hygiene.
Reflect: When AI can instantly generate "what this code does" comments, the marginal cost of WHAT-comments approaches zero. Is that a feature or a bug? Can AI generate WHY-comments — or is WHY forever the human author's territory?

Deep Dive

For Further Reading

Books, Essays, Talks
REF · Further
  • Malte Ubl《Design Docs at Google》industrialempathy.com 2020 — the public-facing description of Google's internal RFC culture
  • Daniele Procida《Diátaxis》diataxis.fr — the four-quadrant doc structure, the most influential tech-doc framework of the last decade
  • Michael Nygard《Documenting Architecture Decisions》Cognitect blog 2011 — three pages that changed how a generation of engineers track decisions
  • John Ousterhout《A Philosophy of Software Design》Ch. 12–15 — Stanford CS 190 textbook, source for the comments-and-abstraction view
  • Robert C. Martin《Clean Code》Ch. 4 — 17 rules of comments; controversial but practical
  • Will Larson《An Elegant Puzzle》Ch. 4 ·《Staff Engineer》Ch. 6 — writing as leverage inside engineering organizations
  • Google《Technical Writing Courses》developers.google.com/tech-writing — free two-stage course; thirty minutes that pays off for years
  • Stripe API Docs docs.stripe.com — textbook Diátaxis in production; worth reverse-engineering the information architecture
Reflection

Open Questions

For the Practitioner
Q · Open
1. After AI drafts the RFC, who owns the "writing-is-thinking" return?
The risk is real — AI removes the "things fall out while you type" effect, and the author skips the reasoning. A practical hedge: let AI handle the skeleton and the prose (section headers, transitions, grammar polish), but the author drafts the load-bearing sections by hand — Goals, Non-goals, Alternatives Considered, Risks. The test: after you're done, can you verbally justify every rejected alternative? If yes, you kept the return. If no, you outsourced too much.
2. How can ADR immutability and organizational learning coexist?
Three paths: (a) the ADR chain — new decisions write new ADRs marked "Supersedes ADR-042"; originals stay; (b) separate ADRs from RFCs — RFCs are mutable (still under discussion), ADRs are immutable once Accepted; (c) add a status field to ADRs allowing "Deprecated" but never an edit. The underlying tension is git-commit thinking vs wiki thinking: knowledge bases evolve, but the history of decisions must not be rewritten — otherwise no future reader can reconstruct "what they were thinking at the time."
3. Why is Diátaxis less adopted in Chinese tech-doc communities?
A few hypotheses: (a) Chinese SaaS/platform markets matured later, with a smaller body of public dev-facing docs; (b) Chinese doc tradition favors the "tutorial + reference combined into a definitive book" form (《XX 权威指南》, roughly "XX: The Definitive Guide"), the opposite of Diátaxis's "four independent forms"; (c) the tooling: most Diátaxis-style sites lean on Sphinx / MkDocs / Docusaurus, with thinner coverage of Chinese in the doc-as-code chain. A way in: introduce Diátaxis as an IA framework first, not a styling rule. Categorize before formatting.
4. Bilingual code comments — how should mixed-language teams choose?
Three cases: (a) Internal small tools — use the team's main language consistently; (b) Open source / cross-region collaboration — must be English, and term-consistent; (c) Mixed Chinese/foreign teams — a workable compromise is "WHY-comments in the main language, TODOs and link references in English." The worst trap is mixing — same-file Chinese-English comments break grep and confuse AI assistants. A simple rule: comments live in either the codebase's lingua franca or fully in another language — never a blend.
5. "Writing for future you" vs "writing for the new hire" — which audience produces better docs?
The "new hire" persona produces friendlier docs (background, term definitions) but skews toward tutorial-shape — content-heavy, decision-light. The "future you" persona produces tighter, decision-dense docs with fat Alternatives sections — because the future you is impatient and wants the rejected B in detail. A useful blend: write the body for "future you" (high decision density); add a 30-word "for context" first paragraph (the new-hire on-ramp). Both audiences served, neither diluted.