// Engineering Notes

One Human + a Fleet of Agents:
How This Hub Is Built

BigCat · 2026-07 · Tech Blog

BigCat's Learning Hub is a family of static sites I built for my own learning: mental models, AI/ML, system design, CS papers, philosophy, Buddhism, parenting, investing — 20+ subjects, each living in its own GitHub Pages repository. They update themselves daily, ship in both Chinese and English, and come with site-wide search, comments, and text-to-speech — while day-to-day operation requires almost none of my time. This post walks through how the whole pipeline is put together.

0. What it looks like

One hub landing page aggregates every site as a card, automatically shows each site's last-updated date, and stamps a "✓ Completed" badge on any site that has finished its roadmap.
Each subject site is an independent repo — ai-ml, mental-models, system-design, cs-papers-deepread — adding one page of in-depth content per day (or every other day / weekly), as a standalone Chinese HTML file plus a standalone English one.
Site-wide infrastructure: Pagefind static search (press /), Giscus comments, an image lightbox, language toggle + TTS narration — all provided by shared JavaScript; content pages don't carry a single line of it themselves.
Hosting cost is zero: everything runs on GitHub Pages and the free GitHub Actions tier; the only external service is Azure Speech's free tier for TTS.

The core idea in one sentence: I only maintain each site's roadmap (TOPICS.md); production, publishing, post-processing, aggregation, and inspection are all automated — with the "write the content" step delegated to Claude agents running on a schedule in the cloud.

1. Overall architecture

        ┌─ Human (me) ────────────────────────────────┐
        │  curate TOPICS.md roadmaps · set caps ·       │
        │  review monthly suggestions                   │
        └──────────────┬──────────────────────────────┘
                       ▼
  ┌── Generation: claude.ai cloud routines (20 triggers) ──┐
  │  staggered cron → repo mounted → follow CLAUDE.md spec  │
  │  pick topic (fs-idempotent) → write zh+en pages →       │
  │  update index → publish.sh                              │
  └──────────────┬─────────────────────────────────────────┘
                 ▼  git push
  ┌── Repos: 20+ content repos (GitHub Pages) ─────────────┐
  │  TOPICS.md · CLAUDE.md · .maxchars · publish.sh gate    │
  └──────────────┬─────────────────────────────────────────┘
                 ▼  on push
  ┌── Post-processing: per-repo GitHub Actions ────────────┐
  │  inject shared JS · (mental-models) Azure TTS bake      │
  └──────────────┬─────────────────────────────────────────┘
                 ▼  daily cron
  ┌── Aggregation: hub repo GitHub Actions ────────────────┐
  │  refresh-hub.yml re-renders index ·                     │
  │  build-search.yml rebuilds Pagefind index               │
  └──────────────┬─────────────────────────────────────────┘
                 ▼
  ┌── Governance ──────────────────────────────────────────┐
  │  auto-pause finished routines + English-page leak scan  │
  │  · monthly frontier-refresh meta-routine                │
  └────────────────────────────────────────────────────────┘

The layering principle: each layer trusts only the artifacts of the layer below it, never that layer's process. Generation can go wrong, so the repo layer has a validation gate; the gate can miss things, so the governance layer patrols.

2. Anatomy of a content repo: four files define a site

Besides the HTML pages, each content repo contains exactly four things, each with one job.

TOPICS.md — the human-curated roadmap (read-only to the bot)

The only part of the system that needs my ongoing attention. It lists, in order, every topic the site will cover; several sites also state a hard cap (mental models is capped at issue 68). The key design decision is one-way authority: the routine may only read it — publish.sh flatly rejects any commit that modifies TOPICS.md. When topics run out, the routine is not allowed to extend the roadmap itself; it can only send me a push notification asking for a refill.

Drawing the line here is what keeps the whole system on course: the AI decides how to write; the human decides what gets written.

CLAUDE.md — the writing spec

Sites that need precise control over layout and depth (system design, paper deep-reads, daily book deep-reads) carry a detailed execution spec: target reader (senior engineers), length band, required section structure (the paper site's nine-part skeleton: one-liner → glossary → context → problem & motivation → core idea → key results → impact → limitations & critiques → takeaways), color palette (each site has its own visual signature — system design is dark cyan, papers are amber-copper), and honesty requirements — uncertain quotes must be marked as paraphrase, and limitations and counterarguments must be written.

The most useful trick in these specs: anchor on a published page. The prompt points at a specific article ("match read1 for depth, format, and voice") and says: do it like that. A concrete exemplar is far more stable than any prose description of style.

.maxchars — the length ratchet

A one-line file containing a single number (3500/4000/5000). publish.sh counts the CJK characters of every new page and enforces this ceiling. This one was earned the hard way: LLM-generated series exhibit a "length ratchet" — each article comes out slightly longer than the last, because the model anchors on the most recent pages, and a few dozen days later the pages have bloated out of control. The fix clamps from both sides: a target band in the prompt, a hard ceiling at the gate.

publish.sh — the publishing gate

All repos share the same ~150-line bash script; the routine must publish through it. What it checks:

New page > 2KB (no hollow pages)
CJK character count ≤ .maxchars (no bloat)
The new page is referenced from index.html (no orphan pages)
Balanced <div> tags (no truncated HTML — truncated LLM output really does happen)
No duplicate issue number (no overwriting existing content)
No hard-coded shared scripts (that's the injection layer's job — see below)
TOPICS.md untouched
All green → automatic git add/commit/push, with the commit message normalized to Add #N: title

That last convention is no small thing — the commit message itself becomes a machine-readable publishing record; downstream completion detection and hub badges all work by parsing it.
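
The content checks in the list above can be sketched in a few lines. The real gate is a bash script; this is a Python illustration only, and it makes assumptions the post does not state: that "CJK character" means the U+4E00 to U+9FFF block, that the count runs over the raw HTML, and that "referenced" means the filename appears in index.html. The function names are hypothetical.

```python
import re

# Assumption: "CJK character" = CJK Unified Ideographs block only.
CJK = re.compile(r"[\u4e00-\u9fff]")

def count_cjk(text: str) -> int:
    """Count CJK ideographs in a string (here: the raw HTML of a page)."""
    return len(CJK.findall(text))

def divs_balanced(html: str) -> bool:
    """Cheap truncation check: as many opening <div> tags as closing ones."""
    opens = len(re.findall(r"<div\b", html, flags=re.I))
    closes = len(re.findall(r"</div\s*>", html, flags=re.I))
    return opens == closes

def gate_failures(page_name: str, page_html: str,
                  index_html: str, max_chars: int) -> list[str]:
    """Return human-readable reasons to reject a page; empty list means pass."""
    failures = []
    if len(page_html.encode("utf-8")) <= 2048:
        failures.append("hollow page: not larger than 2KB")
    if count_cjk(page_html) > max_chars:
        failures.append("bloat: CJK count exceeds .maxchars")
    if page_name not in index_html:
        failures.append("orphan: not referenced from index.html")
    if not divs_balanced(page_html):
        failures.append("truncated: unbalanced <div> tags")
    return failures
```

Each check is a string count or a comparison, so a failure is easy to explain and a pass is reproducible, which is the point of keeping the validator dumb.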

3. The generation engine: cloud Claude routines

Content is produced by scheduled agents ("routines") running on claude.ai — currently 20 triggers. Each trigger's job consists of:

A cron expression: all routines are staggered across 10:00–16:00 UTC at roughly 15-minute intervals so they never run at once
A mounted repository: the working directory is the corresponding GitHub repo — the agent can ls, read old pages, and git push
The model: Opus tier — depth of content is this system's whole reason to exist, so this is not where you economize
A tool allowlist: Bash / Read / Write / Edit / Glob / Grep / WebSearch / WebFetch / PushNotification — enough and no more
One prompt: a complete description of the loop from topic selection to publication
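
The staggering in the first item can be generated rather than written by hand. A minimal sketch, assuming daily schedules, a 10:00 UTC start, and an exact 15-minute step (the post only says "roughly"); the function name is hypothetical.

```python
from datetime import datetime, timedelta

def staggered_crons(n_triggers: int, start: str = "10:00",
                    step_minutes: int = 15) -> list[str]:
    """Return one daily cron expression (minute hour * * *) per trigger,
    each offset from the previous one by step_minutes. Times are UTC."""
    t = datetime.strptime(start, "%H:%M")
    crons = []
    for _ in range(n_triggers):
        crons.append(f"{t.minute} {t.hour} * * *")
        t += timedelta(minutes=step_minutes)
    return crons
```

With 20 triggers the last slot lands at 14:45 UTC, which stays inside the 10:00 to 16:00 window.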

Take the daily book deep-read routine. Its prompt is five steps:

Pick a topic: choose the lowest-numbered entry in TOPICS.md not yet done — where "done" is determined by ls *-read*.html. The filesystem is the database: no state store anywhere, and repeated firings are naturally idempotent, because the next run sees a different file listing.
Write: follow CLAUDE.md's section structure; err on the side of depth; gloss technical terms in English on first mention; never fabricate.
Land both languages: {slug}-read{N}.html + {slug}-read{N}.en.html, each required to read natively rather than as a stiff translation, cross-linked via a language bar; both language index pages get updated too.
Publish: run ./publish.sh; through the gate means live.
Notify: a PushNotification lands on my phone — "updated + one-line essence + link."

The prompt's last sentence is "complete autonomously, wait for no confirmation" — nobody is present when a cloud routine runs, so any step that waits for approval is a deadlock.
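
The "filesystem is the database" step is small enough to sketch. In the real system the agent does this by reading TOPICS.md and running ls; the Python below only illustrates the logic. It assumes roadmap lines look like "12. Some title", which the post does not specify, and the function name is hypothetical.

```python
import re
from typing import Optional

# Assumed roadmap line format: "<number>. <title>"
TOPIC_LINE = re.compile(r"^\s*(\d+)\.\s+(.+?)\s*$")
# Matches both {slug}-read{N}.html and {slug}-read{N}.en.html
PAGE_NAME = re.compile(r"-read(\d+)(?:\.en)?\.html$")

def next_topic(topics_md: str, filenames: list[str]) -> Optional[tuple[int, str]]:
    """Return the lowest-numbered roadmap entry with no published page,
    or None when the roadmap is exhausted."""
    done = {int(m.group(1)) for name in filenames
            if (m := PAGE_NAME.search(name))}
    topics = sorted(
        (int(m.group(1)), m.group(2))
        for line in topics_md.splitlines()
        if (m := TOPIC_LINE.match(line))
    )
    for number, title in topics:
        if number not in done:
            return number, title
    return None
```

Because the only input is the file listing, a second firing after a successful publish picks the next topic, and a firing after a failed publish picks the same topic again. A None result corresponds to the "ask for a refill" case.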

Beyond the 20 content routines there are two meta-routines:

Monthly frontier refresh: on the 1st of each month, check whether the roadmaps of fast-moving fields (AI/ML, agent engineering) have fallen behind, verify candidates via web search, and write suggestions into the hub repo's ROADMAP-SUGGESTIONS.md — suggestions only; it never edits any TOPICS.md directly. Classical fields (philosophy, Buddhism, mathematics) default to "nothing new this month." Every suggestion must cite a source verified to exist; when in doubt, leave it out.
Cross-reference sync: periodically cross-links ai-ml (academic concepts) and super-individual (engineering practice) and keeps the two from covering the same ground twice.

4. Post-processing: what happens after the push

The routine clocks out after pushing, but the page isn't in final form yet. Each repo's GitHub Actions take over for two kinds of post-processing.

Shared-script injection (all repos)

Comments, search, bilingual TTS, navigation buttons, and the lightbox are all provided by shared scripts hosted in the hub repo (comments.js, search.js, i18n-tts.js, index-button.js, lightbox.js). Content pages are forbidden from hard-coding these script tags (publish.sh blocks it); instead, an injection Action scans the HTML after each push, adds whatever is missing, and auto-commits as Auto-inject shared scripts.

Why injection instead of baking the tags into the generation template? Because infrastructure must be able to evolve independently of content. With 20+ repos and hundreds of pages, script tags hard-wired into templates would mean rewriting 20 prompts and re-touching hundreds of pages just to upgrade the search script. Under injection, you change the shared script once and the whole fleet picks it up on the next pass, and old pages get retrofitted for free.
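
An "add only what's missing" pass can be sketched as follows. This is an illustration, not the actual Action: the base URL is a placeholder, the defer attribute and the insertion point before the closing body tag are assumptions, and the function name is hypothetical. Only the five script filenames come from the description above.

```python
import re

SHARED_SCRIPTS = ["comments.js", "search.js", "i18n-tts.js",
                  "index-button.js", "lightbox.js"]
HUB_BASE = "https://example.invalid/hub"  # placeholder, not the real hub URL

def inject_missing(html: str, base: str = HUB_BASE) -> str:
    """Insert a <script> tag for every shared script the page lacks.
    Running it again on its own output changes nothing."""
    missing = [s for s in SHARED_SCRIPTS if f"{base}/{s}" not in html]
    if not missing:
        return html
    tags = "".join(f'<script src="{base}/{s}" defer></script>\n' for s in missing)
    # Insert before the last </body>; append at the end if there is none.
    closers = list(re.finditer(r"</body\s*>", html, flags=re.I))
    idx = closers[-1].start() if closers else len(html)
    return html[:idx] + tags + html[idx:]
```

The early return is what lets the Action skip its auto-commit when a page is already complete.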

Azure TTS bake (mental-models)

Every section of the mental-models site is click-to-listen, and the audio is pre-baked:

A push triggers bake-tts.yml → runs bake-tts.py
Text is extracted per <h2> section → hashed → check whether audio/zh/<hash>.mp3 already exists → only call the Azure Speech REST API when it doesn't
The script then writes data-tts-zh="<hash>" back onto the HTML element; the front-end i18n-tts.js resolves audio by that attribute. Sites without pre-baked audio degrade gracefully to the browser's Web Speech API
The bake's auto-commit is tagged [skip bake], breaking the "commit triggers bake, bake makes a commit" loop

Hash addressing makes the whole chain idempotent: unchanged content costs zero API quota, and editing one section re-bakes only that section. The site has accumulated 500+ mp3 files, about 1.2 GB — all sitting in the git repo, served directly by Pages, at zero extra storage cost.

(This TTS chain originally ran on Volcano Engine and was later migrated wholesale to Azure; the old script survives as a .bak fossil.)
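
The hash-addressed bake described in this section can be sketched like this. It is not bake-tts.py: the synthesize callable stands in for the Azure Speech call, and the hash function (SHA-256 truncated to 16 hex characters) and the regex-based section extraction are assumptions made for the example.

```python
import hashlib
import re
from pathlib import Path
from typing import Callable

def section_texts(html: str) -> list[str]:
    """Split the page at each <h2> and strip tags.
    A crude stand-in for a real HTML parser."""
    parts = re.split(r"(?i)<h2\b", html)[1:]
    return [" ".join(re.sub(r"<[^>]+>", " ", "<h2" + p).split()) for p in parts]

def bake(html: str, audio_dir: Path,
         synthesize: Callable[[str], bytes]) -> list[str]:
    """Ensure one mp3 per section exists under audio_dir, named by content hash.
    synthesize is called only for sections whose file is not there yet."""
    audio_dir.mkdir(parents=True, exist_ok=True)
    hashes = []
    for text in section_texts(html):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
        target = audio_dir / f"{digest}.mp3"
        if not target.exists():
            target.write_bytes(synthesize(text))
        hashes.append(digest)
    return hashes
```

The returned hashes are what would be written back as data-tts-zh attributes. Since the filename is derived from the text, editing one section changes only that section's hash and leaves every other file a cache hit.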

5. Aggregation: how the hub tracks 20 repos beneath it

The hub repo is fully automated too, via two daily Actions.

refresh-hub.yml — re-rendering the landing page

Runs generate_hub.py, a textbook case of "one source of truth, two language renders": card metadata (titles, bilingual blurbs, palettes, sections) lives in a single CARDS array, and both the Chinese and English pages render from it — there are never two HTML files to keep in sync.
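
A reduced sketch of the idea, not the contents of generate_hub.py: the field names, the markup, and the card texts below are placeholders invented for the example. Only the pattern (one CARDS list, one render function, two languages) is taken from the description above.

```python
from html import escape

# Placeholder data; the real CARDS entries also carry palettes and sections.
CARDS = [
    {"repo": "ai-ml",
     "title": {"zh": "AI/ML", "en": "AI/ML"},
     "blurb": {"zh": "占位简介", "en": "Placeholder blurb"}},
    {"repo": "system-design",
     "title": {"zh": "系统设计", "en": "System Design"},
     "blurb": {"zh": "占位简介", "en": "Placeholder blurb"}},
]

def render_card(card: dict, lang: str) -> str:
    """Render one card in the requested language ('zh' or 'en')."""
    return (f'<a class="card" href="/{escape(card["repo"])}/">'
            f'<h3>{escape(card["title"][lang])}</h3>'
            f'<p>{escape(card["blurb"][lang])}</p></a>')

def render_index(lang: str) -> str:
    """Render every card; both language pages come from the same list."""
    return "\n".join(render_card(c, lang) for c in CARDS)
```

Adding a site means adding one dict, and a missing translation fails loudly with a KeyError instead of silently diverging between the two pages.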

The dynamic parts come from the GitHub REST API:

For each repo, fetch the last 50 commits and find the most recent one that actually added a content page (filename regex matching -dayN/-weekN/-readN.html), and render that date on the card — so the badge reflects "content last updated," not a "last commit" date polluted by the bot's injection commits.
Completion detection: the script holds a CAPS table (which site caps at which issue), parses the maximum Add #N from commit messages, and when a site reaches its cap, swaps the date for a "✓ Completed" badge. This is the downstream payoff of the commit-message convention — no one has to mark a site finished; each site announces its own graduation.
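
Both rules reduce to a scan over the commit list. The sketch below assumes the commits have already been fetched and flattened into dicts with message, date, and added (filenames added by that commit), newest first; that shape, the function names, and the plain-string badge are assumptions for illustration.

```python
import re
from typing import Optional

CONTENT_PAGE = re.compile(r"-(?:day|week|read)\d+\.html$")
ADD_MSG = re.compile(r"^Add #(\d+):")

def last_content_date(commits: list[dict]) -> Optional[str]:
    """Date of the newest commit that added a content page (newest-first input)."""
    for c in commits:
        if any(CONTENT_PAGE.search(name) for name in c["added"]):
            return c["date"]
    return None

def max_issue(commits: list[dict]) -> int:
    """Highest N seen in 'Add #N: title' commit messages, or 0 if none."""
    return max((int(m.group(1)) for c in commits
                if (m := ADD_MSG.match(c["message"]))), default=0)

def badge(commits: list[dict], cap: Optional[int]) -> str:
    """'✓ Completed' once the cap is reached, otherwise the last content date."""
    if cap is not None and max_issue(commits) >= cap:
        return "✓ Completed"
    return last_content_date(commits) or ""
```

Injection commits add no content page and carry no Add #N message, so they fall through both scans without special-casing.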

Timing-wise, this Action runs about 45 minutes after the last content routine of the day, so the day's new pages always make it to the landing page.

build-search.yml — site-wide search

Static sites have no backend, so search is Pagefind: every day, clone all content repos into _src/ as one tree, run Pagefind to build a sharded index (Chinese and English indexed separately), and publish it to /pagefind/ in the hub repo. The front-end search.js provides a floating search button and opens the search overlay in the page's language. Footers, navigation, and comment containers are excluded from indexing as noise.

One special case: the Thinker Roundtable site (thinker-arena) is client-rendered — its content lives in JSON, invisible to a crawler. The fix is render_search_snapshots.py, which renders the JSON debates into plain HTML snapshots purely for Pagefind to consume before indexing — SSR for the crawler's benefit, just as a daily batch job.

6. Governance: making the system police itself

Getting it running is the easy part; running unattended for the long haul is the hard part. The lines of defense:

Caps + auto-pause

Every site's roadmap is finite: when the material is learned, the site should graduate, not pad itself for the sake of a daily streak. The cap is recorded in three places: TOPICS.md (visible to the routine), the hub's CAPS table (badges), and a local scheduled inspection task. Weekly, that task checks each capped site's highest published issue, and any site that has reached its cap gets its cloud trigger flipped to enabled=false via the API.

The pause operation has two safety interlocks, both paid for with real mistakes:

Before pausing, fetch the trigger and confirm that the repo it mounts really is the one being retired. This guards against a trigger ID mapped to the wrong site;
The update body sends only {"enabled": false}, because this API's update replaces job_config wholesale rather than merging — sending a partial job_config would wipe out the prompt, the mounted repo, and the model configuration.
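
The two interlocks fit in one small function. The trigger API itself is not documented here, so the sketch takes get_trigger and update_trigger as injected callables, and it assumes the mounted repo is found at job_config.repo in the response. All of those names are hypothetical.

```python
from typing import Callable

def pause_trigger(trigger_id: str, expected_repo: str,
                  get_trigger: Callable[[str], dict],
                  update_trigger: Callable[[str, dict], None]) -> bool:
    """Pause a trigger only if it mounts the repo we mean to retire.
    Returns True if the pause request was sent."""
    trigger = get_trigger(trigger_id)
    # Interlock 1: verify the ID really maps to the expected repo.
    # The job_config.repo path is an assumed response shape.
    mounted = trigger.get("job_config", {}).get("repo")
    if mounted != expected_repo:
        return False
    # Interlock 2: send only the enabled flag, never a partial job_config,
    # because the update replaces job_config wholesale.
    update_trigger(trigger_id, {"enabled": False})
    return True
```

Injecting the two callables also makes the function testable with fakes, which matters for an operation whose failure mode is wiping a prompt.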

Chinese-leak detection on English pages

The most common failure mode of bilingual generation is Chinese leaking into the English page's template slots (subtitles, tags, name fields). The same inspection task runs a set of fingerprint greps (like class="en">[一-鿿]) over every repo's *.en.html and reports any hit. The fingerprints are chosen carefully: legitimate Chinese, like Buddhist scripture alongside its English translation or glossed terms, doesn't use those classes and so doesn't trigger false positives.
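
The quoted fingerprint translates directly into a regular expression. A minimal sketch covering just that one fingerprint; the real inspection task runs a set of them, and the function name is hypothetical.

```python
import re

# The fingerprint from the text: a class="en" slot whose content starts
# with a character in U+4E00..U+9FFF (written 一-鿿 in the grep).
LEAK = re.compile(r'class="en">[\u4e00-\u9fff]')

def find_leaks(html: str) -> list[int]:
    """Return 1-based line numbers of an English page that match the fingerprint."""
    return [i for i, line in enumerate(html.splitlines(), 1)
            if LEAK.search(line)]
```

The pattern keys on the template slot rather than on the presence of Chinese anywhere in the page, which is why quoted scripture or glossed terms in other elements pass untouched.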

The quality ratchets

Length ratchet: the .maxchars hard ceiling plus the prompt's target band, as above
Truthfulness discipline: every routine's prompt carries a "never fabricate, verify sources first, if you can't find it don't write it" clause; the monthly refresh routine further requires a verified link on every suggestion
Staged rollout: new routines run a small validated batch before scaling up (especially TTS baking, where one structural mistake burns a whole month of Azure quota)

7. Design principles that emerged

Static first. No database, no server, no build framework. HTML is the artifact, git is the CMS, GitHub Pages is the CDN. Search (Pagefind), comments (Giscus), and TTS (pre-baked mp3s) all solve traditionally "needs a backend" problems with static answers.

The filesystem is the state. Topic progress = the output of ls; the publishing record = commit messages; completion status = parsed from commits. There is no independently maintained state store anywhere — so nothing can drift out of sync with reality.

Every step is idempotent. A re-fired routine won't rewrite existing pages (the file already exists); a re-run TTS bake won't double-bill (hash hit); a re-run injection won't stack tags (only adds what's missing). For unattended systems, retries are the norm — idempotency is a precondition, not an optimization.

Separate generation from validation. Never assume LLM output is correct; publish.sh backstops it with the dumbest possible bash checks. The smarter the generator, the dumber and more deterministic the validator should be.

The human guards exactly one entrance. All my recurring effort converges on TOPICS.md and cap decisions; everything else derives from them. The AI expanding its own mandate (extending the roadmap) is mechanically forbidden — it can only file a request.

Convention over protocol. The Add #N commit format, the {slug}-day{N}.html naming, the one-line .maxchars file — components talk through these humble conventions. There isn't a JSON schema anywhere, yet every convention has at least two consumers.

Graduate when it's time. Every site has a cap; when it's reached, the routine auto-pauses and the badge flips to Completed. This system serves my own learning — a finished field deserves closure, not hostage-taking by "consistent output."

8. Cost

GitHub Pages / Actions: within the free tier
Content generation: cloud routines included in a claude.ai subscription, no extra API bills
Azure Speech: free tier (F0)
My time: curating 20+ TOPICS.md files, plus the occasional glance at monthly suggestions and inspection reports

One person's attention is the most expensive resource in this system. The design goal of the pipeline was never "full automation" for its own sake — it was to free my attention from production and operations and spend all of it on the only thing worth spending it on: deciding what to learn next.