Day 14 Hard Feed System Push/Pull/Hybrid Fanout Ranking

Feed System — The Fanout Art of a 100M-DAU Following TimelinePush vs Pull vs Hybrid, Fanout-on-write Limits, Timeline Storage, Ranking Pipeline

Problem & Constraints

Design the backend for a 100M-DAU following timeline (Twitter/X Home, Instagram Feed, Weibo following stream): you follow a few hundred accounts, and pull-to-refresh must return their newest, sorted posts within p99 < 200ms. The hard part isn't storing posts — it's the read/write amplification asymmetry: one post must reach millions of followers, while one refresh must aggregate and sort across hundreds of followees. The first question: do you do the merge work at write time or read time?

High-Level Architecture (Hybrid Fanout)

graph TD
    POST["Post / Write API"] --> TW[("Tweet Store
source of truth")] POST --> FO["Fanout service
look up social graph"] FO -->|normal: push| TLC[("Home Timeline Cache
Redis · one bounded list/user")] FO -.celeb: skip fanout.-> CEL[("Celebrity posts
pulled at read time")] READ["Refresh / Read API"] --> MIX["Timeline mixing service"] MIX -->|① read pre-materialized| TLC MIX -->|② pull celeb latest| CEL MIX --> RANK["Rank + diversity"] RANK --> HYDRATE["Hydrate
fetch post bodies by id"] HYDRATE --> TW HYDRATE --> OUT["Return 20-post feed"] classDef w fill:#1a2530,stroke:#64c8ff,color:#e8eef5 classDef cache fill:#1a1a30,stroke:#ffb450,color:#e8eef5 classDef store fill:#2a1530,stroke:#ff7ab6,color:#e8eef5 class POST,READ,FO,MIX,RANK,HYDRATE,OUT w class TLC,CEL cache class TW store

Write path pre-materializes normal users' timelines; read path merges "pre-materialized" + "celebrity real-time pull" then ranks

Component roles: Tweet Store is the source of truth for posts (a KV sharded by tweet_id). The Fanout service, on each post, looks up the social graph and pushes the tweet_id into every follower's Home Timeline Cache — that's "do more on write". Celebrities are the exception: fanout cost is too high, so their posts aren't pre-materialized and are pulled at read time. The mixing service is the crux: at read time it merges the pre-materialized timeline with celebrities' latest posts, ranks them, then hydrates only the top-N (timelines store only ids; bodies are fetched from Tweet Store to save memory).

Key Technical Points

1. Push vs Pull vs Hybrid: merge at write time or read time

One-line trade-off: trade "write amplification at post time" for "read amplification at refresh time" — you can only save on one end, and celebrities force you to manage both.

Principle: Push (fanout-on-write) writes the post id into every follower's timeline at post time, read takes it out in one shot — read O(1), write O(followers). Pull (fanout-on-read) precomputes nothing; on refresh it queries all your followees' recent posts and merge-sorts them — write O(1), read O(followees × per-query). The choice depends on read:write ratio and fanout distribution: read-heavy (100:1) with most users having few followers → push amortizes cost onto the sparse writes, a win; but a celebrity with tens of millions of followers triggers tens of millions of writes per post, and push collapses. So industry is almost always Hybrid: push for normal users, pull for celebrities, merge at read time.

Push (write fanout)Pull (read fanout)Hybrid
Post costhigh (O followers)low (one write)normal high / celeb low
Refresh costlow (read one list)high (merge N sources)read 1 list + pull few celebs
New-post delayafter fanout donereal-timenormal lag / celeb real-time
Failure modecelebrity write blowupactive-user read blowupmerge logic complexity
Trade-off:
Real-world cases:

2. Fanout-on-write amplification and the celebrity problem

One-line trade-off: push converts read cost into write cost, and write cost explodes linearly with follower count — unsustainable at celebrity scale.

Principle: the fanout write count for one post = the author's follower count. A normal person triggers a few hundred cache writes, easily absorbed; but an account with 50M followers triggers 50M Redis writes per post — even at 100µs each, serial is hours, and it instantly saturates cache-cluster bandwidth. Worse is thundering-herd overlap: celebrities have many online followers at once, so fanout storm and read storm collide. The fix is a fanout threshold: accounts above it join a "celebrity list", their posts skip fanout and are pulled and merged into each reader's timeline at read time. The cost is one extra "pull celeb latest + merge" on the read path, but it avoids the write-side avalanche.

# Hybrid fanout: split by threshold at post time (pseudo-code)
FANOUT_THRESHOLD = 100_000  # followers above this -> read-time pull

def on_post(author_id, tweet_id, ts):
    tweet_store.put(tweet_id, ...)          # always write source of truth first
    n = social_graph.follower_count(author_id)
    if n > FANOUT_THRESHOLD:
        celeb_recent.zadd(author_id, ts, tweet_id)  # celeb latest, pulled at read
        return                                # no fanout
    # normal user: async fanout, batched pipeline, don't block the post
    for batch in chunks(social_graph.followers(author_id), 1000):
        pipe = redis.pipeline()
        for fid in batch:
            pipe.zadd(f"tl:{fid}", ts, tweet_id)
            pipe.zremrangebyrank(f"tl:{fid}", 0, -801)  # bounded: keep newest 800
        pipe.execute()
Trade-off:
Real-world cases:

3. Timeline storage: bounded list, store ids not bodies

One-line trade-off: trade "one extra hydrate to fetch bodies at read time" for "shrinking the timeline cache memory by one or two orders of magnitude".

Principle: the home timeline is an ordered structure in Redis (a sorted set / list scored by time), one per user, storing only tweet_ids, not bodies. Why: ① a body is a few KB; storing 100M users × 800 entries of full bodies is a TB-scale memory disaster, while 8-byte ids shrink it 100×; ② posts can be edited/deleted, and a body redundantly copied into every timeline would need to be re-fanned-out on update. At read time you take the top-N ids and batch-hydrate bodies from Tweet Store (one MGET), filtering out deleted/blocked ones. The timeline must also be bounded — keep only the newest few hundred, truncate older, or an active big account's followers' timelines grow unboundedly. Cold data (scrolling far back) falls back to pull.

# Read path: merge + hydrate (pseudo-code)
def get_home_timeline(uid, limit=20):
    ids = redis.zrevrange(f"tl:{uid}", 0, 400)        # ① pre-materialized part
    for cid in following_celebs(uid):                  # ② pull celeb live
        ids += celeb_recent.zrevrange(cid, 0, 50)
    ids = dedup(ids)
    ranked = rank(uid, ids)[:limit]                    # ③ rank, take top-N only
    tweets = tweet_store.mget(ranked)                  # ④ hydrate only top-N
    return [t for t in tweets if visible(uid, t)]      # ⑤ filter deleted/blocked
Trade-off:
Real-world cases:

4. Ranking Pipeline: from chronological to ML ranking

One-line trade-off: trade "engagement uplift" for "real-time, explainability, and compute budget" — the ranking model can only run on the small candidate set after aggregation.

Principle: early feeds were pure reverse-chronological — simple, real-time, predictable. After information overload, the shift was to ML ranking: after aggregating candidates (push-materialized + pull-celebs, a few hundred), a model predicts multiple engagement signals per post (like/comment/reshare/dwell probability), weighted into a score. This echoes Day 13's multi-stage funnel — the difference is feed candidates come from the follow graph not full-corpus retrieval, so the candidate set is far smaller (hundreds vs billions), allowing a fairly heavy ranking model directly. After ranking comes re-ranking to inject diversity (spread same-author, avoid a single-topic streak) and business rules (ad insertion, recency boosts).

Trade-off:
Real-world cases:

Scaling & Optimization

Common Pitfalls + Interview Questions

1. Agonizing over push vs pull from the start. The right answer is almost always hybrid. The interviewer wants the threshold thinking ("split by follower count/activity") and how to merge two sources at read time — not a binary choice.
2. Stuffing bodies into every timeline. Memory blowup + delete/edit needs full re-fanout. Timelines store only ids, hydrate at read.
3. Forgetting timelines must be bounded. Without truncating old entries, a big account's followers' single keys bloat unboundedly, and a Redis big-key drags down the whole shard.
4. Ranking at write time. Fixing order at push means new posts and model updates can't reflect live; re-ranking belongs on the read path.
5. Ignoring visibility filtering. If blocked/deleted/private posts aren't filtered at read, deleted content gets served — you must filter after hydrate.

Likely follow-ups: ① A celebrity with tens of millions of followers posts — how do you avoid blowing up the system? ② How do you set the push/pull threshold, on what metric? ③ When I newly follow someone, how do their historical posts enter my timeline (backfill)? ④ How do you budget the end-to-end post-to-visible latency? ⑤ Chronological vs ranked feed — what does each sacrifice, and which product should pick which?

Deep Resources

Deep Thinking (click to expand)

1. Push's write amplification grows linearly with follower count, pull's read amplification grows linearly with followee count. Can you write an inequality to estimate the "push or pull" crossover?

Modeling: for a single account, push is worthwhile roughly when "the number of times it's read ≫ the number of fanout writes its posts trigger" — the crossover is essentially the marginal benefit of a fanout write: when one fanout write saves less expected read cost than the write itself costs, switch to pull.

Intuition: a normal person posts rarely and is refreshed repeatedly → reads ≫ writes → push (write once, save countless reads). A celebrity's follower count both raises write cost and, because their posts get buried by ranking with low per-person exposure, lowers write benefit — squeezed from both ends, they land in the pull region. This is the theoretical basis for the hybrid threshold.

2. User A just followed big-V B (5M followers). Should B's historical posts appear in A's timeline immediately? How to backfill without stepping on landmines?

Crux: B is a celebrity served by pull and isn't pre-materialized — so "at follow time" you actually don't need to write historical posts into A's timeline; A's next refresh will auto-merge B's latest. This is a hidden benefit of hybrid: following a celebrity needs no backfill.

If B is a normal account (push): B's historical posts aren't in A's pre-materialized timeline. Options: pull B's recent posts and merge at read time on first refresh (simple, common), or async zadd them into A's timeline in the background. Pitfall: don't backfill inside the synchronous "follow" request — if A follows a batch at once (contacts import), synchronous backfill amplifies into a mini fanout storm; it must be async, rate-limited, and respect the bounded truncation.

3. Timelines store ids in a Redis sorted set. If a Redis shard goes down, that batch of users' timelines is gone. Is this data loss?

Key insight: the home timeline is derived data, not the source of truth — the real posts are in Tweet Store. Losing the timeline cache isn't "data loss", it's a "materialized view invalidation" that can be rebuilt.

Recovery: ① short-term degrade to pull — that shard's users refresh by live-merging their followees' latest, slow but usable, buying time for rebuild; ② background re-fanout rebuild, at the cost of a degraded read path + a write storm (rate-limit it). This also explains why timelines must be bounded — rebuild only needs to backfill the recent few hundred, not full history. Counter-lesson: if you'd made the timeline the only store (no Tweet Store), this would be true data loss — a derived layer must never be authoritative.

4. Switching from chronological to ML ranking, DAU engagement rose, but "why didn't I see friend X's post" complaints surged. How does this second-order effect arise? How to mitigate?

How it arises: under chronological, "follow = guaranteed delivery", users have a stable mental model. After switching to ranking, the model filters/downranks by predicted engagement, and low-interaction friends' posts get buried; users don't know ranking exists and just feel "I followed them but didn't see it". This is the inherent cost of a ranked feed: trading individual predictability and trust for aggregate engagement uplift.

Mitigation: ① give strong-tie/explicit subscriptions a guaranteed exposure slot; ② offer a "Latest" chronological view as a switchable tab (Twitter/Instagram both did this); ③ add diversity and coverage constraints in ranking so top-interaction accounts don't dominate; ④ clearly label "For You vs Latest". Fundamentally it's admitting offline engagement metrics ≠ user trust, needing ecosystem-side metrics as a backstop (echoes Day 13's echo-chamber discussion).

5. Estimate: 100M users, 800-entry bounded timelines each, 8 bytes per tweet_id. How much memory just for the timeline cache? What does this tell you architecturally?

Naive estimate: 1e8 users × 800 entries × 8 bytes ≈ 640 GB of raw id data alone. But a Redis sorted set element also stores a score (8-byte double) plus skiplist/dict pointer overhead, so realistically 50~100 bytes per element, bloating the total to several TB.

Architectural implications: ① one machine can't hold it → the timeline cache must be sharded (hash by uid, echoes Day 4); ② this is just ids — thankfully no bodies, or 800 × few KB × 1e8 = PB-scale, the hard constraint behind "store ids + hydrate"; ③ memory is too expensive → activity-aware materialization and tiering (hot in memory, cold to Cassandra) are necessities not optimizations; ④ it also explains why the bounded 800 matters — it directly sets the cache tier's total memory budget; double the cap, double the cost.