Design Facebook's News Feed
Like you should in an interview. Explained as simply as possible… but not simpler.
In this issue, I walk through, out loud and step by step, the exact thinking I’d use in a system design interview. It's clear, practical, and includes trade-offs you can defend.
What you’ll learn in ~15 minutes
How I would scope the problem without missing important requirements.
Fan-Out Patterns - Push, pull, and hybrid models for feed generation
Hot Key/Hot Shard Problems - How viral content can overwhelm systems and solutions to distribute load
Global Secondary Indexes (GSI) - Database design patterns for supporting multiple query types efficiently
How this issue is structured
I split the write-up into the same sections I’d narrate at a whiteboard. Free readers get the full walkthrough up to the deep-dive parts. Paid members get the 🔒 sections.
Initial Thoughts & Clarifying Questions
Functional Requirements
Non-Functional Requirements
Back-of-the-envelope Estimations (QPS, storage, bandwidth, cardinality math)
🔒 System Design (the architecture I’d draw and the Excalidraw link for it!)
🔒 Component Breakdown (why each piece exists + alternatives)
🔒 Trade-offs Made
🔒 Security & Privacy
🔒 Monitoring, Logging, and Alerting
🔒 Handling the Fan-Out Problem
Let’s get to it!
Initial Thoughts & Clarifying Questions
To begin, I'd want to understand the exact scope and constraints we're working with here. The newsfeed problem can go in many different directions, so let me ask some clarifying questions:
1. Are we building a Facebook-style bidirectional friendship system or a Twitter-style unidirectional follow system? Based on the context, I'm assuming we're building a unidirectional follow system where users can follow others without requiring mutual acceptance. This simplifies our data model and is closer to modern social platforms.
2. What type of content are we supporting in posts? I'll assume we're starting with text-based posts for now. Supporting multimedia would add complexity around CDN distribution and storage that we can address later if needed.
3. Are we implementing any feed ranking algorithms, or should this be purely chronological? For this design, I'm assuming we want a chronological feed ordered by post creation time. ML-based ranking systems would require a completely different architecture focused on feature extraction and model serving.
4. What's our target scale - how many users and posts per day are we expecting? Let me assume we're designing for a large-scale system with around 2 billion users globally. This will drive many of our architectural decisions.
5. Do we need real-time feed updates, or is some delay acceptable? I'm assuming that eventual consistency is acceptable - when someone posts, it doesn't need to appear instantly in all followers' feeds, but should appear within a reasonable timeframe, say 1-2 minutes.
6. Should we support feed pagination for infinite scroll? Yes, I'll assume we need cursor-based pagination to support mobile app infinite scroll experiences.
7. Are there any specific latency requirements for feed loading? I'll target sub-500ms response times for feed requests to ensure a responsive user experience.
Functional Requirements
From what I understand, the core requirements are:
Create Posts: Users must be able to publish text-based posts to their timeline
Follow Users: Users can follow other users unidirectionally (no mutual acceptance required)
View Personalized Feed: Users can view a chronological feed of posts from accounts they follow
Paginate Feed: Support infinite scroll through historical posts with cursor-based pagination
I'm keeping this focused on these core features. Additional functionality like likes, comments, or privacy settings would be great extensions, but I want to nail the fundamental newsfeed mechanics first. In a real interview, I'd mark those as "below the line" features we could discuss if time permits.
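To make the cursor-based pagination requirement concrete, here is a minimal sketch of how I'd explain it at the whiteboard. The names (`Post`, `encode_cursor`, `get_feed_page`) and the in-memory list are my own placeholders, not a real API. The idea is that the cursor encodes the last post the client saw, and the next page is everything strictly older than it.

```python
import base64
import json
from typing import NamedTuple, Optional


class Post(NamedTuple):
    post_id: int
    created_at: int  # Unix seconds (assumed representation)
    text: str


def encode_cursor(post: Post) -> str:
    """Pack the last-seen (created_at, post_id) pair into an opaque token."""
    raw = json.dumps([post.created_at, post.post_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> tuple:
    created_at, post_id = json.loads(base64.urlsafe_b64decode(cursor))
    return created_at, post_id


def get_feed_page(posts: list, limit: int, cursor: Optional[str] = None):
    """Return (page, next_cursor), newest first.

    `posts` stands in for whatever store actually holds the user's feed.
    """
    ordered = sorted(posts, key=lambda p: (p.created_at, p.post_id), reverse=True)
    if cursor is not None:
        boundary = decode_cursor(cursor)
        # Keep only posts strictly older than the last one the client saw.
        ordered = [p for p in ordered if (p.created_at, p.post_id) < boundary]
    page = ordered[:limit]
    # Only hand out a cursor when there is something left to fetch.
    next_cursor = encode_cursor(page[-1]) if len(ordered) > limit else None
    return page, next_cursor
```

Using (created_at, post_id) rather than an offset means new posts arriving at the top of the feed don't shift the pages a user is already scrolling through, and the post_id breaks ties between posts created in the same second.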
Non-Functional Requirements
I'd expect this system to handle several key non-functional requirements:
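Consistency Model: We can leverage eventual consistency here. Users don't expect posts to appear instantaneously in their feed, but they should appear quickly. Let's target delivery within 1 minute for 99% of posts, with the remaining tail landing inside the 1-2 minute window I assumed in the clarifying questions.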
Latency: For a responsive user experience, I'd aim for sub-500ms latency for both posting new content and loading feeds. Users get impatient with slow social media experiences.
Scale: Given we're targeting Facebook-scale, I'm assuming around 2 billion registered users globally. This drives our need for horizontal scaling and distributed systems approaches.
Availability: Social media platforms need high availability - I'd target 99.9% uptime. Users expect the service to always be accessible.
Read-Heavy Workload: This will be an extremely read-heavy system. I'd estimate a 100:1 or even 1000:1 read-to-write ratio, as users consume far more content than they create.
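It helps to turn the availability target into a downtime budget, since that's the number on-call engineers actually live with. A quick sketch of the arithmetic (the function name is just mine):

```python
def downtime_budget_minutes(availability: float, period_hours: float) -> float:
    """Minutes of allowed downtime in a period at a given availability."""
    return (1.0 - availability) * period_hours * 60


per_year = downtime_budget_minutes(0.999, 365 * 24)
per_30_days = downtime_budget_minutes(0.999, 30 * 24)

print(f"99.9% over a year:  {per_year / 60:.2f} hours")
print(f"99.9% over 30 days: {per_30_days:.1f} minutes")
```

So 99.9% leaves roughly 8.76 hours of downtime per year, or about 43 minutes in a 30-day month. If an interviewer pushes for a tighter target, each extra nine cuts that budget by 10x.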
Back-of-the-envelope Estimations
Let me work through some capacity planning numbers to size this system properly.
Starting with our user base: Let's say we have 2 billion registered users, with about 500 million daily active users. Of those DAU, maybe 50 million are actively posting each day.
Write Load Calculations:
50M daily posters
Average 2 posts per active poster per day
Total: 100M posts per day
Posts per second: 100M / (24 * 3600) ≈ 1,200 posts/second
Peak load (assuming 3x average): ~3,600 posts/second
Read Load Calculations: Each user might check their feed 10 times per day on average:
500M DAU × 10 feed requests = 5B feed requests per day
Feed requests per second: 5B / (24 * 3600) ≈ 58,000 requests/second
Peak load (same 3x assumption): ~175,000 feed requests/second
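Here is the same write and read load arithmetic as a small script, so the inputs are easy to tweak if the interviewer changes an assumption. The 3x peak factor is the one assumed for writes above, and I'm reusing it for reads.

```python
SECONDS_PER_DAY = 24 * 3600
PEAK_FACTOR = 3  # assumption from the write-load estimate, reused for reads


def per_second(events_per_day: float) -> float:
    return events_per_day / SECONDS_PER_DAY


posts_per_day = 50_000_000 * 2            # 50M posters x 2 posts each
feed_requests_per_day = 500_000_000 * 10  # 500M DAU x 10 feed checks each

write_qps = per_second(posts_per_day)
read_qps = per_second(feed_requests_per_day)

print(f"writes: {write_qps:,.0f}/s avg, {write_qps * PEAK_FACTOR:,.0f}/s peak")
print(f"reads:  {read_qps:,.0f}/s avg, {read_qps * PEAK_FACTOR:,.0f}/s peak")
print(f"feed requests per post written: {read_qps / write_qps:.0f}")
```

The exact averages are about 1,157 writes/s and 57,870 reads/s. The figures above are those values rounded up before applying the peak factor, which is the right direction to round for capacity planning. At the request level this works out to about 50 feed requests per post written.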
Storage Requirements:
Average post size: ~500 bytes (including metadata)
Daily posts: 100M × 500 bytes = 50GB per day
Annual storage: 50GB × 365 ≈ 18TB per year
With 5-year retention: ~90TB for post content
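The post storage numbers as a quick check, using decimal units (1 GB = 10^9 bytes) and ignoring replication and indexes, which this estimate doesn't account for:

```python
POSTS_PER_DAY = 100_000_000
BYTES_PER_POST = 500
RETENTION_YEARS = 5
GB, TB = 10**9, 10**12

daily_bytes = POSTS_PER_DAY * BYTES_PER_POST
annual_bytes = daily_bytes * 365
retained_bytes = annual_bytes * RETENTION_YEARS

print(f"daily:  {daily_bytes / GB:.0f} GB")
print(f"annual: {annual_bytes / TB:.2f} TB")
print(f"{RETENTION_YEARS}-year: {retained_bytes / TB:.2f} TB")
```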
Follow Relationships:
Average follows per user: Let's say 200
Total follow relationships: 2B users × 200 = 400B relationships
Storage per relationship: ~20 bytes (follower_id, followed_id, timestamp)
Total follow storage: 400B × 20 bytes = 8TB
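One way to see where ~20 bytes per relationship could come from is to write out a packed record. This is only a sketch under my own assumptions: 64-bit user IDs and a 32-bit epoch-seconds timestamp. A real store adds per-row overhead and indexes on top of this.

```python
import struct

# Assumed layout: follower_id (uint64), followed_id (uint64),
# created_at (uint32 epoch seconds). "<" means little-endian, no padding.
FOLLOW_EDGE = struct.Struct("<QQI")


def pack_edge(follower_id: int, followed_id: int, created_at: int) -> bytes:
    return FOLLOW_EDGE.pack(follower_id, followed_id, created_at)


def unpack_edge(record: bytes) -> tuple:
    return FOLLOW_EDGE.unpack(record)


USERS = 2_000_000_000
AVG_FOLLOWS = 200
total_bytes = USERS * AVG_FOLLOWS * FOLLOW_EDGE.size

print(f"bytes per edge: {FOLLOW_EDGE.size}")
print(f"total: {total_bytes / 10**12:.1f} TB")
```

If I wanted a 64-bit timestamp instead, the record grows to 24 bytes and the total to about 9.6TB, which doesn't change any architectural decision.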
Feed Precomputation Storage: If we precompute feeds for faster access:
Store latest 200 posts per user feed
2B users × 200 posts × 8 bytes (post_id) = 3.2TB
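To illustrate what "store the latest 200 posts per user feed" means in practice, here is a toy in-memory version. `FeedCache` is a hypothetical name, and a production system would keep this in a distributed cache rather than a Python dict. The point is only the capped, newest-first list of post IDs.

```python
from collections import deque
from itertools import islice

FEED_CAPACITY = 200    # latest N post IDs kept per user
BYTES_PER_POST_ID = 8  # assumed 64-bit post IDs


class FeedCache:
    """Per-user capped list of post IDs, newest first."""

    def __init__(self, capacity: int = FEED_CAPACITY):
        self._capacity = capacity
        self._feeds = {}

    def push(self, user_id: int, post_id: int) -> None:
        feed = self._feeds.setdefault(user_id, deque(maxlen=self._capacity))
        # With maxlen set, appending on the left evicts the oldest ID
        # from the right once the feed is full.
        feed.appendleft(post_id)

    def latest(self, user_id: int, n: int) -> list:
        return list(islice(self._feeds.get(user_id, ()), n))


estimated_bytes = 2_000_000_000 * FEED_CAPACITY * BYTES_PER_POST_ID
print(f"feed cache estimate: {estimated_bytes / 10**12:.1f} TB")
```

Because the cache holds only post IDs, the feed read path still needs a second lookup to hydrate post content. That's the trade I'm making to keep this at ~3.2TB instead of storing full posts per follower.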
Bandwidth:
Average feed contains 25 posts
Each post + metadata ≈ 1KB when serialized
Feed response size: 25KB
Peak feed bandwidth: 175K requests/sec × 25KB = ~4.4 GB/second
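The bandwidth estimate as a quick check. I've also converted to bits per second, since that's how network links are usually quoted:

```python
POSTS_PER_FEED = 25
BYTES_PER_POST = 1_000             # ~1KB serialized, per the estimate above
PEAK_FEED_REQUESTS_PER_SEC = 175_000

feed_response_bytes = POSTS_PER_FEED * BYTES_PER_POST
peak_bytes_per_sec = PEAK_FEED_REQUESTS_PER_SEC * feed_response_bytes
peak_gbit_per_sec = peak_bytes_per_sec * 8 / 10**9

print(f"feed response: {feed_response_bytes / 1_000:.0f} KB")
print(f"peak egress:   {peak_bytes_per_sec / 10**9:.3f} GB/s")
print(f"               {peak_gbit_per_sec:.0f} Gbit/s")
```

That's about 35 Gbit/s of feed egress at peak before compression. It's one more reason to spread the load across regions.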
These numbers tell me we definitely need a distributed, horizontally scalable architecture with caching layers.
🔒 System Design
I'd start with a simple design and then elaborate on the scaling challenges. Let me sketch out the high-level architecture: