Design LeetCode
Like you should in an interview. Explained as simply as possible… but not simpler.
In this issue, I walk through the exact thinking I’d use in a system design interview out loud, step by step. Clear, practical, and including trade-offs you can defend.
What you’ll learn in ~15 minutes
How I would scope the problem without missing important requirements (supported languages, competition format, leaderboard freshness).
Why containers are a better fit than serverless for code execution
The pattern that makes leaderboards lightning-fast instead of crushing your database with expensive GROUP BY queries
How to implement container security that prevents malicious code from taking down your entire platform
How this issue is structured
I split the write-up into the same sections I’d narrate at a whiteboard. Free readers get the full walkthrough up to the deep-dive parts. Paid members get the 🔒 sections.
Initial Thoughts & Clarifying Questions
Functional Requirements
Non-Functional Requirements
Back-of-the-envelope Estimations
🔒 System Design (the architecture I’d draw and the Excalidraw link for it!)
🔒 Component Breakdown (why each piece exists + alternatives)
🔒 Trade-offs Made
🔒 Security & Privacy
🔒 Monitoring, Logging, and Alerting
🔒 Final Thoughts
Quick note: If you’ve been getting value from these and want the full deep dives, becoming a paid member helps me keep writing—and you’ll immediately unlock the 🔒 sections above, plus a few extras I lean on when I practice.
Members also get
12 Back-of-the-Envelope Calculations Every Engineer Should Know
My Excalidraw System Design Template — drop-in canvas you can copy and tweak.
My System Design Component Library
Let’s get to it!
Initial Thoughts & Clarifying Questions
To begin, I’d want to understand the scope and constraints we’re working with here. LeetCode is actually a more manageable system than many interview questions since it’s not dealing with billions of users, but it has some unique challenges around code execution that make it interesting.
Let me ask a few clarifying questions:
What’s our user scale? I’d assume roughly 500,000 registered users with maybe 50,000 daily active users during normal periods, but potentially 100,000 concurrent users during major competitions. These are working assumptions rather than published figures; the point is that LeetCode’s scale is relatively modest compared to social media platforms.
What types of problems are we supporting? I’m thinking we need to handle algorithmic problems with various input types - arrays, trees, graphs, etc. We probably have around 4,000 problems total, which is manageable from a data storage perspective.
What’s our competition model? I’d assume competitions run for about 90 minutes, feature 10 problems, and need real-time leaderboards. Scoring would be based on number of problems solved, with time as a tiebreaker.
What programming languages do we need to support? I’d expect at least Python, Java, C++, JavaScript, and maybe Go, Rust, and a few others. Each language needs proper execution environments.
What’s our performance expectation for code execution? I’d target sub-5-second response times for code submissions, including compilation, execution against test cases, and result generation.
How real-time does the leaderboard need to be? I’m thinking 5-10 second latency is acceptable - users don’t need instant updates, but they want to see their ranking relatively quickly during competitions.
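The scoring rule assumed above (number of problems solved, with time as the tiebreaker) can be sketched in a few lines. This is only an illustration: the `ContestEntry` type, its field names, and the sample users are hypothetical, and I’m assuming "time" means seconds from contest start to the last accepted submission.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContestEntry:
    """Hypothetical per-user contest result; field names are illustrative."""
    user_id: str
    problems_solved: int
    total_seconds: int  # assumed: seconds from contest start to last accepted submission


def rank_entries(entries):
    """Order entries by most problems solved, then by least time taken."""
    return sorted(entries, key=lambda e: (-e.problems_solved, e.total_seconds))


entries = [
    ContestEntry("alice", 3, 2400),
    ContestEntry("bob", 4, 5100),
    ContestEntry("carol", 3, 1800),
]
ranking = [e.user_id for e in rank_entries(entries)]
print(ranking)  # ['bob', 'carol', 'alice']
```

Bob ranks first despite being slowest because problems solved dominates; Carol beats Alice only on the time tiebreaker.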
Functional Requirements
From what I understand, the core requirements are:
Problem browsing and viewing - Users need to see a paginated list of problems and view individual problem statements with code stubs in their preferred language
Code submission and execution - Users submit solutions that get executed against hidden test cases with pass/fail feedback
Competition leaderboards - Real-time ranking during timed competitions based on problems solved and completion time
Multi-language support - Same logical problem should work across different programming languages
I’m deliberately keeping user authentication, profiles, and payment processing out of scope since those are standard components that don’t add much to the interesting system design aspects.
Non-Functional Requirements
I’d expect this system to handle:
Availability over consistency - It’s better to show slightly stale leaderboard data than to have the system go down. Users can tolerate eventual consistency for rankings.
Secure code execution - This is critical since we’re running arbitrary user code. We need complete isolation to prevent malicious code from affecting our infrastructure.
Sub-5-second submission feedback - Users expect quick feedback on their solutions.
Competition scale of 100,000 concurrent users - During major contests, we need to handle significant traffic spikes without degrading performance.
High availability during competitions - A system outage during a major competition would be catastrophic.
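To make "availability over consistency" concrete, here is a minimal sketch of a read path that serves a cached leaderboard snapshot for a few seconds and keeps serving the last snapshot if a refresh fails. The class name, the `loader` callback, and the 5-second TTL are my assumptions for illustration, not a prescribed implementation.

```python
import time


class StaleTolerantCache:
    """Serve a cached value for ttl_seconds; fall back to stale data if a refresh fails."""

    def __init__(self, loader, ttl_seconds=5.0, clock=time.monotonic):
        self._loader = loader      # hypothetical callback that recomputes the leaderboard
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the behaviour can be checked without sleeping
        self._value = None
        self._loaded_at = None

    def get(self):
        now = self._clock()
        if self._loaded_at is not None and now - self._loaded_at < self._ttl:
            return self._value     # still fresh enough: no backend call
        try:
            self._value = self._loader()
            self._loaded_at = now
        except Exception:
            # Availability over consistency: keep serving the last snapshot.
            if self._loaded_at is None:
                raise              # nothing cached yet, so nothing to fall back to
        return self._value


calls = []


def load_leaderboard():
    calls.append(1)
    return ["bob", "carol", "alice"]


fake_now = [0.0]
cache = StaleTolerantCache(load_leaderboard, ttl_seconds=5.0, clock=lambda: fake_now[0])
cache.get()          # first read loads from the backend
fake_now[0] = 3.0
cache.get()          # within the TTL: served from cache
fake_now[0] = 6.0
cache.get()          # TTL expired: reloads
print(len(calls))    # 2
```

With a TTL in the 5-10 second range, the backend recomputes the ranking a handful of times per minute no matter how many users are polling.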
Back-of-the-envelope Estimations
Let’s walk through the numbers to understand our scale:
Daily active users: 50,000 on normal days, 100,000 during competitions
Submission patterns: Let’s say each active user makes 20 submissions per day on average. During competitions, this could spike to 50 submissions per user over 90 minutes.
Normal day: 50,000 × 20 = 1M submissions/day = ~12 submissions/second
Competition spike: 100,000 × 50 = 5M submissions over 90 minutes (5,400 seconds) = ~925 submissions/second
Read traffic for problems: Users browse problems frequently. Maybe 10 problem views per active user per day.
Normal: 50,000 × 10 = 500K problem views/day = ~6 views/second
Competition: Could be 2-3x higher as users read competition problems
Leaderboard requests: During competitions, assume every active user checks the leaderboard once every 30 seconds.
Competition: 100,000 users × 2 requests/minute = ~3,300 requests/second for leaderboard
Storage requirements:
4,000 problems × 50KB average (including all language stubs, test cases) = 200MB for problems
1M submissions/day × 5KB average = 5GB/day submission data
Over 5 years: 5GB/day × 365 days × 5 years = ~9TB total storage (very manageable)
Bandwidth estimates:
Code submissions: 925 peak submissions/second × 5KB = ~4.6MB/second upload
Leaderboard responses: 3,300 requests/second × 2KB = ~6.6MB/second download
Problem content: 6 views/second × 50KB = 300KB/second
The numbers show this is quite a manageable system from a scale perspective; the complexity comes from the secure code execution requirements.
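If you want to sanity-check the arithmetic above, it fits in a short script. The inputs are the same assumptions used in this section; the variable names are mine, and I’m using decimal units (1MB = 1,000KB) to match the rounding in the estimates.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
CONTEST_SECONDS = 90 * 60        # 5,400

normal_dau = 50_000
contest_users = 100_000

# Requests per second
normal_submissions_per_day = normal_dau * 20
normal_submit_rps = normal_submissions_per_day / SECONDS_PER_DAY
contest_submit_rps = contest_users * 50 / CONTEST_SECONDS
problem_view_rps = normal_dau * 10 / SECONDS_PER_DAY
leaderboard_rps = contest_users * 2 / 60          # 2 requests per user per minute

# Storage (decimal units: 1,000KB = 1MB, 1,000,000KB = 1GB)
problem_storage_mb = 4_000 * 50 / 1_000
submission_gb_per_day = normal_submissions_per_day * 5 / 1_000_000
five_year_tb = submission_gb_per_day * 365 * 5 / 1_000

# Peak bandwidth in MB/second
upload_mb_s = contest_submit_rps * 5 / 1_000
leaderboard_mb_s = leaderboard_rps * 2 / 1_000

print(f"normal submissions/s:  {normal_submit_rps:.1f}")
print(f"contest submissions/s: {contest_submit_rps:.1f}")
print(f"problem views/s:       {problem_view_rps:.1f}")
print(f"leaderboard req/s:     {leaderboard_rps:.0f}")
print(f"problem storage (MB):  {problem_storage_mb:.0f}")
print(f"5-year storage (TB):   {five_year_tb:.2f}")
print(f"peak upload (MB/s):    {upload_mb_s:.1f}")
print(f"leaderboard (MB/s):    {leaderboard_mb_s:.1f}")
```

The exact values differ slightly from the rounded ones in the text (for example, the leaderboard rate is closer to 3,333 requests/second than 3,300), which is fine for a back-of-the-envelope estimate.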
🔒 System Design
Here’s how I’d think about the system design: