System Design, but Simple

Design Uber

Like you should in an interview. Explained as simply as possible… but not simpler.

Sep 15, 2025

In this issue, I walk through, step by step, the exact thinking I’d narrate out loud in a system design interview. Clear, practical, and with trade-offs you can defend.

What you’ll learn in ~15 minutes

  • How I would scope the problem without missing important requirements (vehicle types, real-time constraints, geographic scope)

  • A comparison of geohashing and quadtrees for high-frequency location updates (crucial for any proximity-based system design)

  • The distributed locking pattern that prevents double-booking drivers

  • Where to be strict about ACID properties and where eventual consistency is acceptable

How this issue is structured
I split the write-up into the same sections I’d narrate at a whiteboard. Free readers get the full walkthrough up to the deep-dive parts. Paid members get the 🔒 sections.

  • Initial Thoughts & Clarifying Questions

  • Functional Requirements

  • Non-Functional Requirements

  • Back-of-the-envelope Estimations (QPS, storage, bandwidth, cardinality math)

  • 🔒 System Design (the architecture I’d draw and the Excalidraw link for it!)

  • 🔒 Component Breakdown (why each piece exists + alternatives)

  • 🔒 Trade-offs Made

  • 🔒 Deep Dives (Location Service Architecture, Consistency in Matching)

  • 🔒 Security & Privacy

  • 🔒 Monitoring, Logging, and Alerting

Quick note: If you’ve been getting value from these and want the full deep dives, becoming a paid member helps me keep writing—and you’ll immediately unlock the 🔒 sections above, plus a few extras I lean on when I practice.

Members also get

  • 12 Back-of-the-Envelope Calculations Every Engineer Should Know

  • My Excalidraw System Design Template — drop-in canvas you can copy and tweak.

  • My System Design Component Library

Let’s get to it!


Initial Thoughts & Clarifying Questions

To begin, I'd want to understand the scope and constraints of what we're building here. Let me ask a few clarifying questions to make sure I'm designing the right system:

What's the core functionality we need to support? I'm assuming we need ride matching between drivers and passengers, fare estimation, and real-time location tracking. Based on typical Uber usage, I'll focus on these core features.

What's our scale? I'm thinking we're looking at millions of daily active users globally, with hundreds of thousands of concurrent rides during peak hours. This suggests we need a highly scalable, distributed system.

What about different vehicle types? For this interview, I'll scope this down to just one vehicle type - let's say UberX - to keep things focused. We can always discuss how to extend this later.

Real-time requirements? I'm assuming we need sub-minute matching times and real-time location updates. Users expect to be matched with a driver quickly, and drivers need accurate navigation.

Geographic scope? I'll design for a global system with regional deployments, but focus on one region for the core design.

From what I understand, we're building a ride-sharing platform where users can request rides, get fare estimates, and be matched with nearby drivers in real-time.

Functional Requirements

From what I understand, the core requirements are:

  1. Fare estimation: Users should be able to input a start location and destination and receive an estimated fare and ETA

  2. Ride requesting: Users should be able to request a ride based on an estimate and be matched with a nearby available driver in real-time

  3. Driver operations: Drivers should be able to accept/deny ride requests and navigate to pickup and drop-off locations

I'm putting several features out of scope to stay focused: multiple car types, driver/rider ratings, scheduling rides in advance, and payment processing. These are important but secondary to the core matching functionality.
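
To make these more concrete, here's a rough sketch of how the three requirements might map onto request/response shapes. This is purely illustrative - the type and field names below are my assumptions, not a spec:

```python
from dataclasses import dataclass

# Hypothetical request/response shapes for the three core operations.
# All names are illustrative assumptions, not a real Uber API.


@dataclass
class FareEstimateRequest:      # 1. Fare estimation
    rider_id: str
    start_lat: float
    start_lng: float
    dest_lat: float
    dest_lng: float


@dataclass
class FareEstimateResponse:
    estimate_id: str            # referenced when the rider commits
    fare_cents: int
    eta_seconds: int


@dataclass
class RideRequest:              # 2. Ride requesting
    rider_id: str
    estimate_id: str            # ties the ride to the quoted fare


@dataclass
class DriverDecision:           # 3. Driver operations
    driver_id: str
    ride_id: str
    accepted: bool              # accept or deny the offered ride
```

Tying the ride request back to an `estimate_id` is deliberate: the fare the rider saw is the fare they commit to.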

Non-Functional Requirements

I'd expect this system to handle several critical non-functional requirements:

Low latency matching: We need to match riders with drivers in under one minute, or return a "no drivers available" message. Users won't wait around forever, and quick matching is core to the user experience.

Consistency of matching: This is crucial - we need to ensure any given ride is only ever matched to one driver, and no driver gets multiple ride requests simultaneously. This is a classic consistency requirement where we can't afford duplicate bookings.
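
The textbook way to enforce this is a short-lived distributed lock on the driver while a match offer is in flight. Here's a minimal sketch, assuming Redis and the redis-py client - the key names, TTL, and helper names are all illustrative:

```python
import uuid

import redis

r = redis.Redis()

LOCK_TTL_MS = 30_000  # auto-expires so a crashed matcher can't strand a driver


def try_lock_driver(driver_id: str) -> str | None:
    """Try to reserve a driver for exactly one ride offer."""
    token = uuid.uuid4().hex
    # SET NX PX is atomic: only one matcher can create the key, and the
    # TTL guarantees release even if we crash before finishing the match.
    if r.set(f"driver_lock:{driver_id}", token, nx=True, px=LOCK_TTL_MS):
        return token  # we own the lock; safe to send the ride offer
    return None  # another ride request already holds this driver


def release_driver(driver_id: str, token: str) -> None:
    """Release the lock only if we still own it (compare-and-delete)."""
    # A Lua script makes the get-compare-delete sequence atomic.
    script = """
    if redis.call('get', KEYS[1]) == ARGV[1] then
        return redis.call('del', KEYS[1])
    end
    return 0
    """
    r.eval(script, 1, f"driver_lock:{driver_id}", token)
```

The per-matcher token plus compare-and-delete is what prevents matcher A from releasing a lock that has already expired and been re-acquired by matcher B.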

High availability outside of matching: The system should be available 24/7 for fare estimates, driver location updates, and general operations. Only the actual matching process might occasionally fail due to consistency requirements.

High throughput during surges: We need to handle massive spikes during big events - potentially hundreds of thousands of requests within a region in a short time window.

Back-of-the-envelope Estimations

Let me work through some numbers to understand the scale we're dealing with. I know that back-of-envelope calculations should only be done when they directly influence design decisions, and here they will - particularly for our location service.

Active drivers and location updates: Let's say Uber has about 6 million drivers globally, with roughly 3 million active at any given time. If we update driver locations every 5 seconds (which we need for accurate matching), that's:

3 million drivers × 1 update every 5 seconds = 600,000 location updates per second

This is a massive write load that will definitely influence our database choice.

Daily ride requests: With millions of daily active users, let's estimate 50 million ride requests globally per day. Spread evenly, that's roughly 580 requests per second - but traffic isn't even, so assume peak hours run at about 5x the average:

  • Peak (5x average): ~2,900 requests per second

  • During major events or surges (10x average): ~5,800+ requests per second

Storage for ride data: Each ride record might be ~2KB (with all metadata). At 50M rides/day:

  • Daily: 100 GB of new ride data

  • Annual: ~36 TB of ride data

Location data storage: 600K updates/second × 100 bytes per update = 60 MB/second of location data. However, we only need current locations, so storage is manageable - maybe 300 MB total for 3M driver locations.
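
As a sanity check, here's the same arithmetic as a small runnable script. Every input is one of the assumptions above, not measured data:

```python
# Back-of-the-envelope sanity check; inputs are the assumptions above.
ACTIVE_DRIVERS = 3_000_000
UPDATE_INTERVAL_S = 5
RIDES_PER_DAY = 50_000_000
RIDE_RECORD_BYTES = 2_000      # ~2 KB per ride record
LOCATION_UPDATE_BYTES = 100
PEAK_MULTIPLIER = 5

location_writes_per_s = ACTIVE_DRIVERS / UPDATE_INTERVAL_S        # 600,000
avg_ride_rps = RIDES_PER_DAY / 86_400                             # ~580
peak_ride_rps = avg_ride_rps * PEAK_MULTIPLIER                    # ~2,900
daily_ride_storage_gb = RIDES_PER_DAY * RIDE_RECORD_BYTES / 1e9   # 100 GB
annual_ride_storage_tb = daily_ride_storage_gb * 365 / 1_000      # ~36 TB
location_bandwidth_mb_s = (
    location_writes_per_s * LOCATION_UPDATE_BYTES / 1e6           # 60 MB/s
)
current_location_storage_mb = (
    ACTIVE_DRIVERS * LOCATION_UPDATE_BYTES / 1e6                  # 300 MB
)

for name, value in [
    ("location writes/s", location_writes_per_s),
    ("avg ride rps", avg_ride_rps),
    ("peak ride rps", peak_ride_rps),
    ("ride storage GB/day", daily_ride_storage_gb),
    ("ride storage TB/year", annual_ride_storage_tb),
    ("location MB/s", location_bandwidth_mb_s),
    ("current locations MB", current_location_storage_mb),
]:
    print(f"{name:>22}: {value:,.0f}")
```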

Our location service needs to handle 600K writes per second while providing sub-second proximity queries. This definitely rules out traditional relational databases and points us toward specialized solutions.
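
To give a flavor of what a specialized solution looks like: one common choice is a geo-indexed in-memory store. Here's a minimal sketch using Redis geo commands, assuming Redis 6.2+ and the redis-py client - the key name and the 5 km radius are illustrative:

```python
import redis

r = redis.Redis()


def update_driver_location(driver_id: str, lng: float, lat: float) -> None:
    # GEOADD keeps one geo-indexed entry per member, so re-sending a
    # driver's location overwrites the old point - storage stays
    # proportional to the number of active drivers, as estimated above.
    r.geoadd("driver_locations", (lng, lat, driver_id))


def nearby_drivers(lng: float, lat: float, radius_km: float = 5.0) -> list:
    # GEOSEARCH (Redis 6.2+) returns members within the radius,
    # nearest first - exactly the proximity query matching needs.
    return r.geosearch(
        "driver_locations",
        longitude=lng,
        latitude=lat,
        radius=radius_km,
        unit="km",
        sort="ASC",
    )
```

Whether this, geohashing at the application layer, or a quadtree-based index wins is the trade-off the 🔒 Location Service deep dive works through.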

🔒 System Design

Here’s the system design I am thinking of:

This post is for paid subscribers
