Day 14 — Ride-Sharing System Design Capstone

System Requirements

🎯 Functional Requirements

Riders request trips from location A → B
Real-time driver location tracking (GPS)
Nearest driver matching within radius
Fare estimation before confirmation
Surge pricing based on supply/demand
Trip lifecycle: request → match → pickup → dropoff → payment
Driver earnings, rider history

📊 Scale Targets

5M+ daily trips (Uber: ~8M)
2M+ active drivers pushing GPS every 4s
500K concurrent trips peak
P99 match latency < 2 seconds
Location updates: ~500K writes/sec
99.99% uptime (4.4 min/year downtime)
Multi-region: 70+ countries

⚡ Non-Functional

Low-latency geospatial queries
Strong consistency for payments
Eventual consistency for location
Horizontal scaling per city region
Graceful degradation on failures

🚫 Out of Scope

Route optimization (maps API)
Driver onboarding / KYC
ML fraud detection
Multi-stop rides
Scheduled rides

Core Design Concepts

📍 Geospatial Indexing

Redis GEOADD

Redis Geo uses Geohash encoding to store lat/lng as a sorted set score. GEORADIUS queries return nearby drivers in O(N+log M) time. Each city shard holds ~50K drivers in its geo index.

Geohash cell ~150m at precision 6
GEODIST for exact distance calc
GEO index per city cluster

🔌 Real-Time: WebSocket vs Polling

WebSocket

Drivers maintain persistent WebSocket connections to push GPS every 4 seconds. Riders get WebSocket updates on driver ETA. HTTP polling at 4s would require 2M × 60 req/min = 120M req/min overhead.

WebSocket gateway per region
Heartbeat every 30s for liveness
Reconnect with exponential backoff

⚡ Matching Engine

GEORADIUS

On rider request: query Redis for drivers within 5km radius, filter by availability, rank by ETA (distance / avg_speed), send offer to top N drivers, first accept wins. If no match in 30s, expand radius.

Offer timeout: 15s per driver
Radius expansion: 1→2→5→10km
Parallel offers to top 3 drivers

💰 Surge Pricing

Dynamic Pricing

Surge multiplier = demand / supply in a hex cell. Uber uses H3 hexagonal grid (Uber H3 library). If riders/drivers ratio > 2, surge kicks in. Surge cells communicated to driver app in real-time.

H3 resolution 8 (~0.5km² cells)
Surge computed every 60s
Max surge: 5× (regulatory)

🗺️ Live Driver-Rider Matching Simulation

Blue dots = available drivers. Click "Add Rider" then click the canvas to place a rider (green). Hit "Find Match" to watch the nearest driver animate toward the pickup point.

8

Available Drivers

—

Match Distance

—

ETA (min)

1.0×

Surge Multiplier

—

Est. Fare

Simulation ready. Add a rider to begin.

💰 Surge Pricing Calculator

Riders requesting: 20

Drivers available: 30

Base fare ($): 12

0.67

Demand/Supply Ratio

1.0×

Surge Multiplier

$12

Final Fare

Normal

Status

System Architecture

Rider App

→

API Gateway

→

Trip Service

→

Matching Engine

→

Redis Geo

Driver App

→

WebSocket GW

→

Location Service

→

Redis Geo

+

Kafka

Surge Service

←

Kafka

→

Payment Service

→

PostgreSQL

📍 Location Service

Receives GPS pings via WebSocket. Writes to Redis Geo (TTL 30s) and publishes to Kafka location topic. Horizontally scaled by city shard.

🎯 Matching Engine

Stateless workers consuming trip request events from Kafka. Queries Redis Geo → ranks candidates → publishes offers via WebSocket gateway back to driver apps.

🚗 Trip Service

Manages trip state machine: REQUESTED → MATCHED → IN_PROGRESS → COMPLETED. Uses PostgreSQL with optimistic locking for concurrent accept races.

💰 Surge Service

Aggregates supply/demand counts per H3 hexagon cell every 60s using sliding window from Kafka. Computes multiplier and broadcasts to driver apps.

🔌 WebSocket Gateway

Maintains persistent connections to 2M+ drivers. Uses sticky sessions (IP hash). Connection metadata (driver→server mapping) stored in Redis for cross-server messaging.

💳 Payment Service

Charges rider on trip completion. Uses Saga pattern: charge → transfer earnings → update ledger. Stripe/Braintree for payment processing. Idempotency keys for retries.

Data Model

🗄️ PostgreSQL — Trips Table

CREATE TABLE trips (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  rider_id    UUID NOT NULL REFERENCES users(id),
  driver_id   UUID REFERENCES drivers(id),
  status      TEXT NOT NULL DEFAULT 'REQUESTED',
  -- REQUESTED | MATCHED | IN_PROGRESS | COMPLETED | CANCELLED
  pickup_lat  DECIMAL(10,7) NOT NULL,
  pickup_lng  DECIMAL(10,7) NOT NULL,
  dropoff_lat DECIMAL(10,7),
  dropoff_lng DECIMAL(10,7),
  surge_mult  DECIMAL(3,2) DEFAULT 1.00,
  base_fare   DECIMAL(10,2),
  final_fare  DECIMAL(10,2),
  started_at  TIMESTAMPTZ,
  ended_at    TIMESTAMPTZ,
  created_at  TIMESTAMPTZ DEFAULT NOW()
);
-- Index for driver active trip lookup
CREATE INDEX idx_trips_driver_active
  ON trips(driver_id, status)
  WHERE status IN ('MATCHED','IN_PROGRESS');

🔴 Redis — Location + Geo Index

# Driver location index per city
GEOADD nyc:drivers:available \
  -73.9857 40.7484 "driver:abc123"

# Find drivers within 5km of rider
GEORADIUS nyc:drivers:available \
  -73.9921 40.7527 5 km \
  WITHCOORD WITHDIST COUNT 20 ASC

# Driver session info (TTL = 30s)
HSET driver:abc123:session \
  status "available" \
  vehicle "UberX" \
  rating "4.87"
EXPIRE driver:abc123:session 30

# Surge multipliers by H3 cell
SET surge:nyc:8928308280fffff 1.8
EXPIRE surge:nyc:8928308280fffff 120

Implementation: Matching Engine

import redis.asyncio as redis
from dataclasses import dataclass
from typing import Optional
import asyncio, uuid, time

@dataclass
class Driver:
    id: str
    lat: float
    lng: float
    distance_km: float
    rating: float

class MatchingEngine:
    def __init__(self):
        self.r = redis.Redis(host='redis-cluster', decode_responses=True)
        self.OFFER_TIMEOUT = 15  # seconds
        self.RADIUS_STEPS = [1, 2, 5, 10]  # km

    async def find_match(self, trip_id: str, lat: float, lng: float,
                         city: str) -> Optional[Driver]:
        geo_key = f"{city}:drivers:available"

        for radius_km in self.RADIUS_STEPS:
            drivers_raw = await self.r.georadius(
                geo_key, lng, lat, radius_km, unit='km',
                withcoord=True, withdist=True, count=10, sort='ASC'
            )
            if not drivers_raw:
                continue

            # Parse and rank by ETA (distance / avg speed 30 km/h)
            candidates = []
            for name, dist, (d_lng, d_lat) in drivers_raw:
                driver_id = name.split(':')[1]
                info = await self.r.hgetall(f"driver:{driver_id}:session")
                if info.get('status') != 'available':
                    continue
                candidates.append(Driver(
                    id=driver_id, lat=d_lat, lng=d_lng,
                    distance_km=float(dist),
                    rating=float(info.get('rating', 4.5))
                ))

            if candidates:
                return await self._send_offers(trip_id, candidates[:3])

        return None  # No driver found

    async def _send_offers(self, trip_id: str,
                           candidates: list[Driver]) -> Optional[Driver]:
        """Send parallel offers, first accept wins."""
        offer_tasks = [
            self._offer_driver(trip_id, d) for d in candidates
        ]
        done, pending = await asyncio.wait(
            offer_tasks, return_when=asyncio.FIRST_COMPLETED,
            timeout=self.OFFER_TIMEOUT
        )
        for task in pending:
            task.cancel()  # Cancel remaining offers

        if done:
            return done.pop().result()
        return None

Technology Trade-offs

Decision	Chosen	Alternative	Why
Location store	Redis Geo	PostGIS, MongoDB Geo	Sub-millisecond reads, TTL expiry, in-memory speed
Driver comms	WebSocket	Long polling, SSE	Bidirectional, 2M+ persistent connections, lower overhead
Geo grid	Uber H3	Geohash, S2 cells	Hexagons have uniform area, no edge distortion at high lat
Trip DB	PostgreSQL	DynamoDB, Cassandra	ACID for payment + state machine; known schema
Event bus	Kafka	RabbitMQ, SQS	500K GPS events/sec; replay for surge computation
Matching	Stateless workers	Stateful coordinator	Horizontal scale, Kafka consumer groups per city shard

Knowledge Check

1. Uber stores driver locations with Redis GEOADD. What underlying data structure does Redis use?

2. Why does Uber use WebSockets instead of HTTP polling for driver location updates?

3. The matching engine sends offers to the top 3 nearest drivers simultaneously. What problem does this solve?

4. Why does Uber use Uber H3 hexagonal cells for surge pricing instead of rectangles (geohash)?

5. A rider requests a trip, two drivers accept simultaneously (race condition). How does the system handle this?

Day 14 Complete!

You've designed a production-grade ride-sharing system. Next: Distributed Consensus (Raft & Paxos).