Day 14 of 30 • Capstone Project ★

Ride-Sharing System Design

Design Uber/Lyft at scale: geospatial indexing, real-time matching, surge pricing, and WebSocket driver tracking.

Tags: Geospatial · WebSocket · Redis Geo · Matching Engine · Surge Pricing

System Requirements

🎯 Functional Requirements

  • Riders request trips from location A → B
  • Real-time driver location tracking (GPS)
  • Nearest driver matching within radius
  • Fare estimation before confirmation
  • Surge pricing based on supply/demand
  • Trip lifecycle: request → match → pickup → dropoff → payment
  • Driver earnings, rider history

📊 Scale Targets

  • 5M+ daily trips (Uber: ~8M)
  • 2M+ active drivers pushing GPS every 4s
  • 500K concurrent trips peak
  • P99 match latency < 2 seconds
  • Location updates: ~500K writes/sec
  • 99.99% uptime (~53 min/year downtime)
  • Multi-region: 70+ countries

⚡ Non-Functional

  • Low-latency geospatial queries
  • Strong consistency for payments
  • Eventual consistency for location
  • Horizontal scaling per city region
  • Graceful degradation on failures

🚫 Out of Scope

  • Route optimization (maps API)
  • Driver onboarding / KYC
  • ML fraud detection
  • Multi-stop rides
  • Scheduled rides

Core Design Concepts

📍 Geospatial Indexing

Redis GEOADD

Redis Geo uses Geohash encoding to store lat/lng as a sorted set score. GEORADIUS queries return nearby drivers in O(N+log M) time. Each city shard holds ~50K drivers in its geo index.

  • Geohash cell ~1.2km × 0.6km at precision 6 (~150m requires precision 7)
  • GEODIST for exact distance calc
  • GEO index per city cluster
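To make the encoding concrete, here is a minimal pure-Python Geohash encoder. It is illustrative only: Redis internally packs a 52-bit interleaved hash into the sorted-set score rather than producing a base32 string, but the range-bisection idea is the same.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash base32 alphabet

def geohash_encode(lat: float, lng: float, precision: int = 6) -> str:
    """Bisect the lng/lat ranges alternately; every 5 bits become one char."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    bits, even = [], True  # even bit positions encode longitude
    while len(bits) < precision * 5:
        if even:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits.append(1); lng_lo = mid
            else:
                bits.append(0); lng_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits.append(1); lat_lo = mid
            else:
                bits.append(0); lat_hi = mid
        even = not even
    return "".join(
        BASE32[int("".join(map(str, bits[i:i + 5])), 2)]
        for i in range(0, len(bits), 5)
    )

# Nearby points share a prefix, which is why prefix scans find neighbors:
geohash_encode(40.7484, -73.9857)  # midtown Manhattan, starts with "dr5"
```

Because nearby points share long prefixes, a radius query reduces to scanning a small score range of the sorted set.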

🔌 Real-Time: WebSocket vs Polling

WebSocket

Drivers maintain persistent WebSocket connections and push GPS fixes every 4 seconds; riders receive WebSocket updates on driver ETA. HTTP polling at the same 4s interval would cost 2M × 15 req/min = 30M req/min in request overhead.

  • WebSocket gateway per region
  • Heartbeat every 30s for liveness
  • Reconnect with exponential backoff
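The reconnect rules above can be sketched as a driver-side connection loop. The `connect` and `send_gps` callables here are hypothetical stand-ins for a real WebSocket client (e.g. the `websockets` package); only the backoff and loop structure are the point.

```python
import asyncio
import random

def backoff_delays(base: float = 1.0, cap: float = 60.0):
    """Exponential backoff with full jitter: ~1s, ~2s, ~4s ... capped at 60s."""
    attempt = 0
    while True:
        delay = min(cap, base * (2 ** attempt))
        yield random.uniform(0, delay)  # jitter avoids thundering herds
        attempt += 1

async def driver_connection_loop(connect, send_gps):
    """Keep a persistent connection alive; reconnect with backoff on failure."""
    delays = backoff_delays()
    while True:
        try:
            ws = await connect()
            delays = backoff_delays()  # reset backoff after a good connect
            while True:
                await send_gps(ws)     # push a GPS fix
                await asyncio.sleep(4) # every 4 seconds, per the spec above
        except ConnectionError:
            await asyncio.sleep(next(delays))
```

Resetting the backoff generator after a successful connect matters: without it, a driver who reconnects and then drops again would start from an inflated delay.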

⚡ Matching Engine

GEORADIUS

On rider request: query Redis for drivers within the current search radius (starting at 1km), filter by availability, rank by ETA (distance / avg_speed), send offers to the top N drivers; the first to accept wins. If no driver accepts, expand the radius and retry.

  • Offer timeout: 15s per driver
  • Radius expansion: 1→2→5→10km
  • Parallel offers to top 3 drivers

💰 Surge Pricing

Dynamic Pricing

Surge multiplier = demand / supply within a hex cell. Uber divides the map into a hexagonal grid using its open-source H3 library. If the rider/driver ratio exceeds 2, surge kicks in, and surge cells are pushed to driver apps in real time.

  • H3 resolution 8 (~0.74km² cells)
  • Surge computed every 60s
  • Max surge: 5× (regulatory)
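The bullets above leave the exact ratio-to-multiplier mapping open. One plausible linear mapping (an assumption for illustration, not Uber's actual pricing formula) looks like this:

```python
def surge_multiplier(ride_requests: int, available_drivers: int,
                     threshold: float = 2.0, cap: float = 5.0) -> float:
    """Demand/supply per cell: no surge until ratio > threshold,
    then scale linearly with the ratio, capped at 5x (regulatory)."""
    if available_drivers == 0:
        return cap                       # no supply at all: max surge
    ratio = ride_requests / available_drivers
    if ratio <= threshold:
        return 1.0                       # normal pricing
    return min(cap, ratio / threshold)   # e.g. ratio 4 -> 2.0x

surge_multiplier(20, 30)   # ratio 0.67 -> 1.0 (normal)
surge_multiplier(8, 2)     # ratio 4.0  -> 2.0x
```

Recomputing this every 60s per cell from Kafka counts keeps pricing responsive without flapping on every single request.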

🗺️ Live Driver-Rider Matching Simulation

[Interactive demo: blue dots are available drivers. Add a rider (green) on the canvas and hit "Find Match" to watch the nearest driver animate toward the pickup point. Readouts show available drivers, match distance, ETA, surge multiplier, and estimated fare.]

💰 Surge Pricing Calculator

[Interactive demo: adjust rider demand and driver supply to see the demand/supply ratio, surge multiplier, final fare, and surge status.]

System Architecture

Rider App  → API Gateway  → Trip Service     → Matching Engine → Redis Geo
Driver App → WebSocket GW → Location Service → Redis Geo + Kafka
Kafka → Surge Service
Kafka → Payment Service → PostgreSQL

📍 Location Service

Receives GPS pings via WebSocket. Writes to Redis Geo (TTL 30s) and publishes to Kafka location topic. Horizontally scaled by city shard.

🎯 Matching Engine

Stateless workers consuming trip request events from Kafka. Queries Redis Geo → ranks candidates → publishes offers via WebSocket gateway back to driver apps.

🚗 Trip Service

Manages trip state machine: REQUESTED → MATCHED → IN_PROGRESS → COMPLETED. Uses PostgreSQL with optimistic locking for concurrent accept races.
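The concurrent-accept race resolves to a single conditional UPDATE: a compare-and-set on the status column. Here is a runnable sketch using in-memory SQLite as a stand-in for the trips table; in PostgreSQL the same single-statement UPDATE is atomic per row, so only one driver's accept can flip REQUESTED → MATCHED.

```python
import sqlite3

# In-memory stand-in for the trips table (PostgreSQL in production).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE trips (id TEXT PRIMARY KEY, driver_id TEXT, status TEXT)")
db.execute("INSERT INTO trips VALUES ('t1', NULL, 'REQUESTED')")

def accept_trip(trip_id: str, driver_id: str) -> bool:
    """Compare-and-set: succeeds only if the trip is still unclaimed."""
    cur = db.execute(
        "UPDATE trips SET driver_id = ?, status = 'MATCHED' "
        "WHERE id = ? AND status = 'REQUESTED'",
        (driver_id, trip_id),
    )
    db.commit()
    return cur.rowcount == 1   # 0 rows updated -> another driver won the race

first = accept_trip("t1", "driver_A")    # True: trip was still REQUESTED
second = accept_trip("t1", "driver_B")   # False: status is no longer REQUESTED
```

The losing driver's app simply receives an "offer no longer available" response, which is also the answer to knowledge-check question 5.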

💰 Surge Service

Aggregates supply/demand counts per H3 hexagon cell every 60s using sliding window from Kafka. Computes multiplier and broadcasts to driver apps.

🔌 WebSocket Gateway

Maintains persistent connections to 2M+ drivers. Uses sticky sessions (IP hash). Connection metadata (driver→server mapping) stored in Redis for cross-server messaging.
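The driver-to-server mapping can be sketched with a plain dict standing in for the Redis hash; the server name `gw-us-east-3` is made up for illustration.

```python
# driver_id -> gateway server currently holding that driver's socket.
# Production: a Redis hash (HSET conn:map <driver> <server>) shared by all gateways.
connection_registry: dict[str, str] = {}

def register(driver_id: str, server_id: str) -> None:
    """Called by a gateway when a driver's WebSocket connects."""
    connection_registry[driver_id] = server_id

def route_offer(driver_id: str) -> str:
    """Return the gateway that should deliver this driver's trip offer."""
    server = connection_registry.get(driver_id)
    if server is None:
        raise LookupError(f"driver {driver_id} not connected")
    return server

register("driver:abc123", "gw-us-east-3")
route_offer("driver:abc123")   # -> "gw-us-east-3"
```

With this lookup, the matching engine can publish an offer to the one gateway that owns the socket instead of broadcasting to all of them.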

💳 Payment Service

Charges rider on trip completion. Uses Saga pattern: charge → transfer earnings → update ledger. Stripe/Braintree for payment processing. Idempotency keys for retries.
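A minimal sketch of the idempotency-key step, assuming a simple key-to-result store (a Redis key or DB table in production; a dict here). A retried charge with the same key returns the original result instead of charging twice.

```python
import uuid

processed: dict[str, dict] = {}   # idempotency_key -> original charge result

def charge_rider(trip_id: str, amount_cents: int, idempotency_key: str) -> dict:
    """Idempotent charge: replays with the same key are no-ops."""
    if idempotency_key in processed:
        return processed[idempotency_key]   # network retry: no double charge
    result = {
        "charge_id": str(uuid.uuid4()),     # would come from Stripe/Braintree
        "trip_id": trip_id,
        "amount_cents": amount_cents,
        "status": "captured",
    }
    processed[idempotency_key] = result
    return result

key = "charge:t1:attempt"          # one key per logical charge attempt
first = charge_rider("t1", 1850, key)
retry = charge_rider("t1", 1850, key)   # timeout retry hits the cached result
```

The same key is reused for every retry of one logical charge, which is exactly how payment-provider idempotency keys are meant to be used.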

Data Model

🗄️ PostgreSQL — Trips Table

CREATE TABLE trips (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  rider_id    UUID NOT NULL REFERENCES users(id),
  driver_id   UUID REFERENCES drivers(id),
  status      TEXT NOT NULL DEFAULT 'REQUESTED',
  -- REQUESTED | MATCHED | IN_PROGRESS | COMPLETED | CANCELLED
  pickup_lat  DECIMAL(10,7) NOT NULL,
  pickup_lng  DECIMAL(10,7) NOT NULL,
  dropoff_lat DECIMAL(10,7),
  dropoff_lng DECIMAL(10,7),
  surge_mult  DECIMAL(3,2) DEFAULT 1.00,
  base_fare   DECIMAL(10,2),
  final_fare  DECIMAL(10,2),
  started_at  TIMESTAMPTZ,
  ended_at    TIMESTAMPTZ,
  created_at  TIMESTAMPTZ DEFAULT NOW()
);
-- Index for driver active trip lookup
CREATE INDEX idx_trips_driver_active
  ON trips(driver_id, status)
  WHERE status IN ('MATCHED','IN_PROGRESS');

🔴 Redis — Location + Geo Index

# Driver location index per city
GEOADD nyc:drivers:available \
  -73.9857 40.7484 "driver:abc123"

# Find drivers within 5km of rider
GEORADIUS nyc:drivers:available \
  -73.9921 40.7527 5 km \
  WITHCOORD WITHDIST COUNT 20 ASC

# Driver session info (TTL = 30s)
HSET driver:abc123:session \
  status "available" \
  vehicle "UberX" \
  rating "4.87"
EXPIRE driver:abc123:session 30

# Surge multipliers by H3 cell
SET surge:nyc:8928308280fffff 1.8
EXPIRE surge:nyc:8928308280fffff 120

Implementation: Matching Engine

import redis.asyncio as redis
from dataclasses import dataclass
from typing import Optional
import asyncio

@dataclass
class Driver:
    id: str
    lat: float
    lng: float
    distance_km: float
    rating: float

class MatchingEngine:
    def __init__(self):
        self.r = redis.Redis(host='redis-cluster', decode_responses=True)
        self.OFFER_TIMEOUT = 15  # seconds
        self.RADIUS_STEPS = [1, 2, 5, 10]  # km

    async def find_match(self, trip_id: str, lat: float, lng: float,
                         city: str) -> Optional[Driver]:
        geo_key = f"{city}:drivers:available"

        for radius_km in self.RADIUS_STEPS:
            drivers_raw = await self.r.georadius(
                geo_key, lng, lat, radius_km, unit='km',
                withcoord=True, withdist=True, count=10, sort='ASC'
            )
            if not drivers_raw:
                continue

            # Parse and rank by ETA (distance / avg speed 30 km/h)
            candidates = []
            for name, dist, (d_lng, d_lat) in drivers_raw:
                driver_id = name.split(':')[1]
                info = await self.r.hgetall(f"driver:{driver_id}:session")
                if info.get('status') != 'available':
                    continue
                candidates.append(Driver(
                    id=driver_id, lat=d_lat, lng=d_lng,
                    distance_km=float(dist),
                    rating=float(info.get('rating', 4.5))
                ))

            if candidates:
                return await self._send_offers(trip_id, candidates[:3])

        return None  # No driver found

    async def _send_offers(self, trip_id: str,
                           candidates: list[Driver]) -> Optional[Driver]:
        """Send parallel offers, first accept wins."""
        # Wrap in Tasks: asyncio.wait() no longer accepts bare coroutines (3.11+)
        offer_tasks = [
            asyncio.create_task(self._offer_driver(trip_id, d))
            for d in candidates
        ]
        done, pending = await asyncio.wait(
            offer_tasks, return_when=asyncio.FIRST_COMPLETED,
            timeout=self.OFFER_TIMEOUT
        )
        for task in pending:
            task.cancel()  # Cancel remaining offers

        # Return the first acceptance; a declined offer resolves to None
        for task in done:
            driver = task.result()
            if driver is not None:
                return driver
        return None

Technology Trade-offs

| Decision | Chosen | Alternative | Why |
| --- | --- | --- | --- |
| Location store | Redis Geo | PostGIS, MongoDB Geo | Sub-millisecond reads, TTL expiry, in-memory speed |
| Driver comms | WebSocket | Long polling, SSE | Bidirectional, 2M+ persistent connections, lower overhead |
| Geo grid | Uber H3 | Geohash, S2 cells | Hexagons have uniform area, no edge distortion at high latitudes |
| Trip DB | PostgreSQL | DynamoDB, Cassandra | ACID for payment + state machine; known schema |
| Event bus | Kafka | RabbitMQ, SQS | 500K GPS events/sec; replay for surge computation |
| Matching | Stateless workers | Stateful coordinator | Horizontal scale, Kafka consumer groups per city shard |

Knowledge Check

1. Uber stores driver locations with Redis GEOADD. What underlying data structure does Redis use?

2. Why does Uber use WebSockets instead of HTTP polling for driver location updates?

3. The matching engine sends offers to the top 3 nearest drivers simultaneously. What problem does this solve?

4. Why does Uber use H3 hexagonal cells for surge pricing instead of rectangular geohash cells?

5. A rider requests a trip, two drivers accept simultaneously (race condition). How does the system handle this?

Day 14 Complete!

You've designed a production-grade ride-sharing system. Next: Distributed Consensus (Raft & Paxos).