Design Uber/Lyft at scale: geospatial indexing, real-time matching, surge pricing, and WebSocket driver tracking.
Redis Geo uses Geohash encoding to store lat/lng as a sorted set score. GEORADIUS queries return nearby drivers in O(N+log M) time. Each city shard holds ~50K drivers in its geo index.
Drivers maintain persistent WebSocket connections to push GPS every 4 seconds. Riders get WebSocket updates on driver ETA. HTTP polling at 4s would require 2M × 60 req/min = 120M req/min overhead.
On rider request: query Redis for drivers within 5km radius, filter by availability, rank by ETA (distance / avg_speed), send offer to top N drivers, first accept wins. If no match in 30s, expand radius.
Surge multiplier = demand / supply in a hex cell. Uber uses H3 hexagonal grid (Uber H3 library). If riders/drivers ratio > 2, surge kicks in. Surge cells communicated to driver app in real-time.
Blue dots = available drivers. Click "Add Rider" then click the canvas to place a rider (green). Hit "Find Match" to watch the nearest driver animate toward the pickup point.
Receives GPS pings via WebSocket. Writes to Redis Geo (TTL 30s) and publishes to Kafka location topic. Horizontally scaled by city shard.
Stateless workers consuming trip request events from Kafka. Queries Redis Geo → ranks candidates → publishes offers via WebSocket gateway back to driver apps.
Manages trip state machine: REQUESTED → MATCHED → IN_PROGRESS → COMPLETED. Uses PostgreSQL with optimistic locking for concurrent accept races.
Aggregates supply/demand counts per H3 hexagon cell every 60s using sliding window from Kafka. Computes multiplier and broadcasts to driver apps.
Maintains persistent connections to 2M+ drivers. Uses sticky sessions (IP hash). Connection metadata (driver→server mapping) stored in Redis for cross-server messaging.
Charges rider on trip completion. Uses Saga pattern: charge → transfer earnings → update ledger. Stripe/Braintree for payment processing. Idempotency keys for retries.
CREATE TABLE trips (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
rider_id UUID NOT NULL REFERENCES users(id),
driver_id UUID REFERENCES drivers(id),
status TEXT NOT NULL DEFAULT 'REQUESTED',
-- REQUESTED | MATCHED | IN_PROGRESS | COMPLETED | CANCELLED
pickup_lat DECIMAL(10,7) NOT NULL,
pickup_lng DECIMAL(10,7) NOT NULL,
dropoff_lat DECIMAL(10,7),
dropoff_lng DECIMAL(10,7),
surge_mult DECIMAL(3,2) DEFAULT 1.00,
base_fare DECIMAL(10,2),
final_fare DECIMAL(10,2),
started_at TIMESTAMPTZ,
ended_at TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Index for driver active trip lookup
CREATE INDEX idx_trips_driver_active
ON trips(driver_id, status)
WHERE status IN ('MATCHED','IN_PROGRESS');
# Driver location index per city GEOADD nyc:drivers:available \ -73.9857 40.7484 "driver:abc123" # Find drivers within 5km of rider GEORADIUS nyc:drivers:available \ -73.9921 40.7527 5 km \ WITHCOORD WITHDIST COUNT 20 ASC # Driver session info (TTL = 30s) HSET driver:abc123:session \ status "available" \ vehicle "UberX" \ rating "4.87" EXPIRE driver:abc123:session 30 # Surge multipliers by H3 cell SET surge:nyc:8928308280fffff 1.8 EXPIRE surge:nyc:8928308280fffff 120
import redis.asyncio as redis
from dataclasses import dataclass
from typing import Optional
import asyncio, uuid, time
@dataclass
class Driver:
id: str
lat: float
lng: float
distance_km: float
rating: float
class MatchingEngine:
def __init__(self):
self.r = redis.Redis(host='redis-cluster', decode_responses=True)
self.OFFER_TIMEOUT = 15 # seconds
self.RADIUS_STEPS = [1, 2, 5, 10] # km
async def find_match(self, trip_id: str, lat: float, lng: float,
city: str) -> Optional[Driver]:
geo_key = f"{city}:drivers:available"
for radius_km in self.RADIUS_STEPS:
drivers_raw = await self.r.georadius(
geo_key, lng, lat, radius_km, unit='km',
withcoord=True, withdist=True, count=10, sort='ASC'
)
if not drivers_raw:
continue
# Parse and rank by ETA (distance / avg speed 30 km/h)
candidates = []
for name, dist, (d_lng, d_lat) in drivers_raw:
driver_id = name.split(':')[1]
info = await self.r.hgetall(f"driver:{driver_id}:session")
if info.get('status') != 'available':
continue
candidates.append(Driver(
id=driver_id, lat=d_lat, lng=d_lng,
distance_km=float(dist),
rating=float(info.get('rating', 4.5))
))
if candidates:
return await self._send_offers(trip_id, candidates[:3])
return None # No driver found
async def _send_offers(self, trip_id: str,
candidates: list[Driver]) -> Optional[Driver]:
"""Send parallel offers, first accept wins."""
offer_tasks = [
self._offer_driver(trip_id, d) for d in candidates
]
done, pending = await asyncio.wait(
offer_tasks, return_when=asyncio.FIRST_COMPLETED,
timeout=self.OFFER_TIMEOUT
)
for task in pending:
task.cancel() # Cancel remaining offers
if done:
return done.pop().result()
return None
| Decision | Chosen | Alternative | Why |
|---|---|---|---|
| Location store | Redis Geo | PostGIS, MongoDB Geo | Sub-millisecond reads, TTL expiry, in-memory speed |
| Driver comms | WebSocket | Long polling, SSE | Bidirectional, 2M+ persistent connections, lower overhead |
| Geo grid | Uber H3 | Geohash, S2 cells | Hexagons have uniform area, no edge distortion at high lat |
| Trip DB | PostgreSQL | DynamoDB, Cassandra | ACID for payment + state machine; known schema |
| Event bus | Kafka | RabbitMQ, SQS | 500K GPS events/sec; replay for surge computation |
| Matching | Stateless workers | Stateful coordinator | Horizontal scale, Kafka consumer groups per city shard |
1. Uber stores driver locations with Redis GEOADD. What underlying data structure does Redis use?
2. Why does Uber use WebSockets instead of HTTP polling for driver location updates?
3. The matching engine sends offers to the top 3 nearest drivers simultaneously. What problem does this solve?
4. Why does Uber use Uber H3 hexagonal cells for surge pricing instead of rectangles (geohash)?
5. A rider requests a trip, two drivers accept simultaneously (race condition). How does the system handle this?
You've designed a production-grade ride-sharing system. Next: Distributed Consensus (Raft & Paxos).