Day 10 — Week 2

Database Sharding & Partitioning

When a single database can't handle your scale, sharding splits data across multiple nodes. Learn consistent hashing, virtual nodes, and how to minimize data movement when adding capacity.

Topics: Consistent Hashing · Virtual Nodes · Hot Partition Detection · Cross-Shard Queries · Resharding Strategy
Key Concepts
Consistent Hashing
Map keys and nodes onto a circular ring. Each key is owned by the first node clockwise from its hash. Adding/removing a node only remaps ~1/N of keys, not all of them.
🔁 Virtual Nodes (vnodes)
Each physical node occupies multiple positions on the ring (e.g., 150 vnodes per server). This evens out load distribution and makes rebalancing smoother when adding or removing nodes.
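To see why vnodes help, here's a minimal sketch (separate from the full class later in the lesson) that hashes 10,000 made-up keys onto a 4-node ring and compares the most-loaded/least-loaded ratio with 1 vnode versus 150 vnodes per node:

```python
import hashlib
from bisect import bisect_left
from collections import Counter

def _hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def build_ring(nodes, vnodes):
    # Each physical node claims `vnodes` positions on the ring.
    points = sorted((_hash(f"{n}:vn{i}"), n) for n in nodes for i in range(vnodes))
    return [h for h, _ in points], [n for _, n in points]

def owner(ring, key):
    hashes, owners = ring
    # First node clockwise from the key's hash position (wrap at the end).
    return owners[bisect_left(hashes, _hash(key)) % len(owners)]

nodes = [f"shard-{i}" for i in range(1, 5)]
keys = [f"user:{i}" for i in range(10_000)]

ratios = {}
for v in (1, 150):
    ring = build_ring(nodes, v)
    load = Counter(owner(ring, k) for k in keys)
    ratios[v] = max(load.values()) / min(load.values())
    print(f"{v:>3} vnodes per node: most/least loaded ratio = {ratios[v]:.2f}")
```

With a single position per node, the arc lengths between nodes are wildly uneven, so one shard can carry several times the load of another; 150 positions per node average that variance away.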
🔥 Hot Partition
A shard that receives a disproportionate share of traffic. Common causes: a timestamp partition key (all writes land on the newest shard) or a viral user (all of their writes hit one shard). Detect hot partitions via per-shard QPS metrics.
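The detection rule reduces to a few lines over per-shard QPS metrics. The threshold factor and the metrics dict below are hypothetical:

```python
from statistics import mean

def hot_partitions(qps_by_shard: dict[str, float], factor: float = 2.0) -> list[str]:
    # Flag any shard whose QPS exceeds `factor` times the fleet-wide mean.
    avg = mean(qps_by_shard.values())
    return [shard for shard, qps in qps_by_shard.items() if qps > factor * avg]

# Hypothetical per-shard QPS snapshot
metrics = {"shard-1": 1200, "shard-2": 1100, "shard-3": 9800, "shard-4": 1300}
print(hot_partitions(metrics))  # shard-3 is running hot
```

In production you would feed this from your metrics pipeline and alert on it, rather than polling ad hoc.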
🔗 Cross-Shard Queries
Queries that need data from multiple shards require scatter-gather: send query to all relevant shards, collect results, merge/sort. Expensive — design your partition key to avoid them for hot paths.
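A minimal scatter-gather sketch, assuming three in-memory "shards" that each return rows pre-sorted descending by score (the shard names and data are made up; a real system would fan the queries out in parallel):

```python
import heapq

# Hypothetical in-memory "shards": each holds (score, name) rows,
# already sorted descending by score.
SHARDS = {
    "shard-1": [(98, "alice"), (71, "dave")],
    "shard-2": [(95, "bob")],
    "shard-3": [(88, "carol"), (60, "erin")],
}

def scatter_gather_top(limit: int) -> list[str]:
    # Scatter: run the query on every shard (serially here for clarity).
    partials = SHARDS.values()
    # Gather: merge the sorted streams and keep the global top-`limit`.
    merged = heapq.merge(*partials, key=lambda row: row[0], reverse=True)
    return [name for _, name in list(merged)[:limit]]

print(scatter_gather_top(3))
```

Note the cost: every shard does work even though only a few of its rows survive the merge, which is why hot paths should avoid scatter-gather.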
Interactive Simulation — Consistent Hash Ring

Watch how adding/removing shards affects key distribution. Notice how consistent hashing minimizes remapping.

Architecture
Client Request
      │
      ▼
Shard Router: consistent_hash(key)
      ├── Shard-1  (keys 0–24%)
      ├── Shard-2  (keys 25–49%)
      ├── Shard-3  (keys 50–74%)
      └── Shard-4  (keys 75–99%)
| Strategy | Distribution | Resharding | Cross-shard queries |
| --- | --- | --- | --- |
| Hash partitioning | Even (with an ideal key) | Expensive without consistent hashing | Scatter-gather across all shards |
| Range partitioning | Uneven if data is skewed | Easy (split a range) | Only relevant shards (range scan) |
| Directory partitioning | Fully custom | Manual but flexible | Depends on directory lookup |
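Range partitioning's routing step is just a sorted-bounds lookup, which is why a range scan touches only the relevant shards. A sketch with hypothetical month-based shard boundaries:

```python
from bisect import bisect_right

# Hypothetical range-partitioned layout: each bound is the (exclusive)
# upper key of a shard; the last shard is open-ended.
BOUNDS = ["2024-04", "2024-07", "2024-10"]
SHARDS = ["shard-1", "shard-2", "shard-3", "shard-4"]

def route(key: str) -> str:
    # Point lookup: find which range the key falls into.
    return SHARDS[bisect_right(BOUNDS, key)]

def shards_for_range(lo: str, hi: str) -> list[str]:
    # A range scan touches only the shards overlapping the closed range [lo, hi].
    return SHARDS[bisect_right(BOUNDS, lo) : bisect_right(BOUNDS, hi) + 1]

print(route("2024-05-14"))                      # one shard, no fan-out
print(shards_for_range("2024-05", "2024-08"))   # only the overlapping shards
```

Contrast with hash partitioning, where a range of keys is scattered across every shard and the same scan becomes a full scatter-gather.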
Technology Decision
| Choose | When |
| --- | --- |
| Consistent hashing | You need elastic scaling with minimal data movement (Cassandra, DynamoDB) |
| Range partitioning | Range-scan queries are common (time-series, sorted data; HBase, Bigtable) |
| Application-level routing | Simple setups where you control the shard key explicitly in every query |
| Vitess / Citus | MySQL/Postgres sharding with a transparent routing layer on top |
Code Example — Consistent Hash Ring
import hashlib
from sortedcontainers import SortedDict  # third-party: pip install sortedcontainers

class ConsistentHashRing:
    def __init__(self, vnodes: int = 150):
        self.vnodes = vnodes
        self.ring: SortedDict = SortedDict()
        self.nodes: set = set()

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str):
        for i in range(self.vnodes):
            vnode_key = f"{node}:vn{i}"
            h = self._hash(vnode_key)
            self.ring[h] = node
        self.nodes.add(node)

    def remove_node(self, node: str):
        for i in range(self.vnodes):
            vnode_key = f"{node}:vn{i}"
            h = self._hash(vnode_key)
            self.ring.pop(h, None)
        self.nodes.discard(node)

    def get_node(self, key: str) -> str:
        if not self.ring:
            raise ValueError("No nodes in ring")
        h = self._hash(key)
        # Find first node clockwise from hash position
        idx = self.ring.bisect_left(h)
        if idx == len(self.ring):
            idx = 0  # wrap around the ring
        return self.ring.values()[idx]

# Usage
ring = ConsistentHashRing(vnodes=150)
for n in ["shard-1", "shard-2", "shard-3", "shard-4"]:
    ring.add_node(n)

# Route a key to its shard
node = ring.get_node("user:12345")
print(f"user:12345 → {node}")

# Add a node — only ~1/N keys remap
ring.add_node("shard-5")
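We can check the "only ~1/N keys remap" claim empirically. This standalone sketch (a function-based restatement of the ring above, so it runs on its own without sortedcontainers) maps 20,000 made-up keys before and after adding shard-5:

```python
import hashlib
from bisect import bisect_left

VNODES = 150

def _hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def build_ring(nodes):
    # Sorted (hash, node) points; each node contributes VNODES positions.
    points = sorted((_hash(f"{n}:vn{i}"), n) for n in nodes for i in range(VNODES))
    return [h for h, _ in points], [n for _, n in points]

def owner(ring, key):
    hashes, owners = ring
    return owners[bisect_left(hashes, _hash(key)) % len(owners)]

keys = [f"user:{i}" for i in range(20_000)]
before = build_ring(["shard-1", "shard-2", "shard-3", "shard-4"])
after = build_ring(["shard-1", "shard-2", "shard-3", "shard-4", "shard-5"])

moved = sum(owner(before, k) != owner(after, k) for k in keys)
print(f"keys remapped: {moved / len(keys):.1%}")  # roughly 1/5 of keys
```

Two properties should hold: the remapped fraction is close to 1/5 (shard-5's fair share), and every remapped key moves *to* shard-5, never between the existing shards, because the only new arcs on the ring belong to the new node.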
Quiz
1. In a consistent hash ring with 4 shards, approximately what percentage of keys must be remapped when a 5th shard is added?
2. Why do production consistent hash implementations use virtual nodes (vnodes)?
3. Your system shards by user_id. A celebrity with 50 million followers posts — all write fan-out goes to one shard. This is called:
4. Why is cross-shard JOIN expensive in a sharded database?
5. DynamoDB and Cassandra both use consistent hashing. Which additional technique do they use to handle node failures without remapping?