Day 10 — Week 2

Database Sharding & Partitioning

When a single database can't handle your scale, sharding splits data across multiple nodes. Learn consistent hashing, virtual nodes, and how to minimize data movement when adding capacity.

Topics: Consistent Hashing · Virtual Nodes · Hot Partition Detection · Cross-Shard Queries · Resharding Strategy
Key Concepts
Consistent Hashing
Map keys and nodes onto a circular ring. Each key is owned by the first node clockwise from its hash. Adding/removing a node only remaps ~1/N of keys, not all of them.
🔁 Virtual Nodes (vnodes)
Each physical node occupies multiple positions on the ring (e.g., 150 vnodes per server). This evens out load distribution and makes rebalancing smoother when adding or removing nodes.
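To see why vnodes help, here's a minimal sketch (separate from the full class later in the lesson) that hashes 10,000 made-up keys onto a 4-node ring and compares the most-loaded/least-loaded ratio with 1 vnode versus 150 vnodes per node:

```python
import hashlib
from bisect import bisect_left
from collections import Counter

def _hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def build_ring(nodes, vnodes):
    # Each physical node claims `vnodes` positions on the ring.
    points = sorted((_hash(f"{n}:vn{i}"), n) for n in nodes for i in range(vnodes))
    return [h for h, _ in points], [n for _, n in points]

def owner(ring, key):
    hashes, owners = ring
    # First node clockwise from the key's hash position (wrap at the end).
    return owners[bisect_left(hashes, _hash(key)) % len(owners)]

nodes = [f"shard-{i}" for i in range(1, 5)]
keys = [f"user:{i}" for i in range(10_000)]

ratios = {}
for v in (1, 150):
    ring = build_ring(nodes, v)
    load = Counter(owner(ring, k) for k in keys)
    ratios[v] = max(load.values()) / min(load.values())
    print(f"{v:>3} vnodes per node: most/least loaded ratio = {ratios[v]:.2f}")
```

With a single position per node, the arc lengths between nodes are wildly uneven, so one shard can carry several times the load of another; 150 positions per node average that variance away.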
🔥 Hot Partition
A shard that receives a disproportionate share of traffic. Common causes: a timestamp partition key (all writes land on the newest shard) or a viral user (all of their writes hit one shard). Detect hot partitions via per-shard QPS metrics.
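The detection rule reduces to a few lines over per-shard QPS metrics. The threshold factor and the metrics dict below are hypothetical:

```python
from statistics import mean

def hot_partitions(qps_by_shard: dict[str, float], factor: float = 2.0) -> list[str]:
    # Flag any shard whose QPS exceeds `factor` times the fleet-wide mean.
    avg = mean(qps_by_shard.values())
    return [shard for shard, qps in qps_by_shard.items() if qps > factor * avg]

# Hypothetical per-shard QPS snapshot
metrics = {"shard-1": 1200, "shard-2": 1100, "shard-3": 9800, "shard-4": 1300}
print(hot_partitions(metrics))  # shard-3 is running hot
```

In production you would feed this from your metrics pipeline and alert on it, rather than polling ad hoc.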
🔗 Cross-Shard Queries
Queries that need data from multiple shards require scatter-gather: send query to all relevant shards, collect results, merge/sort. Expensive — design your partition key to avoid them for hot paths.
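A minimal scatter-gather sketch, assuming three in-memory "shards" that each return rows pre-sorted descending by score (the shard names and data are made up; a real system would fan the queries out in parallel):

```python
import heapq

# Hypothetical in-memory "shards": each holds (score, name) rows,
# already sorted descending by score.
SHARDS = {
    "shard-1": [(98, "alice"), (71, "dave")],
    "shard-2": [(95, "bob")],
    "shard-3": [(88, "carol"), (60, "erin")],
}

def scatter_gather_top(limit: int) -> list[str]:
    # Scatter: run the query on every shard (serially here for clarity).
    partials = SHARDS.values()
    # Gather: merge the sorted streams and keep the global top-`limit`.
    merged = heapq.merge(*partials, key=lambda row: row[0], reverse=True)
    return [name for _, name in list(merged)[:limit]]

print(scatter_gather_top(3))
```

Note the cost: every shard does work even though only a few of its rows survive the merge, which is why hot paths should avoid scatter-gather.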
Interactive Simulation — Consistent Hash Ring

Watch how adding/removing shards affects key distribution. Notice how consistent hashing minimizes remapping.

Architecture
Client Request
      │
      ▼
Shard Router: consistent_hash(key)
      ├── Shard-1  (keys 0–24%)
      ├── Shard-2  (keys 25–49%)
      ├── Shard-3  (keys 50–74%)
      └── Shard-4  (keys 75–99%)
| Strategy | Distribution | Resharding | Cross-shard queries |
| --- | --- | --- | --- |
| Hash partitioning | Even (with an ideal key) | Expensive without consistent hashing | Scatter-gather across all shards |
| Range partitioning | Uneven if data is skewed | Easy (split a range) | Only relevant shards (range scan) |
| Directory partitioning | Fully custom | Manual but flexible | Depends on directory lookup |
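Range partitioning's routing step is just a sorted-bounds lookup, which is why a range scan touches only the relevant shards. A sketch with hypothetical month-based shard boundaries:

```python
from bisect import bisect_right

# Hypothetical range-partitioned layout: each bound is the (exclusive)
# upper key of a shard; the last shard is open-ended.
BOUNDS = ["2024-04", "2024-07", "2024-10"]
SHARDS = ["shard-1", "shard-2", "shard-3", "shard-4"]

def route(key: str) -> str:
    # Point lookup: find which range the key falls into.
    return SHARDS[bisect_right(BOUNDS, key)]

def shards_for_range(lo: str, hi: str) -> list[str]:
    # A range scan touches only the shards overlapping the closed range [lo, hi].
    return SHARDS[bisect_right(BOUNDS, lo) : bisect_right(BOUNDS, hi) + 1]

print(route("2024-05-14"))                      # one shard, no fan-out
print(shards_for_range("2024-05", "2024-08"))   # only the overlapping shards
```

Contrast with hash partitioning, where a range of keys is scattered across every shard and the same scan becomes a full scatter-gather.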
Technology Decision
| Choose | When |
| --- | --- |
| Consistent hashing | You need elastic scaling with minimal data movement (Cassandra, DynamoDB) |
| Range partitioning | Range-scan queries are common (time-series, sorted data; HBase, Bigtable) |
| Application-level routing | Simple setups where you control the shard key explicitly in every query |
| Vitess / Citus | MySQL/Postgres sharding with a transparent routing layer on top |
Code Example — Consistent Hash Ring
import hashlib
from sortedcontainers import SortedDict  # third-party: pip install sortedcontainers

class ConsistentHashRing:
    def __init__(self, vnodes: int = 150):
        self.vnodes = vnodes
        self.ring: SortedDict = SortedDict()
        self.nodes: set = set()

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str):
        for i in range(self.vnodes):
            vnode_key = f"{node}:vn{i}"
            h = self._hash(vnode_key)
            self.ring[h] = node
        self.nodes.add(node)

    def remove_node(self, node: str):
        for i in range(self.vnodes):
            vnode_key = f"{node}:vn{i}"
            h = self._hash(vnode_key)
            self.ring.pop(h, None)
        self.nodes.discard(node)

    def get_node(self, key: str) -> str:
        if not self.ring:
            raise ValueError("No nodes in ring")
        h = self._hash(key)
        # Find first node clockwise from hash position
        idx = self.ring.bisect_left(h)
        if idx == len(self.ring):
            idx = 0  # wrap around the ring
        return self.ring.values()[idx]

# Usage
ring = ConsistentHashRing(vnodes=150)
for n in ["shard-1", "shard-2", "shard-3", "shard-4"]:
    ring.add_node(n)

# Route a key to its shard
node = ring.get_node("user:12345")
print(f"user:12345 → {node}")

# Add a node — only ~1/N keys remap
ring.add_node("shard-5")
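We can check the "only ~1/N keys remap" claim empirically. This standalone sketch (a function-based restatement of the ring above, so it runs on its own without sortedcontainers) maps 20,000 made-up keys before and after adding shard-5:

```python
import hashlib
from bisect import bisect_left

VNODES = 150

def _hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def build_ring(nodes):
    # Sorted (hash, node) points; each node contributes VNODES positions.
    points = sorted((_hash(f"{n}:vn{i}"), n) for n in nodes for i in range(VNODES))
    return [h for h, _ in points], [n for _, n in points]

def owner(ring, key):
    hashes, owners = ring
    return owners[bisect_left(hashes, _hash(key)) % len(owners)]

keys = [f"user:{i}" for i in range(20_000)]
before = build_ring(["shard-1", "shard-2", "shard-3", "shard-4"])
after = build_ring(["shard-1", "shard-2", "shard-3", "shard-4", "shard-5"])

moved = sum(owner(before, k) != owner(after, k) for k in keys)
print(f"keys remapped: {moved / len(keys):.1%}")  # roughly 1/5 of keys
```

Two properties should hold: the remapped fraction is close to 1/5 (shard-5's fair share), and every remapped key moves *to* shard-5, never between the existing shards, because the only new arcs on the ring belong to the new node.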
Quiz
1. In a consistent hash ring with 4 shards, approximately what percentage of keys must be remapped when a 5th shard is added?
2. Why do production consistent hash implementations use virtual nodes (vnodes)?
3. Your system shards by user_id. A celebrity with 50 million followers posts — all write fan-out goes to one shard. This is called:
4. Why is cross-shard JOIN expensive in a sharded database?
5. DynamoDB and Cassandra both use consistent hashing. Which additional technique do they use to handle node failures without remapping?