Master the fundamental consistency-availability trade-offs that every distributed system architect must internalize — plus the PACELC extension for latency vs consistency during normal operation.
Eric Brewer's CAP theorem states that a distributed system cannot simultaneously guarantee all three of Consistency, Availability, and Partition tolerance: during a network partition, it must give up one of the first two. Understanding which to give up is the key design decision.
Consistency (all nodes see the same data), Availability (every request gets a non-error response), Partition Tolerance (the system keeps working through network splits). During a real partition you must choose C or A; P is non-negotiable for distributed systems.
CP: consistent during a partition but may reject requests. Examples: Zookeeper, HBase, MongoDB (majority writes), Redis Cluster (strict mode). Correctness trumps uptime.
AP: available during a partition but may return stale data. Examples: Cassandra, DynamoDB (defaults), CouchDB, DNS. Eventually consistent — replicas reconcile after the partition heals.
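One common reconciliation strategy after a partition heals is last-write-wins (LWW), where each replica tags values with timestamps and the merge keeps the newest write per key. A minimal sketch (the `lww_merge` function and the sample data are hypothetical, not a real system's API):

```python
# Hypothetical sketch: last-write-wins reconciliation after a partition
# heals. Each replica holds {key: (value, timestamp)}; the merge keeps
# the newest write per key. Cassandra applies the same idea with
# per-cell timestamps.

def lww_merge(replica_a: dict, replica_b: dict) -> dict:
    """Merge two replicas' {key: (value, timestamp)} maps, newest wins."""
    merged = dict(replica_a)
    for key, (value, ts) in replica_b.items():
        if key not in merged or ts > merged[key][1]:
            merged[key] = (value, ts)
    return merged

# During the partition, each side accepted writes independently:
side_a = {"user:1": ("alice@old.com", 100), "user:2": ("bob@x.com", 150)}
side_b = {"user:1": ("alice@new.com", 120)}

healed = lww_merge(side_a, side_b)
# user:1 resolves to the newer write; user:2 survives untouched
```

Note that LWW silently discards the losing write; systems that cannot tolerate that (like Dynamo's shopping cart) keep both versions and merge them at the application level instead.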
PACELC extends CAP: even without a partition (normal operation), systems trade Latency against Consistency. A low-latency read may skip the replication wait; a consistent read waits for quorum acknowledgement.
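The "else" branch can be sketched as two read paths over the same replica set: answer immediately from one replica, or wait for a quorum and take the newest value. This is a simplified simulation with hypothetical function names, not a real client API:

```python
# Hypothetical sketch of the PACELC "else" branch: the same read served
# fast from one local replica (EL), or slower but consistent by
# consulting a quorum of replicas and taking the newest value (EC).

replicas = [  # (value, write_timestamp) held by each of N=3 replicas
    ("v1", 100),  # stale replica that missed the latest write
    ("v2", 200),
    ("v2", 200),
]

def read_low_latency(replicas):
    """EL: answer from the nearest replica, no coordination. May be stale."""
    return replicas[0][0]

def read_consistent(replicas, quorum=2):
    """EC: wait for a quorum of replicas, return the newest value seen."""
    acked = replicas[:quorum]  # in reality: wait for network round-trips
    return max(acked, key=lambda r: r[1])[0]

read_low_latency(replicas)  # may return the stale "v1"
read_consistent(replicas)   # returns "v2", at the cost of extra latency
```

The consistent path only works if the latest write itself reached a quorum, which is exactly the R+W>N condition discussed below for Cassandra.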
Click on each region of the CAP triangle to explore which real-world systems fall into each category, and why you can only ever guarantee two of the three properties simultaneously.
The triangle shows the three properties of distributed systems. During a network partition, you must sacrifice either Consistency or Availability — Partition Tolerance is mandatory for any real distributed system.
CA systems (no partition tolerance) are single-node databases like PostgreSQL on one server. They're not truly distributed. Any system spanning multiple machines will eventually experience network partitions — hardware failures, datacenter splits, and cable cuts happen constantly at scale.
Inject a network partition between two datacenters and observe how CP vs AP systems respond to a write request during the split.
PACELC extends CAP by acknowledging that even when no partition occurs (the common case), systems must still trade off between latency and consistency. Most of your design decisions happen in the "else" (E) branch.
Partition → Availability vs Consistency • Else → Latency vs Consistency. Every system is classified by both halves: e.g. PA/EL means "AP during partition, low-latency reads in normal operation."
| System | Partition: A or C? | Else: L or C? | Use Case |
|---|---|---|---|
| DynamoDB (eventual) | PA — stays available | EL — low latency reads | E-commerce catalog |
| DynamoDB (strong) | PC — may reject writes | EC — waits for quorum | Financial records |
| Cassandra (QUORUM) | PA — stays available | EC — quorum read/write | Balanced workloads |
| Zookeeper | PC — rejects during partition | EC — fsync before ACK | Distributed locks, config |
| MySQL single node | CA — no partition possible | EC — ACID transactions | OLTP (no partition risk) |
Cassandra lets you dial consistency at the query level using the R+W>N formula. This makes it one of the most flexible systems — you choose your trade-off per operation.
```python
# Cassandra tunable consistency (Python cassandra-driver)
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

# Strong consistency: W=QUORUM + R=QUORUM with N=3
# W(2) + R(2) > N(3) → guaranteed overlap
update = SimpleStatement(
    "UPDATE accounts SET balance = %s WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM,
)
session.execute(update, [new_balance, account_id])

# Read with QUORUM
read = SimpleStatement(
    "SELECT balance FROM accounts WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM,
)
result = session.execute(read, [account_id])

# Eventual (AP) — fast but may return stale data
fast = SimpleStatement(
    "SELECT product_views FROM analytics WHERE id = %s",
    consistency_level=ConsistencyLevel.ONE,
)
result = session.execute(fast, [product_id])
```
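The overlap guarantee is pure arithmetic, so it can be checked with a one-line helper (a hypothetical utility, not part of the driver):

```python
# Hypothetical helper: does an (R, W, N) configuration guarantee that
# every read quorum overlaps every write quorum? Overlap means at least
# one replica in the read set holds the latest acknowledged write.

def quorums_overlap(r: int, w: int, n: int) -> bool:
    return r + w > n

quorums_overlap(2, 2, 3)  # QUORUM/QUORUM with N=3: strong
quorums_overlap(1, 1, 3)  # ONE/ONE with N=3: eventual only
quorums_overlap(1, 3, 3)  # read ONE, write ALL: strong reads, slow writes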
Mapping business requirements to CAP/PACELC choices. The wrong choice here causes either data corruption or unnecessary downtime.
Use Zookeeper, Google Spanner, or CockroachDB. Correctness over availability — a double-spend is catastrophic. Accept rare write rejections during partitions.
Use Cassandra with ONE consistency. Eventual consistency is fine — seeing a count of 10,423 vs 10,424 is not a user-visible bug. Prioritize low latency and high write throughput.
Use DynamoDB with eventual reads. Amazon's Dynamo paper pioneered merging conflicting cart versions (union of items). Availability matters more than perfect consistency here.
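The cart-merge idea from the Dynamo paper can be sketched as a union of items, taking the larger quantity when both sides touched the same item. The `merge_carts` function and data are illustrative, not Amazon's actual implementation:

```python
# Hypothetical sketch of Dynamo-style shopping-cart merge: when two
# conflicting cart versions surface, keep the union of items, taking
# the max quantity per item. As the Dynamo paper notes, this can
# resurrect a deleted item — the price of staying available.

def merge_carts(cart_a: dict, cart_b: dict) -> dict:
    """Merge {item: quantity} maps from two divergent replicas."""
    merged = dict(cart_a)
    for item, qty in cart_b.items():
        merged[item] = max(merged.get(item, 0), qty)
    return merged

# Written on two sides of a partition:
cart_a = {"book": 1, "pen": 2}
cart_b = {"book": 2, "mug": 1}
merge_carts(cart_a, cart_b)  # {'book': 2, 'pen': 2, 'mug': 1}
```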
Mixed strategy: strong reads (QUORUM) when actually decrementing stock during purchase; eventual reads for display count. Cassandra handles both with per-query consistency levels.
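The mixed strategy amounts to a per-operation policy plus a strong check-then-decrement on the purchase path. A toy simulation with hypothetical names (the real version would run the purchase path at QUORUM via `SimpleStatement`, as in the earlier example):

```python
# Hypothetical sketch of the mixed strategy: the purchase path runs at
# QUORUM so a buyer never decrements stale stock on a healthy cluster,
# while the product page reads at ONE and tolerates a slightly stale
# count. Names and the policy table are illustrative.

PER_QUERY_CONSISTENCY = {
    "purchase": "QUORUM",  # R+W > N: the decrement sees the latest stock
    "display":  "ONE",     # fast; may lag the true count by a few writes
}

def try_purchase(stock_by_sku: dict, sku: str) -> bool:
    """Strong path: check-then-decrement (would execute at QUORUM)."""
    if stock_by_sku.get(sku, 0) > 0:
        stock_by_sku[sku] -= 1
        return True
    return False

inventory = {"sku-1": 1}
try_purchase(inventory, "sku-1")  # True: the last unit is sold
try_purchase(inventory, "sku-1")  # False: the strong read sees stock == 0
```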
Test your understanding of CAP theorem and PACELC. Select the best answer for each question.