Day 18
Clocks, Ordering & Causality
Physical clocks lie in distributed systems. Learn Lamport timestamps, vector clocks, and how Google Spanner's TrueTime solves global ordering without a central coordinator.
Tracing Lamport Timestamps Through a 3-Node Message Exchange
Three nodes (A, B, C) exchange messages in the following sequence. Each node starts with a local clock of 0. Trace the Lamport timestamp algorithm (on send: increment local clock; on receive: local = max(local, received) + 1) through this sequence: (1) A sends message m1 to B at A's clock=1. (2) B receives m1 and sends m2 to C. (3) C receives m2. (4) A sends m3 to C (independently, no synchronization with step 2). (5) C receives m3.
Tasks
- Trace the exact Lamport timestamp on each event: A sends m1, B receives m1, B sends m2, C receives m2, A sends m3, C receives m3. Show each node's clock value at each step.
- After all events, what is C's final local clock value? Is event "C receives m2" guaranteed to have a lower timestamp than "C receives m3"? Why or why not?
- Explain the key limitation of Lamport clocks: if event X has timestamp 10 and event Y has timestamp 15, can you conclude X happened-before Y? What can you conclude?
- Give a concrete distributed systems example where relying on Lamport timestamps alone for ordering would produce an incorrect result, and state what stronger mechanism (vector clocks or consensus) would be needed instead.
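The trace in the tasks above can be checked with a minimal sketch of the algorithm (the class and method names here are illustrative, not from any library):

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def send(self):
        # On send: increment the local clock and stamp the message with it.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # On receive: local = max(local, received) + 1.
        self.time = max(self.time, msg_time) + 1
        return self.time

A, B, C = LamportClock(), LamportClock(), LamportClock()

m1 = A.send()                        # A sends m1: A's clock = 1
print("B recv m1:", B.receive(m1))   # B = max(0, 1) + 1 = 2
m2 = B.send()                        # B sends m2: B's clock = 3
print("C recv m2:", C.receive(m2))   # C = max(0, 3) + 1 = 4
m3 = A.send()                        # A sends m3: A's clock = 2 (independent of B/C)
print("C recv m3:", C.receive(m3))   # C = max(4, 2) + 1 = 5
```

Note that m3 carries timestamp 2 even though it is received after m2 (timestamp 3): a smaller Lamport timestamp does not imply the event happened earlier.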
Vector Clock Conflict Detection: Concurrent vs Causal Events
A distributed key-value store replicates data across 3 nodes (N1, N2, N3). Each value carries a vector clock [n1, n2, n3] representing how many updates each node has seen. The store uses vector clocks to detect whether two versions of the same key are causally related (one happened-before the other) or concurrent (neither happened-before the other, indicating a conflict). Given four version vectors for key "user:42:profile", determine their relationships.
Tasks
- Given versions VA=[3,1,0], VB=[2,2,0], VC=[3,2,0], VD=[1,0,2]. For each pair (VA vs VB, VA vs VC, VB vs VC, VA vs VD), determine: is one a causal ancestor of the other, or are they concurrent? Show your comparison logic.
- Explain the happened-before rule for vector clocks: VC_A < VC_B if and only if every component of VC_A is ≤ the corresponding component of VC_B and at least one component is strictly less. Apply this to each pair above.
- For each concurrent pair you identified, explain what the distributed store must do: store both versions (siblings), present them to the application for conflict resolution, or apply a deterministic merge (LWW, add-wins)?
- Explain why Amazon's Dynamo (as described in the Dynamo paper) stores all concurrent versions as siblings rather than resolving automatically, and what the application must do to reconcile them during a read.
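The componentwise happened-before rule can be sketched as a small comparison helper (the function name `compare` and its return strings are assumptions for illustration):

```python
def compare(a, b):
    """Classify two vector clocks: 'a<b', 'b<a', 'equal', or 'concurrent'."""
    a_le_b = all(x <= y for x, y in zip(a, b))
    b_le_a = all(y <= x for x, y in zip(a, b))
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "a<b"        # a is a causal ancestor of b
    if b_le_a:
        return "b<a"        # b is a causal ancestor of a
    return "concurrent"     # neither dominates: conflicting siblings

VA, VB, VC, VD = [3, 1, 0], [2, 2, 0], [3, 2, 0], [1, 0, 2]
for name, (x, y) in {"VA vs VB": (VA, VB), "VA vs VC": (VA, VC),
                     "VB vs VC": (VB, VC), "VA vs VD": (VA, VD)}.items():
    print(name, "->", compare(x, y))
```

For example, VA=[3,1,0] vs VB=[2,2,0] is concurrent: VA wins on the first component but loses on the second, so neither dominates.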
Google Spanner's TrueTime: Externally Consistent Transactions
Google Spanner achieves externally consistent (strictly serializable) distributed transactions across datacenters worldwide without a single coordinator. It accomplishes this using TrueTime, an API that returns a time interval [earliest, latest] rather than a single point in time, with a guaranteed uncertainty bound (epsilon) of typically 1-7 ms. This uncertainty bound is the key to Spanner's commit-wait protocol. Understand how TrueTime makes global ordering possible.
Tasks
- Explain what "external consistency" means in terms of transaction ordering: if transaction T1 commits before T2 begins (in real time), T2's commit timestamp must be greater than T1's. Why is this stronger than serializability alone?
- Describe the TrueTime commit-wait protocol: when a Spanner transaction is ready to commit, it receives a timestamp s = TT.now().latest. It then waits until TT.now().earliest > s before releasing the commit. Why does this wait guarantee external consistency?
- Calculate the performance cost: if epsilon (max TrueTime uncertainty) is 7ms, how much latency does commit-wait add to every write transaction? How does Google reduce epsilon to minimize this penalty?
- Explain the role of GPS receivers and atomic clocks in Spanner's data centers โ why are these used instead of NTP synchronization, and what happens to Spanner's correctness if the atomic clock fails in one datacenter?
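A toy simulation of commit-wait, assuming a fixed epsilon in place of Spanner's real GPS/atomic-clock time masters (the names `tt_now` and `commit_wait` are illustrative, not Spanner's API):

```python
import time

EPSILON = 0.007  # assumed maximum clock uncertainty: 7 ms

def tt_now():
    """TrueTime-style interval [earliest, latest] around the local clock.
    A real deployment derives the bound from time-master infrastructure;
    here it is simply a fixed constant."""
    t = time.monotonic()
    return (t - EPSILON, t + EPSILON)

def commit_wait():
    """Pick s = TT.now().latest, then spin until TT.now().earliest > s.
    Once the wait ends, s is guaranteed to be in the past on every node
    whose clock error is within epsilon."""
    s = tt_now()[1]
    while tt_now()[0] <= s:
        time.sleep(0.0005)
    return s

start = time.monotonic()
s = commit_wait()
waited = time.monotonic() - start
print(f"commit timestamp s = {s:.4f}, waited {waited * 1000:.1f} ms")
```

The observed wait is roughly 2 * epsilon: s sits epsilon ahead of the local clock, and the loop exits only once the clock's earliest bound (another epsilon behind) has passed s.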
Distributed Changelog Preserving Causal Order Across 5 Datacenters
A global social platform maintains a distributed activity changelog (user posts, likes, comments) across 5 datacenters (US, EU, APAC-East, APAC-West, SA). Users can write to their nearest datacenter. The business requirement is that causally related events must appear in causal order in the changelog โ if a user posts a comment replying to a post, the original post must appear before the reply in every datacenter's changelog, even if the reply was written in a different datacenter 50ms after the post. Total write volume: 2 million events/minute.
Tasks
- Explain why physical timestamps alone fail this requirement: give a concrete scenario where a reply has a lower physical timestamp than the post it replies to (hint: clock skew between datacenters).
- Design a causal metadata propagation scheme: when a user reads post P in US-DC (tagged with vector clock VP) and writes reply R in EU-DC, how does R carry VP as its causal dependency, and how does EU-DC use this to order R after P?
- Design the cross-datacenter replication protocol: when US-DC emits events to EU-DC via Kafka, how does EU-DC's changelog consumer hold back events until all their causal dependencies have been received? Describe the "causal barrier" mechanism.
- Calculate the storage overhead of attaching vector clocks (5 datacenters x 8 bytes each = 40 bytes per event) to 2 million events/minute, and propose a compression scheme (e.g., delta encoding, causal compression) to reduce this overhead for high-volume events.
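One way to sketch the causal-barrier idea from the third task, with a hypothetical per-datacenter event counter standing in for full vector clocks (`CausalBuffer`, its fields, and the event dict layout are all assumptions for illustration):

```python
from collections import deque

def dominates(vc, dep):
    """True if the local state vc has already seen everything dep requires."""
    return all(x >= y for x, y in zip(vc, dep))

class CausalBuffer:
    """Changelog consumer for one datacenter: deliver an event only once
    its causal-dependency vector is covered by what has been applied."""
    def __init__(self, n_dcs):
        self.applied = [0] * n_dcs   # events applied so far, per origin DC
        self.pending = deque()       # events held back at the causal barrier
        self.log = []                # delivered changelog, in causal order

    def on_receive(self, event):
        self.pending.append(event)
        self._drain()

    def _drain(self):
        # Repeatedly deliver any pending event whose dependencies are met;
        # each delivery may unblock others.
        progressed = True
        while progressed:
            progressed = False
            for ev in list(self.pending):
                if dominates(self.applied, ev["deps"]):
                    self.applied[ev["origin"]] += 1
                    self.log.append(ev["id"])
                    self.pending.remove(ev)
                    progressed = True

buf = CausalBuffer(5)
# Reply R (written in EU-DC, index 1) arrives before the post P (US-DC,
# index 0) it depends on; R is held back until P has been applied.
buf.on_receive({"id": "R", "origin": 1, "deps": [1, 0, 0, 0, 0]})
buf.on_receive({"id": "P", "origin": 0, "deps": [0, 0, 0, 0, 0]})
print(buf.log)
```

For the storage task: 40 bytes per event at 2 million events/minute is about 80 MB/minute of vector-clock metadata before any compression, which is what motivates delta or causal compression for high-volume events.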