Message Queues & Apache Kafka
Queue vs Kafka, partitions, consumer groups, offset management, and delivery guarantees.
4 Exercises
12 Concept Checks
~95 min total
System Design
Message Queue vs Kafka
An e-commerce platform uses RabbitMQ for order processing. New requirement: the analytics team also needs every order event to update dashboards. Problem: RabbitMQ deletes messages after the order processor acknowledges them — the analytics service can't replay what's already gone. Kafka's immutable log solves this by letting multiple consumer groups read independently.
Queue vs Kafka Architecture
RabbitMQ — message deleted after ack
Producer
↓
Queue (message)
↓ consumed + deleted
Order Processor only
Kafka — immutable log, multiple consumers
Producer
↓
Topic (immutable log)
↓ independent offsets
Group A: Order Processor
Group B: Analytics
Concept Check — 3 questions
Q1. Kafka topic vs RabbitMQ queue: the key architectural difference?
A. Kafka is inherently faster than RabbitMQ for all workloads
B. Kafka persists messages as an immutable log — multiple consumer groups read independently at their own offsets; RabbitMQ deletes messages after acknowledgement
C. RabbitMQ supports more network protocols than Kafka
D. Kafka requires a SQL schema while RabbitMQ accepts any message format
Q2. In Kafka, adding a second consumer group for analytics?
A. Has zero impact on the existing order processing consumer — both groups maintain independent offsets on the same topic
B. Slows down the existing order processing consumer
C. Requires duplicating the topic to create a separate copy
D. Conflicts with the existing consumer's offset commits
Q3. When is RabbitMQ the better choice over Kafka?
A. You need high throughput at millions of events per second
B. You need event replay and long-term message retention
C. You need complex routing logic, per-message TTL, or simple task queue semantics with ack/nack-based routing
D. You need long-term data retention for compliance
Kafka's log is immutable — messages are appended and retained for a configurable period (days, weeks, forever). Each consumer group tracks its own offset independently. Adding Group B for analytics has zero effect on Group A's offset or throughput. RabbitMQ excels at: topic exchanges with routing keys, per-message TTL (messages expire if unprocessed), dead-letter queues with nack routing, and simple worker queue patterns where each message has exactly one consumer.
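The log-vs-queue difference above can be sketched with a minimal in-memory model. All names here (Topic, poll) are illustrative, not a real Kafka client API — real Kafka stores each group's offset in the internal __consumer_offsets topic.

```python
# Minimal sketch of an append-only log with per-group offsets.
# Illustrative model only — not the real Kafka API.

class Topic:
    def __init__(self):
        self.log = []            # immutable, append-only record list
        self.offsets = {}        # group id -> next offset to read

    def produce(self, record):
        self.log.append(record)  # records are never deleted on consume

    def poll(self, group):
        """Return the next record for this group and advance its offset."""
        pos = self.offsets.get(group, 0)
        if pos >= len(self.log):
            return None          # nothing new for this group
        self.offsets[group] = pos + 1
        return self.log[pos]

orders = Topic()
for event in ["order-1", "order-2", "order-3"]:
    orders.produce(event)

# Group A (order processor) reads everything first...
while orders.poll("order-processor") is not None:
    pass

# ...yet Group B (analytics) still sees the full history, because
# consuming never deletes records from the log.
replayed = []
while (rec := orders.poll("analytics")) is not None:
    replayed.append(rec)
print(replayed)  # ['order-1', 'order-2', 'order-3']
```

A RabbitMQ-style queue would have deleted each record on the first group's acknowledgement, leaving nothing for the second group to replay.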
Open Design Challenge
Design the Kafka topic structure for an e-commerce platform. What topics would you create? How many partitions for each? Justify based on expected throughput.
The analytics team wants to replay the last 7 days of order events to rebuild their dashboard after a bug. How does Kafka enable this? What is the consumer group offset reset command?
Design a scenario where RabbitMQ is the better choice over Kafka. What routing rules would you configure?
Kafka Partitions and Ordering
A payment processing system sends payment events to Kafka. Business requirement: all events for the same payment_id must be processed in order (create → authorize → capture → refund). With 6 partitions and round-robin assignment, events for the same payment go to different partitions — ordering is broken and "refund before capture" is possible.
Round-Robin vs Key-Based Partitioning
❌ ROUND-ROBIN (ordering broken)
payment#123 create → P0
payment#123 authorize → P3
payment#123 capture → P1
✅ KEYED (same key = same partition)
payment#123 create → P2
payment#123 authorize → P2
payment#123 capture → P2 ✓ ordered
Concept Check — 3 questions
Q1. To guarantee ordering for a specific payment_id, Kafka messages should?
A. Use round-robin partitioning for even load distribution
B. Use payment_id as the message key — Kafka routes all messages with the same key to the same partition, guaranteeing order
C. Use a single partition for the entire topic
D. Use event timestamps to sort at the consumer
Q2. Kafka's ordering guarantee is?
A. Global ordering across all partitions in a topic
B. Ordering within the same consumer instance only
C. Within a partition only — messages across different partitions have no guaranteed ordering
D. Only when using Kafka transactions
Q3. Increasing Kafka partition count from 6 to 12 allows?
A. Storing larger individual messages in each partition
B. More parallelism — a consumer group can have up to 12 consumers processing in parallel, one per partition
C. Cross-partition ordering guarantees
D. Better compression ratios for messages
Kafka's partitioning formula:
partition = hash(key) % numPartitions. The same key always maps to the same partition (assuming the partition count doesn't change). Kafka guarantees ordering within a partition but NOT across partitions. The maximum parallelism of a consumer group equals the number of partitions — with more consumers than partitions, the extra consumers sit idle. Increasing the partition count is possible but irreversible, and it remaps existing keys, breaking key-based ordering for them.
Open Design Challenge
You have 6 partitions and 10 million unique payment IDs. How does Kafka distribute these 10M keys across 6 partitions? Is the distribution perfectly even?
A customer has 1000× more transactions than average users — their payment_id maps to P3, creating a hot partition. How do you fix this without breaking ordering?
Design the consumer-side logic to process payment events in order. What state machine does the consumer maintain?
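The partitioning formula above can be sketched in a few lines. Kafka's default partitioner actually hashes the key bytes with murmur2; zlib.crc32 stands in here so the example stays self-contained and deterministic.

```python
# Sketch of key-based partition assignment (hash(key) % numPartitions).
# zlib.crc32 stands in for Kafka's murmur2 hash — illustrative only.
import zlib
from collections import Counter

NUM_PARTITIONS = 6

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    return zlib.crc32(key.encode()) % num_partitions

# Every event for payment#123 lands on the same partition,
# so per-key ordering holds.
p = partition_for("payment#123")
assert all(partition_for("payment#123") == p for _ in range(100))

# Across many keys the load spreads out — roughly, not perfectly, evenly.
counts = Counter(partition_for(f"payment#{i}") for i in range(10_000))
print(dict(counts))
```

Note that changing num_partitions changes the result of the modulo for existing keys, which is why increasing the partition count breaks key-to-partition stability.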
Consumer Groups and Rebalancing
6 Kafka partitions, 4 consumers in a group — each consumer handles ~1.5 partitions. Consumer #3 crashes mid-processing. Kafka triggers a rebalance: all consumers pause for ~2 seconds while partitions are reassigned. During this window, no messages are processed — a 2-second processing gap every time any consumer fails or is deployed.
Consumer Group Rebalance
Consumer 1 (P0, P1)
Consumer 2 (P2, P3)
Consumer 3 (P4, P5)
Consumer 4 (idle)
→ C3 crashes →
ALL PAUSE
rebalance triggered
→
Consumer 1 (P0, P1)
Consumer 2 (P2, P3)
Consumer 4 (P4, P5)
Concept Check — 3 questions
Q1. During consumer group rebalance, what happens to message processing?
A. Processing continues at half the rate on the remaining consumers
B. All consumers in the group pause completely — no messages are processed during the rebalance period
C. Only the failed consumer's partitions pause; others continue
D. Messages are permanently lost during the rebalance
Q2. Cooperative (incremental) rebalancing vs eager rebalancing: what is the cooperative advantage?
A. Cooperative rebalancing completes faster than eager rebalancing
B. Cooperative rebalancing uses less memory per consumer
C. Only the partitions that need to move are revoked — healthy consumers keep processing their unchanged partitions during the reassignment
D. There is no practical difference between the two strategies
Q3. A consumer processes a message but crashes before committing the offset. What happens?
A. The message is permanently lost — Kafka only delivers it once
B. The message is redelivered to another consumer after the rebalance — Kafka considers a message processed only when its offset is committed
C. The message is moved to a dead-letter queue automatically
D. Kafka automatically commits the offset after a timeout
Eager rebalancing (the default before Kafka 2.4): ALL consumers stop-the-world, revoke all partitions, then redistribute. Cooperative rebalancing (Kafka 2.4+, incremental): only partitions that are moving are revoked — consumers keeping their existing partitions continue processing. To enable:
partition.assignment.strategy=CooperativeStickyAssignor. Offset commits: Kafka tracks consumer progress via committed offsets in an internal topic (__consumer_offsets). An uncommitted offset means the message will be redelivered — this is at-least-once delivery semantics.
Open Design Challenge
Design a consumer that processes payment events with at-least-once delivery. What must you do before committing the offset? Where do you place the commit call?
A rolling deployment restarts all 4 consumers one by one. With eager rebalancing, this causes 4 rebalances × 2 seconds = 8 seconds of downtime. How does cooperative rebalancing improve this?
Design a dead-letter queue (DLQ) strategy for Kafka. When a message fails processing 3 times, where does it go and how do you alert on it?
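The commit-after-processing rule behind at-least-once delivery can be simulated without a broker. This is a sketch with illustrative names: a list stands in for the external side effect, and an early return stands in for a crash between processing and committing.

```python
# Sketch of at-least-once consumption: commit the offset only AFTER the
# side effect succeeds. A crash between processing and committing means
# the record is redelivered — so duplicates are possible and processing
# must be idempotent. All names are illustrative, not a Kafka client API.

processed = []          # stands in for the external side effect (DB write)
committed_offset = -1   # last offset whose processing is durably recorded

def handle(record):
    processed.append(record)        # side effect happens first...

def consume(log, crash_before_commit_at=None):
    """Replay the log from just past the last committed offset."""
    global committed_offset
    for offset in range(committed_offset + 1, len(log)):
        handle(log[offset])
        if offset == crash_before_commit_at:
            return                  # crash: this offset is NOT committed
        committed_offset = offset   # ...commit only afterwards

log = ["evt-0", "evt-1", "evt-2"]
consume(log, crash_before_commit_at=1)  # crash after processing evt-1
consume(log)                            # restart: evt-1 is redelivered
print(processed)  # ['evt-0', 'evt-1', 'evt-1', 'evt-2'] — duplicate evt-1
```

Committing before processing would flip the failure mode to at-most-once: the crash would lose evt-1 instead of duplicating it.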
Exactly-Once Semantics
A payment processor reads from Kafka and writes to a database. Without exactly-once: message is processed, DB write succeeds, but the Kafka offset commit fails → message is redelivered → payment is processed twice (double charge). With idempotency keys: the DB write is safe on replay. With Kafka transactions: atomically commit the offset and the DB write together.
Exactly-Once with Idempotency
Read from Kafka
partition + offset
→
Process payment
→
DB write
idempotency_key = offset
→
Commit offset
atomically
→
Crash & replay safe
dup key = skip
Concept Check — 3 questions
Q1. Kafka exactly-once semantics (EOS) requires?
A. Synchronous replication to all Kafka brokers before acknowledging
B. Idempotent producers (enable.idempotence=true) combined with the transactional API to atomically commit offset and external write
C. Using a single partition only for the entire topic
D. Disabling consumer groups and using direct partition assignment
Q2. An idempotency key for payment processing should be?
A. A random UUID generated fresh for each processing attempt
B. The user's account ID combined with timestamp
C. A stable unique identifier — combining Kafka topic+partition+offset guarantees uniqueness and stability across retries
D. The wall-clock timestamp of when processing started
Q3. Kafka transactions (begin/commit/abort) guarantee atomicity across?
A. Kafka writes and PostgreSQL writes simultaneously in a distributed transaction
B. Multiple Kafka topic writes AND consumer offset commits atomically — either all succeed or all are rolled back
C. Writes within a single partition only
D. Synchronizing state between two separate Kafka clusters
Kafka EOS = idempotent producer + transactions. Idempotent producer: assigns a sequence number to each message — brokers deduplicate retries. Transactional producer:
producer.beginTransaction() → produce to multiple topics → sendOffsetsToTransaction() → commitTransaction(). All operations succeed atomically or are rolled back. For external DB writes (PostgreSQL), Kafka transactions don't help — use an idempotency key in the DB with a UNIQUE constraint so duplicate inserts fail gracefully (ON CONFLICT DO NOTHING).
Open Design Challenge
Design the payment processor code flow: read from Kafka → process → write to DB with idempotency key → commit offset. What happens at each step if the process crashes?
The idempotency table in PostgreSQL grows unbounded. How do you expire old idempotency keys? What is the safe retention window?
Design an outbox pattern as an alternative to Kafka transactions: write to DB and outbox table in one local transaction, then publish from outbox to Kafka. How does this guarantee exactly-once?
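The idempotency-key dedup described above can be sketched as follows. A Python set stands in for a PostgreSQL table with a UNIQUE constraint on idempotency_key (INSERT ... ON CONFLICT DO NOTHING); the function name and return values are illustrative.

```python
# Sketch of idempotent processing keyed on topic+partition+offset.
# A set stands in for a DB table with a UNIQUE idempotency_key column,
# so a duplicate "insert" becomes a no-op (the ON CONFLICT path).

seen_keys = set()   # the unique idempotency_key column
charges = []        # the actual payment side effects

def process_payment(topic, partition, offset, amount):
    # topic+partition+offset is stable across redeliveries of one record,
    # unlike a fresh UUID generated per processing attempt.
    key = f"{topic}/{partition}/{offset}"
    if key in seen_keys:        # duplicate insert would hit the constraint
        return "skipped"        # ON CONFLICT DO NOTHING path
    seen_keys.add(key)
    charges.append(amount)
    return "charged"

# First delivery charges; a redelivery of the same record is a no-op.
assert process_payment("payments", 2, 41, 100) == "charged"
assert process_payment("payments", 2, 41, 100) == "skipped"
print(charges)  # [100] — charged exactly once despite redelivery
```

In a real implementation the key insert and the payment write must share one DB transaction, so the dedup record and the side effect commit or roll back together.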