Rate Limiting & Traffic Shaping
Token bucket, leaky bucket, sliding window, distributed rate limiting, and abuse prevention.
4 Exercises
12 Concept Checks
~95 min total
System Design
Rate Limiting Algorithm Selection
A public API serves 3 client types: (A) Background jobs that burst 500 requests at startup then slow to 10 req/sec, (B) Interactive users expecting smooth consistent responses, (C) Potential abusers scraping data as fast as possible. Each type needs a different algorithm.
Algorithm Comparison
Token Bucket: accumulates tokens → allows bursts
Leaky Bucket: constant output rate, drops bursts
Sliding Window: precise count, no boundary attacks
Concept Check — 3 questions
Q1. For background jobs that burst 500 requests at startup, then slow to 10 req/sec, which algorithm is best?
A. Leaky bucket: smooths all output to a constant rate regardless of input bursts
B. Token bucket: accumulates tokens over time, allows bursting through the reserve
C. Fixed window counter: simple to implement for bursty workloads
D. No rate limiting: background jobs should never be limited
Q2. The leaky bucket algorithm guarantees what output characteristic?
A. Constant output rate regardless of input burst size
B. Allows all bursts to pass through with no rate restriction
C. Unlimited throughput with no waiting
D. Zero latency on all requests including burst traffic
Q3. Sliding window counter is more accurate than fixed window counter because?
A. It processes requests faster due to optimized data structures
B. It uses less memory than fixed windows at high request rates
C. It prevents boundary attacks: there is no synchronized reset at window edges that allows 2× the limit
D. It is simpler to implement and debug
Boundary attack on fixed windows: send 100 requests at 11:59:59 (end of window 1), then 100 at 12:00:01 (start of window 2) = 200 requests in 2 seconds despite a 100 req/min limit. Token bucket Redis implementation: store {tokens, last_refill_timestamp} in Redis, atomically refill and decrement on each request. Tiers: store tier in user metadata → lookup limit → apply algorithm.
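The token-bucket state described above ({tokens, last_refill_timestamp}) can be sketched in plain Python. This is a minimal in-memory stand-in: the dict replaces the Redis hash, the injectable clock exists only to make the behavior testable, and the capacity/refill numbers are illustrative. A production version would do the refill-and-decrement atomically in Redis (e.g. via a Lua script).

```python
import time

class TokenBucket:
    """Per-user token bucket: stores (tokens, last_refill), refills lazily."""
    def __init__(self, capacity, refill_rate, now=time.monotonic):
        self.capacity = capacity        # max burst size
        self.refill_rate = refill_rate  # tokens added per second
        self.now = now                  # injectable clock for testing
        self.buckets = {}               # user_id -> (tokens, last_refill)

    def allow(self, user_id):
        ts = self.now()
        tokens, last = self.buckets.get(user_id, (self.capacity, ts))
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (ts - last) * self.refill_rate)
        if tokens >= 1:
            self.buckets[user_id] = (tokens - 1, ts)
            return True
        self.buckets[user_id] = (tokens, ts)
        return False
```

With capacity 500 and a refill rate of 10/sec, this matches client type (A) from the scenario: the startup burst drains the reserve, then sustained traffic settles at the refill rate.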
Open Design Challenge
Explain the boundary attack against fixed windows: if the limit is 100 req/min, how can an attacker get 200 requests in 2 seconds? Show the timing.
Implement a token bucket rate limiter using Redis. What data structure do you store per user? Show the pseudocode with INCR and EXPIRE commands.
Design rate limit tiers: free (100 req/min), pro (1,000 req/min), enterprise (unlimited). How do you store the tier and check limits efficiently on every request?
Distributed Rate Limiting
A company has 10 API servers. Rate limit per user: 1,000 req/min. Each server maintains its own local counter. User A's requests are spread by the load balancer across all 10 servers, so each server sees only a tenth of the traffic — well under its local limit — and allows everything. The user can send up to 10,000 req/min before any single counter trips: the rate limit is bypassed entirely.
Local vs Centralized Rate Limiting
❌ Local counters (broken)
User → LB → each of 10 servers sees only 1/10 of the traffic → all allow → user can reach 10× the limit
✅ Central Redis counter
All 10 servers INCR same Redis key → shared count → 429 when count exceeds limit
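The centralized counter in the diagram can be sketched as a fixed-window check. A plain dict stands in for Redis here; in a real deployment every server would issue `INCR ratelimit:{user}:{window}` against the same Redis instance, with `EXPIRE` set on the first hit of each window. The limit and window values are illustrative.

```python
import time

class CentralCounter:
    """Fixed-window counter shared by all servers (dict stands in for Redis)."""
    def __init__(self, limit, window_sec, now=time.time):
        self.limit = limit
        self.window = window_sec
        self.now = now
        self.counts = {}  # (user_id, window_index) -> count

    def check(self, user_id):
        # All servers hit the same key, so the count is global per window.
        key = (user_id, int(self.now() // self.window))
        self.counts[key] = self.counts.get(key, 0) + 1  # INCR equivalent
        if self.counts[key] > self.limit:
            return 429  # Too Many Requests
        return 200
```

Because every server increments the same key, it no longer matters how the load balancer spreads the traffic: request 1,001 in a window gets a 429 regardless of which server receives it.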
Concept Check — 3 questions
Q1. The fix for distributed rate limiting across multiple API servers is?
A. Increase the per-server limit to 1,000 so the total across servers equals 10,000
B. Centralized Redis counter: all servers check and increment the same Redis key
C. Use sticky sessions so one user always goes to the same server
D. Route by IP hash so each user is rate limited by one server
Q2. Redis INCR for rate limiting returns the new count. If count exceeds the limit, what HTTP status code do you return?
A. 200 OK: silently drop the request without telling the client
B. 403 Forbidden: the user is unauthorized to make this request
C. 429 Too Many Requests with a Retry-After header indicating when to retry
D. 503 Service Unavailable: the service is down due to overload
Q3. Redis rate limiting adds ~1ms per request. To reduce this latency overhead, what technique helps?
A. Disable rate limiting entirely: 1 ms of overhead is too high a cost
B. Redis pipeline: batch the INCR + EXPIRE commands in one round trip
C. Switch from Redis to PostgreSQL for rate limit storage
D. Cache the rate limit decision for 60 seconds to avoid Redis calls
Redis sliding window: ZADD ratelimit:{user_id} {timestamp} {request_id}, ZREMRANGEBYSCORE (remove old entries outside the window), ZCARD (count current requests). Fail open vs closed: financial APIs fail closed (block all on Redis outage); social/content APIs usually fail open (allow all). Priority order: API key > user ID > IP address — API key is the strongest identity signal.
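The ZADD / ZREMRANGEBYSCORE / ZCARD flow above can be mirrored in plain Python with a sorted list per user standing in for the Redis sorted set. This is a sketch of the algorithm, not of the Redis client API; in Redis the three commands would be pipelined (or wrapped in a Lua script) to keep the check to one round trip.

```python
import bisect, time

class SlidingWindow:
    """Sliding-window log: one sorted timestamp list per user
    (stand-in for a Redis sorted set keyed ratelimit:{user_id})."""
    def __init__(self, limit, window_sec, now=time.time):
        self.limit = limit
        self.window = window_sec
        self.now = now
        self.log = {}  # user_id -> sorted list of request timestamps

    def allow(self, user_id):
        ts = self.now()
        entries = self.log.setdefault(user_id, [])
        # ZREMRANGEBYSCORE: drop timestamps that fell out of the window
        cutoff = bisect.bisect_left(entries, ts - self.window)
        del entries[:cutoff]
        if len(entries) >= self.limit:  # ZCARD: count requests in window
            return False
        entries.append(ts)              # ZADD: record this request
        return True
```

Because the window slides with the current timestamp, there is no boundary to attack: the count over any 60-second span never exceeds the limit.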
Open Design Challenge
Write the Redis command sequence for a sliding window rate limiter: ZADD, ZREMRANGEBYSCORE, ZCARD, and EXPIRE. Show the flow for checking if a request is allowed.
What happens if the Redis rate limiter goes down? Should you fail open (allow all) or fail closed (block all)? Justify based on API type.
Design rate limiting by API key, user ID, and IP. Which takes priority and why? How do you handle the case where only an IP is known (unauthenticated request)?
Rate Limit Headers and Client Communication
Your API returns rate limit info in headers but clients keep hitting 429s. SDK developers implemented exponential backoff incorrectly — they retry only the failed request while continuing to send other requests at full rate, wasting the backoff window entirely.
Self-Throttling with Rate Limit Headers
API response → RateLimit-Remaining: 20 → client reads headers → self-throttles proactively before hitting 429
Concept Check — 3 questions
Q1. Standard rate limit response headers from the RateLimit header standard include?
A. X-API-Key and X-API-Secret
B. RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset
C. X-Request-ID and Content-Type
D. Authorization and Cache-Control
Q2. Exponential backoff with jitter means?
A. Retry immediately after failure with no waiting period
B. Wait exactly 2^n seconds between retries with no randomness
C. Wait 2^n seconds plus random jitter to prevent synchronized retries from multiple clients
D. Stop retrying entirely after the first 429 response
Q3. A client receives a 429 response with "Retry-After: 30". The correct client behavior is?
A. Retry only this specific failed request after 30s while continuing to send other requests at full rate
B. Pause ALL requests for 30 seconds, then resume sending
C. Switch to a different API key to bypass the rate limit
D. Ignore the header and retry the request immediately
Proactive throttling: when RateLimit-Remaining drops below 20% of RateLimit-Limit, insert a proportional delay between requests — slow down before hitting the wall. Adaptive rate: track 429 rate; if more than 1% of requests return 429 in the last 60 seconds, halve the sending rate. Test in CI/CD: use a mock server configured to return 429 on every nth request to validate backoff behavior without hitting real limits.
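The proactive-throttling rule above (slow down below 20% remaining) can be sketched as a small delay calculator driven by the RateLimit headers. The 20% threshold comes from the text; the pacing formula — spreading the leftover budget evenly over the time until the window resets — is one reasonable choice, not the only one.

```python
def throttle_delay(limit, remaining, reset_after_sec):
    """Seconds to wait before the next request, from RateLimit-* headers.

    limit           -- RateLimit-Limit (requests per window)
    remaining       -- RateLimit-Remaining (budget left in this window)
    reset_after_sec -- seconds until RateLimit-Reset
    """
    if remaining <= 0:
        return reset_after_sec              # budget exhausted: wait for reset
    if remaining < 0.2 * limit:
        # Below 20%: pace the remaining requests over the remaining time.
        return reset_after_sec / remaining
    return 0.0                              # plenty of budget: full rate
```

A client SDK would call this after each response and sleep for the returned delay, so it slows down smoothly instead of slamming into a 429 at the end of the window.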
Open Design Challenge
Design a client-side rate limiter SDK that reads RateLimit-Remaining and proactively slows down when below 20% remaining. How does it calculate the delay?
Implement adaptive backoff: if server returns 429, reduce request rate by 50%; if no 429 for 60s, increase by 10%. Show the state machine.
How do you test rate limiting behavior in CI/CD without hitting a real rate limiter? Design the mock server and test scenarios.
Designing Against DDoS and Scraping Abuse
A competitor scrapes your product catalog using 10,000 rotating IPs. Standard IP rate limiting fails — each IP makes only 1-2 requests. The scraper generates 500,000 requests/hour vs 50,000 from legitimate users. Your origin servers are overwhelmed and latency spikes for real customers.
IP Rotation Bypasses Simple Rate Limiting
10K rotating IPs → each sends 1-2 req, appears legitimate per-IP → 500K req/hour total overwhelms origin → need behavioral fingerprinting
Concept Check — 3 questions
Q1. IP rotation defeats per-IP rate limiting. A better signal for detecting scraping is?
A. Block all IPs from specific countries associated with scraping
B. Behavioral fingerprinting: request timing patterns, header analysis, JS execution, sequential IDs
C. CAPTCHA on every single API request to filter bots
D. Reduce the global API rate limit to 1 req/sec to stop all scrapers
Q2. A request arrives with Accept: text/html but User-Agent: Python-Requests/2.28. This request is likely?
A. A legitimate mobile app: mobile apps sometimes send non-standard headers
B. A broken browser: it must have a corrupted User-Agent string
C. A scraper: legitimate browsers never use Python-Requests as the User-Agent
D. A search engine crawler: these use non-browser User-Agent strings
Q3. Bot mitigation at the CDN/WAF layer (before hitting origin) is better than at the application layer because?
A. CDN infrastructure is cheaper to operate than application servers
B. Requests blocked at the CDN/WAF never reach origin, saving all downstream compute, memory, and DB resources
C. WAF rules are faster to write than application middleware
D. The application layer cannot perform rate limiting by IP address
Shadow ban: detect scraper, put in "shadow" tier — return HTTP 200 with empty arrays or fake/delayed data instead of 429. The scraper thinks it's working; you've reduced data exfiltration without tipping them off. JS challenge: Cloudflare and similar services inject a JS puzzle — real browsers solve it in milliseconds; headless browsers fail or take too long. Honeypot: add a product ID (e.g., /api/products/000000) that never appears in the UI but scrapers enumerate sequentially — any IP hitting it gets flagged.
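Two of the signals above — the honeypot ID and the Accept/User-Agent mismatch — can be combined into a toy scoring function feeding a shadow-ban response. Everything here is illustrative: the score weights, the threshold, the `000000` honeypot ID, and the function names are assumptions for the sketch, not a real detection system.

```python
def scraper_score(path, headers, honeypot_ids=frozenset({"000000"})):
    """Heuristic bot score from honeypot hits and header mismatches."""
    score = 0
    # Honeypot: an ID never linked from the UI, found only by enumeration.
    if any(path == f"/api/products/{h}" for h in honeypot_ids):
        score += 10
    ua = headers.get("User-Agent", "")
    accept = headers.get("Accept", "")
    # Mismatch: browser-style Accept header with a script User-Agent.
    if "text/html" in accept and ("python" in ua.lower() or not ua):
        score += 5
    return score

def respond(path, headers, data):
    """Shadow tier: flagged clients get 200 with an empty payload."""
    if scraper_score(path, headers) >= 5:
        return 200, []      # looks successful, leaks nothing
    return 200, data
```

Returning 200 with an empty body (rather than 429) is the point of the shadow ban: the scraper sees no error signal to adapt to, while real clients are untouched.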
Open Design Challenge
Design a "shadow ban" system for scrapers: allow them to continue requesting but return fake or empty data. How do you detect them and what do you return?
How does Cloudflare's "Browser Integrity Check" detect headless browsers? What signals distinguish real browsers from automated tools?
Design a honeypot endpoint that real users never hit but scrapers enumerate. What URL pattern and what action do you take when a client hits it?