Day 3: CDN Design & Edge Computing | System Design Mastery

01 — The Problem CDNs Solve

Speed of light is a hard constraint

A request from Singapore to a US-East origin takes ~180ms round trip — just from physics. A CDN moves the content to a POP (Point of Presence) near the user. That same Singapore user now gets content in ~8ms from a Singapore POP. The CDN doesn't make the network faster — it eliminates the distance.
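The latency floor can be estimated with a few lines of arithmetic. This is a rough sketch: the fiber speed (about two thirds of the speed of light in vacuum) is a standard approximation, and both distances are illustrative assumptions rather than measurements. Real routes are longer than the great-circle path and add queuing and processing delay, which is why observed round trips exceed this bound.

```javascript
// Light in optical fiber covers roughly 200,000 km per second, i.e. 200 km per ms.
const FIBER_SPEED_KM_PER_MS = 200;

// Lower bound on round-trip time over a fiber path of the given length.
function minRoundTripMs(pathKm) {
  return (2 * pathKm) / FIBER_SPEED_KM_PER_MS;
}

// Assumed distances (illustrative, not measured):
const SINGAPORE_TO_US_EAST_KM = 15500; // approximate great-circle distance
const SINGAPORE_METRO_KM = 50;         // user to an in-city POP

console.log(minRoundTripMs(SINGAPORE_TO_US_EAST_KM)); // 155
console.log(minRoundTripMs(SINGAPORE_METRO_KM));      // 0.5
```

No protocol tuning can beat the first number. Only shortening the path does.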

📥

Origin Pull (On-Demand)

CDN fetches from origin on first request to a POP. Subsequent requests hit cache. Pro: Zero setup, works for dynamic/long-tail content. Con: First request to each POP has full origin latency. Cold POPs = slow first users. Used by: Cloudflare, Fastly, CloudFront.

📤

Origin Push (Pre-warming)

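You upload content to the CDN upfront, before any user requests it. Pro: Zero cold-start latency, because the first request at any POP is fast. Con: Must push every update explicitly. Only practical for static assets that change rarely. Used by: Akamai NetStorage. AWS S3 + CloudFront is often listed here too, but note that you push to the S3 bucket and CloudFront edges still pull from it on first request.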

🏗️

CDN Architecture in practice: two-tier hierarchy

Most enterprise CDNs use a two-tier model: Edge POPs (close to users, ~200+ globally) → Shield/Mid-Tier POPs (regional aggregation, ~20 globally) → Origin. Edge misses go to shield (not origin) — shields have large SSD caches and aggregate requests from many edge nodes, dramatically reducing origin traffic. Cloudflare calls this "Tiered Cache"; Fastly calls it "Shielding".
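A toy model makes the effect of the shield tier concrete. This is a minimal in-memory sketch, not how any particular CDN implements tiering. The names `CacheTier`, `originFetch`, and the POP labels are hypothetical.

```javascript
// One cache tier. `parent` is either another CacheTier or a function that
// fetches from the origin.
class CacheTier {
  constructor(name, parent) {
    this.name = name;
    this.parent = parent;
    this.store = new Map();
  }

  get(url) {
    if (this.store.has(url)) {
      return { body: this.store.get(url), servedBy: this.name };
    }
    // Miss: ask the next tier up, then remember the answer locally.
    const upstream = typeof this.parent === 'function'
      ? { body: this.parent(url), servedBy: 'origin' }
      : this.parent.get(url);
    this.store.set(url, upstream.body);
    return upstream;
  }
}

let originHits = 0;
const originFetch = (url) => { originHits += 1; return `content of ${url}`; };

const shield = new CacheTier('shield-apac', originFetch);
const edgeSingapore = new CacheTier('edge-sin', shield);
const edgeTokyo = new CacheTier('edge-nrt', shield);

console.log(edgeSingapore.get('/logo.png').servedBy); // origin
console.log(edgeTokyo.get('/logo.png').servedBy);     // shield-apac
console.log(edgeSingapore.get('/logo.png').servedBy); // edge-sin
console.log(originHits);                              // 1
```

Two cold edges produced a single origin request, because the second edge's miss was absorbed by the shield.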

02 — Cache-Control Headers

The HTTP caching contract

Cache-Control is the header you use to tell browsers and CDNs what to cache, for how long, and under what conditions. Getting it wrong has two common failure modes: serving stale content after a deployment, or publicly caching user-specific pages.

Visibility directives

public: Shared caches (CDN) may cache this

private: Only the browser may cache, not the CDN

no-store: Never cache, period (for sensitive data)

no-cache: Revalidate with server before serving (misleading name!)

Age directives

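max-age=N: Response is fresh for N seconds in any cache (browser or CDN)

s-maxage=N: Overrides max-age for shared caches (CDN) only

stale-while-revalidate=60: Serve stale for 60s while revalidating asynchronously

stale-if-error=3600: Serve stale for 1h if the origin is down or returning errors

immutable: Content never changes, so skip revalidation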
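How these directives combine is easier to see as a decision function. The sketch below is a simplified model of what a shared cache may do with a stored response of a given age. It ignores many details of the HTTP caching rules (heuristic freshness, `Expires`, `must-revalidate`, request directives), and the function names and returned labels are hypothetical.

```javascript
// Parse "public, max-age=0, s-maxage=300" into { public: true, 'max-age': 0, ... }.
function parseCacheControl(header) {
  const directives = {};
  for (const part of header.split(',')) {
    const [name, value] = part.trim().toLowerCase().split('=');
    if (name) directives[name] = value === undefined ? true : Number(value);
  }
  return directives;
}

// Decide what a shared cache (CDN) may do with a stored response.
function sharedCacheDecision(header, ageSeconds, originDown = false) {
  const d = parseCacheControl(header);
  if (d['no-store'] || d['private']) return 'do-not-store';
  if (d['no-cache']) return 'revalidate';

  // s-maxage takes precedence over max-age for shared caches.
  const ttl = d['s-maxage'] ?? d['max-age'] ?? 0;
  if (ageSeconds < ttl) return 'serve-fresh';

  const staleFor = ageSeconds - ttl;
  if (staleFor < (d['stale-while-revalidate'] ?? 0)) return 'serve-stale-refresh-async';
  if (originDown && staleFor < (d['stale-if-error'] ?? 0)) return 'serve-stale-on-error';
  return 'revalidate';
}

const html = 'public, max-age=0, s-maxage=300, stale-while-revalidate=60';
console.log(sharedCacheDecision(html, 120)); // serve-fresh
console.log(sharedCacheDecision(html, 330)); // serve-stale-refresh-async
console.log(sharedCacheDecision(html, 400)); // revalidate
```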

⚠️

The "no-cache" naming trap — it DOES cache

no-cache does NOT mean "don't cache." It means "cache it, but revalidate with the server before serving." If the server returns 304 Not Modified, the cached version is served — saving bandwidth but not the RTT. The directive that truly prevents caching is no-store. This trips up engineers constantly in interviews.
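The revalidation round trip can be sketched from the origin's side. This is a minimal model assuming ETag-based validation. `handleConditionalGet` and the `resource` object are hypothetical, and a real server would also handle `If-Modified-Since`, weak validators, and lists of tags.

```javascript
// Answer a GET that may carry If-None-Match from a cache holding an older copy.
function handleConditionalGet(requestHeaders, resource) {
  const headers = { ETag: resource.etag, 'Cache-Control': 'no-cache' };
  const clientTag = requestHeaders['if-none-match'];

  if (clientTag !== undefined && clientTag === resource.etag) {
    // Cached copy is still valid: no body is sent, but the round trip still happened.
    return { status: 304, headers, body: null };
  }
  return { status: 200, headers, body: resource.body };
}

const page = { etag: '"v1"', body: '<h1>Hello</h1>' };

const first = handleConditionalGet({}, page);
const second = handleConditionalGet({ 'if-none-match': first.headers.ETag }, page);

page.etag = '"v2"';
page.body = '<h1>Hello again</h1>';
const third = handleConditionalGet({ 'if-none-match': first.headers.ETag }, page);

console.log(first.status, second.status, third.status); // 200 304 200
```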

💡

The golden rule for CDN caching strategy

Static assets (JS, CSS, images with hash in URL): Cache-Control: public, max-age=31536000, immutable — cache forever, the URL changes on deploy.
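HTML pages: Cache-Control: public, max-age=0, s-maxage=300, stale-while-revalidate=60. The CDN caches for 5 min, browsers revalidate on every load, and stale-while-revalidate lets the CDN keep serving the stale copy for up to 60s while it refreshes in the background, so no user waits on the origin.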
API with auth: Cache-Control: private, no-store — never allow CDN to cache user-specific data.
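These three rules can be captured in one small helper. This is a sketch under stated assumptions: the regular expression assumes hashed file names look like `main.a3f8c1d2.js` (eight or more hex characters before the extension), the `authenticated` flag is a hypothetical signal supplied by your own routing layer, and treating everything else as HTML is a simplification.

```javascript
// Assumed naming scheme for content-hashed assets: name.<8+ hex chars>.ext
const HASHED_ASSET = /\.[0-9a-f]{8,}\.(js|css|png|jpg|svg|woff2)$/;

function cacheControlFor(path, { authenticated = false } = {}) {
  // User-specific responses must never land in a shared cache.
  if (authenticated) return 'private, no-store';
  // Hashed assets are safe to cache for a year: the URL changes on deploy.
  if (HASHED_ASSET.test(path)) return 'public, max-age=31536000, immutable';
  // Everything else is treated like HTML: short CDN TTL, no browser TTL.
  return 'public, max-age=0, s-maxage=300, stale-while-revalidate=60';
}

console.log(cacheControlFor('/static/main.a3f8c1d2.js'));
console.log(cacheControlFor('/index.html'));
console.log(cacheControlFor('/api/me/cart', { authenticated: true }));
```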

03 — Edge Workers

Running code at the edge — before the origin sees the request

Edge Workers (Cloudflare Workers, Fastly Compute, Lambda@Edge) run JavaScript/Wasm at CDN POPs, intercepting every request. Use cases: A/B testing without origin round-trips, geolocation-based redirects, authentication at the edge, request transformation.

Step 0 — Initial state

User in Singapore requests GET /products/42 from your origin in us-east-1.

Step 1 — Anycast routes to nearest POP

DNS returns the CDN anycast IP. Routing protocols direct the TCP connection to Singapore POP (SIN). Edge Worker begins executing in <1ms.

Step 2 — Edge Worker: Auth check

Worker reads Authorization header. Validates JWT signature using cached public key (no origin call). Invalid token → returns 401 immediately. Origin never touched.

Step 3 — Edge Worker: Geo-based logic

Worker reads CF-IPCountry: "SG". Adds X-Locale: en-SG header. Rewrites URL from /products/42 to /api/products/42?currency=SGD. All in edge JS.

Step 4 — Cache lookup at POP

Worker calls caches.default.match(request). Cache HIT → returns instantly from SIN POP. No origin round-trip (saves 180ms). Cache MISS → continues to origin fetch.

Step 5 — Origin fetch (on miss)

Worker calls fetch(modifiedRequest) → origin in us-east-1. Origin processes, returns response with Cache-Control: s-maxage=300. Worker stores in POP cache.

Step 6 — Response transformation

Worker modifies response before returning to user: strips internal headers, adds CORS headers, injects X-Cache: MISS telemetry. Returns to Singapore user in 195ms total.

Step 7 — Next request (cache warm)

Next Singapore user requests same URL → Cache HIT at SIN POP. Auth validated at edge. Response in <10ms. Origin untouched. This is the CDN + Edge Worker value proposition.

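// Cloudflare Worker: geolocation + auth + cache pattern
// Assumed to be defined elsewhere: verifyJWT(token, publicKey), resolving to
// true for a valid token. The CURRENCY_MAP entries are illustrative.
const CURRENCY_MAP = { SG: 'SGD', US: 'USD', GB: 'GBP' };

export default {
  // ctx must be declared here: it provides waitUntil() used below
  async fetch(request, env, ctx) {
    // 1. Auth: validate JWT at edge (no origin call)
    const auth = request.headers.get('Authorization') ?? '';
    const token = auth.startsWith('Bearer ') ? auth.slice(7) : null;
    if (!token || !(await verifyJWT(token, env.JWT_PUBLIC_KEY))) {
      return new Response('Unauthorized', { status: 401 });
    }

    // 2. Geo: modify request for user's region
    const country = request.cf?.country ?? 'US';
    const url = new URL(request.url);
    url.searchParams.set('currency', CURRENCY_MAP[country] ?? 'USD');

    // 3. Cache: check POP cache first, keyed on the rewritten URL
    const cacheKey = new Request(url.toString(), { method: 'GET' });
    const cache = caches.default;
    let response = await cache.match(cacheKey);
    // Record the outcome now; `response` is always set after the fetch below
    const cacheStatus = response ? 'HIT' : 'MISS';

    if (!response) {
      // 4. Origin fetch on cache miss, forwarding the original method and headers
      response = await fetch(new Request(url.toString(), request));
      // Store a copy in the POP cache without delaying the response;
      // the TTL comes from the Cache-Control header the origin sent
      ctx.waitUntil(cache.put(cacheKey, response.clone()));
    }

    // 5. Add telemetry header before returning (copy makes headers mutable)
    const modified = new Response(response.body, response);
    modified.headers.set('X-Cache', cacheStatus);
    return modified;
  }
};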

04 — Anycast Routing

One IP address, hundreds of locations

Anycast assigns the same IP address to multiple physical servers worldwide. BGP routing automatically directs each user to the nearest server advertising that IP. No DNS round-robin, no GeoDNS — the routing happens at the network layer.

🌐

How Anycast Works

CDN announces the same IP block (e.g., 104.16.0.0/12) from all its POPs via BGP. Each internet router picks its best route to that prefix, usually the one with the shortest AS path, which in practice tends to lead to a nearby POP. User in Tokyo → Tokyo POP. User in Frankfurt → Frankfurt POP. Same IP, different physical destination.
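Route selection for an anycast prefix can be sketched as picking the candidate with the shortest AS path. This is a deliberate simplification: real BGP evaluates local preference and other attributes before AS path length, and the path lengths below are made-up values for a hypothetical Tokyo ISP.

```javascript
// A router sees one candidate route per POP announcing the same prefix.
// Pick the route with the fewest autonomous systems on the path.
function selectRoute(routes) {
  return routes.reduce((best, route) =>
    route.asPathLength < best.asPathLength ? route : best);
}

const PREFIX = '104.16.0.0/12';

// Candidate routes as seen from a hypothetical ISP in Tokyo.
const seenFromTokyoIsp = [
  { prefix: PREFIX, pop: 'FRA', asPathLength: 4 },
  { prefix: PREFIX, pop: 'NRT', asPathLength: 1 },
  { prefix: PREFIX, pop: 'SIN', asPathLength: 2 },
];

console.log(selectRoute(seenFromTokyoIsp).pop); // NRT
```

A router in Frankfurt would run the same selection over different path lengths and reach a different POP for the same destination IP.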

⚡

Why Better Than GeoDNS

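GeoDNS resolves to a regional IP based on the resolver's location, but resolvers are often far from the user: corporate DNS may sit in another city, and a user in Nairobi using Google's 8.8.8.8 DNS might get routed to London. Anycast routes on network topology rather than resolver location, so the user's own packets determine which POP answers. That is usually the closest POP in network terms, although BGP policy occasionally picks a longer path.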

🛡️

Anycast for DDoS Absorption

Anycast naturally distributes DDoS traffic across POPs, because each attacking host is routed to the POP nearest to it. A 1 Tbps volumetric attack aimed at "the CDN's IP" gets spread across 200+ POPs, each absorbing a fraction. With network capacity in the hundreds of Tbps (Cloudflare has advertised 200+ Tbps), most DDoS attacks are absorbed without user-visible impact.

🔄

Failover via BGP Withdrawal

If a POP goes down, it stops announcing the IP prefix. BGP typically converges within seconds to a few minutes, and traffic re-routes to the next-nearest POP automatically. This is usually faster than DNS TTL-based failover (which has caching delays) and requires zero application-level logic. In-flight TCP connections to the failed POP are dropped and must reconnect.
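Withdrawal-based failover can be modeled as removing an entry from a routing table. This is a toy sketch from the point of view of a single client network. The class name, POP codes, and cost values are hypothetical, and convergence delay is not modeled.

```javascript
// Tracks which POPs currently announce an anycast prefix, with a routing cost
// per POP as seen from one client network.
class AnycastPrefix {
  constructor() {
    this.announcements = new Map(); // pop -> cost
  }
  announce(pop, cost) { this.announcements.set(pop, cost); }
  withdraw(pop) { this.announcements.delete(pop); }

  // Returns the lowest-cost POP still announcing the prefix, or null.
  route() {
    let best = null;
    for (const [pop, cost] of this.announcements) {
      if (best === null || cost < best.cost) best = { pop, cost };
    }
    return best ? best.pop : null;
  }
}

const prefix = new AnycastPrefix();
prefix.announce('SIN', 1);
prefix.announce('KUL', 2);
prefix.announce('HKG', 3);

console.log(prefix.route()); // SIN
prefix.withdraw('SIN');      // Singapore POP fails and stops announcing
console.log(prefix.route()); // KUL
```

The client keeps using the same destination IP throughout. Only the routing decision changes.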

🔬

The Vary header — CDN's cache key complexity

Adding Vary: Accept-Encoding tells the CDN to cache separate versions per encoding (gzip, brotli). Adding Vary: Accept-Language creates separate cache entries per language. Each new Vary dimension multiplies your cache storage and reduces hit rates. Rule: only use Vary for dimensions that actually produce different content. Vary: User-Agent is a cache-busting disaster — thousands of UA strings = near-zero hit rate.
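The fragmentation is easy to demonstrate by building cache keys the way a Vary-aware cache conceptually does. This is a simplified sketch: real CDNs normalize header values (for example, collapsing Accept-Encoding to a few buckets), and the `cacheKey` function and sample requests are hypothetical.

```javascript
// Build a cache key from the URL plus every request header named in Vary.
function cacheKey(url, requestHeaders, varyHeader) {
  const names = varyHeader
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter(Boolean)
    .sort();
  const parts = names.map((name) => `${name}=${requestHeaders[name] ?? ''}`);
  return [url, ...parts].join('|');
}

// Four requests for the same URL from different clients.
const requests = [
  { 'accept-encoding': 'gzip', 'user-agent': 'Chrome/126' },
  { 'accept-encoding': 'gzip', 'user-agent': 'Firefox/127' },
  { 'accept-encoding': 'br', 'user-agent': 'Safari/17' },
  { 'accept-encoding': 'br', 'user-agent': 'Chrome/125' },
];

// Number of distinct cache entries the requests create under a given Vary.
const countVariants = (vary) =>
  new Set(requests.map((headers) => cacheKey('/app.js', headers, vary))).size;

console.log(countVariants('Accept-Encoding')); // 2
console.log(countVariants('User-Agent'));      // 4
```

With Vary: User-Agent, every request in this sample is a miss, because no two clients share a key.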

05 — Cache Invalidation at CDN Scale

Purging 200 POPs in under 2 seconds

CDN purge (cache invalidation) is fundamentally different from Redis DEL. You need to propagate a delete to hundreds of geographically distributed nodes, each maintaining their own cache. Race conditions, partial purges, and "stale while purging" are real problems.

🗑️

Single URL Purge

Purge a specific URL: POST /purge with URL. CDN propagates delete to all POPs via control plane. Typically takes 1-5 seconds globally. Some CDNs are much faster (Fastly advertises roughly 150ms on average). Use for: specific content changes, targeted invalidation.

🏷️

Surrogate-Key / Cache Tags

Tag cached objects with logical keys: Surrogate-Key: product-42 category-shoes. Then purge all "product-42" content with one API call — regardless of how many URLs contain that content. Powerful for: invalidating all pages that display a changed product.
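The tag index behind this feature can be sketched as a map from tag to URLs. This is an in-memory model of a single cache node, not any CDN's implementation. The `TaggedCache` class is hypothetical, and the space-separated format follows the Surrogate-Key example above.

```javascript
class TaggedCache {
  constructor() {
    this.entries = new Map();   // url -> body
    this.urlsByTag = new Map(); // tag -> Set of urls
  }

  // Store a response and index it under each tag in its Surrogate-Key header.
  put(url, body, surrogateKeyHeader) {
    this.entries.set(url, body);
    for (const tag of surrogateKeyHeader.split(/\s+/).filter(Boolean)) {
      if (!this.urlsByTag.has(tag)) this.urlsByTag.set(tag, new Set());
      this.urlsByTag.get(tag).add(url);
    }
  }

  has(url) { return this.entries.has(url); }

  // Remove every cached URL carrying the tag; returns how many were purged.
  purgeTag(tag) {
    let purged = 0;
    for (const url of this.urlsByTag.get(tag) ?? []) {
      if (this.entries.delete(url)) purged += 1;
    }
    this.urlsByTag.delete(tag);
    return purged;
  }
}

const cache = new TaggedCache();
cache.put('/products/42', 'product page', 'product-42 category-shoes');
cache.put('/category/shoes', 'listing page', 'category-shoes product-42 product-43');
cache.put('/products/43', 'other product', 'product-43 category-shoes');

console.log(cache.purgeTag('product-42')); // 2
console.log(cache.has('/products/43'));    // true
```

One call removed both the product page and the listing that embeds it, without the caller knowing either URL.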

💥

Full Cache Purge

Nuke everything: dangerous, slow, affects all users globally for seconds to minutes. Only justified for: major deployments, corrupted CDN state. Better alternative: deploy new asset hash URLs and let old URLs TTL out naturally — zero purge needed.

🏆

The immutable asset strategy — eliminate purges entirely

Hash JS/CSS filenames on deploy: main.a3f8c1d2.js. Old URLs keep serving the old cached version (correct, since old HTML still references them). The new URL serves the new version with Cache-Control: public, max-age=31536000, immutable. The HTML page (short TTL or no-cache) always points to the new hashed URL. Result: no CDN purges needed on deploy, high cache hit rates for assets, and no stale assets served. This is what Webpack/Vite content-hashing is for.
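The naming step itself is small. The sketch below derives a file name from a hash of the content using Node's built-in crypto module. The choice of SHA-256 truncated to 8 hex characters is an assumption for illustration (bundlers such as Webpack and Vite use their own hashing schemes), and `hashedFileName` is a hypothetical helper that expects a name with an extension.

```javascript
const { createHash } = require('node:crypto');

// Derive a file name that changes whenever the content changes.
function hashedFileName(name, content) {
  const digest = createHash('sha256').update(content).digest('hex').slice(0, 8);
  const dot = name.lastIndexOf('.');
  return `${name.slice(0, dot)}.${digest}${name.slice(dot)}`;
}

const v1 = hashedFileName('main.js', 'console.log("v1");');
const v2 = hashedFileName('main.js', 'console.log("v2");');
const v1Again = hashedFileName('main.js', 'console.log("v1");');

console.log(v1 !== v2);      // true: new content, new URL, no purge needed
console.log(v1 === v1Again); // true: unchanged content keeps its cached URL
```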

Quiz — Check Your Understanding

CDN Design

Q1. You're serving a React SPA. HTML at /index.html changes on every deploy. JS/CSS files have content hashes in their names. Which Cache-Control strategy is correct?

Q2. Your API returns user-specific JSON: GET /api/me/cart. You accidentally deploy with Cache-Control: public, max-age=300. What happens?

Q3. What does stale-while-revalidate=60 tell a CDN?