Understand how microservices find each other at runtime. Build a live circuit breaker simulation. Learn the precise difference between API Gateway and Service Mesh — and when each applies.
In a monolith, a function call is a memory address. In microservices, it's a network call to an address that might change at any second — containers restart, autoscalers add instances, deployments roll pods. Static configuration breaks immediately.
Hardcoding 10.0.1.45:8080 into your config breaks the moment the instance behind that address is replaced. In Kubernetes, every recreated or rescheduled pod gets a new IP. At Netflix's scale (700+ services), this approach is a deployment nightmare: each service change requires updating every caller.
Assign each service a stable DNS name (user-service.internal). This works natively in Kubernetes, where CoreDNS handles it automatically. Limitation: clients can keep resolving stale records during rollouts until the TTL expires. Keep TTLs short (5–10s) to bound that window; even then, DNS alone does not give sub-second failover.
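The stale-record window can be illustrated with a small TTL cache. This is a sketch, not how any particular resolver is implemented: `TtlDnsCache`, the injected `resolve` function, and the injectable `clock` are hypothetical names chosen so the behavior can be exercised without real DNS.

```python
import time
from typing import Callable, Dict, Tuple


class TtlDnsCache:
    """Caches name -> address lookups for ttl_seconds (hypothetical helper)."""

    def __init__(
        self,
        resolve: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolve = resolve      # stand-in for a real DNS query
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}  # name -> (address, expires_at)

    def lookup(self, name: str) -> str:
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now < cached[1]:
            # Served from cache: this may be stale if the record changed upstream.
            return cached[0]
        address = self._resolve(name)
        self._cache[name] = (address, now + self._ttl)
        return address
```

With a 10s TTL, a caller that resolved the name just before a rollout keeps using the old address for up to 10s, which is why the TTL value bounds the failover time.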
A dedicated store that maps service_name → [{ip, port, health}]. Services register on startup and send heartbeats every 10s. Registry deregisters unhealthy instances within 30s. Enables real-time routing decisions without DNS TTL constraints. Consul adds KV store and multi-datacenter support.
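A minimal in-memory sketch of the register/heartbeat/expire cycle, assuming a 30s expiry. `ServiceRegistry`, `Instance`, and the injectable `clock` are illustrative names, not the API of Consul, Eureka, or any other registry.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Instance:
    ip: str
    port: int
    last_heartbeat: float


class ServiceRegistry:
    """Maps service_name -> instances and drops those whose heartbeat expired."""

    def __init__(
        self,
        expiry_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._expiry = expiry_seconds
        self._clock = clock
        self._instances: Dict[str, Dict[Tuple[str, int], Instance]] = {}

    def register(self, service: str, ip: str, port: int) -> None:
        """Called by a service instance on startup."""
        entry = Instance(ip, port, self._clock())
        self._instances.setdefault(service, {})[(ip, port)] = entry

    def heartbeat(self, service: str, ip: str, port: int) -> bool:
        """Refreshes an instance; returns False if it is unknown or was evicted."""
        inst = self._instances.get(service, {}).get((ip, port))
        if inst is None:
            return False
        inst.last_heartbeat = self._clock()
        return True

    def healthy_instances(self, service: str) -> List[Instance]:
        """Evicts instances with no heartbeat inside the expiry window."""
        now = self._clock()
        live = {
            key: inst
            for key, inst in self._instances.get(service, {}).items()
            if now - inst.last_heartbeat < self._expiry
        }
        self._instances[service] = live
        return list(live.values())
```

Eviction here happens lazily on read, which keeps the sketch short; a real registry would typically also run a background sweep and an active health check.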
Client-side (Eureka): The calling service queries the registry, picks an instance, calls it directly. Gives the client full control over load balancing strategy. Server-side (AWS ALB): Client calls a stable LB endpoint; LB queries the registry. Simpler clients, but adds a network hop and a centralized component.
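Client-side discovery can be sketched as a caller that asks the registry for instances and picks one itself. The example below uses round-robin as the load balancing strategy; `RoundRobinClient` and the `fetch_instances` callback are hypothetical and stand in for a real registry client.

```python
import itertools
from typing import Callable, List, Tuple

Address = Tuple[str, int]  # (ip, port)


class RoundRobinClient:
    """Queries a registry and rotates through the returned instances."""

    def __init__(self, fetch_instances: Callable[[str], List[Address]]) -> None:
        self._fetch = fetch_instances
        # One shared counter keeps the sketch short; a per-service counter
        # would distribute more evenly when several services are called.
        self._counter = itertools.count()

    def pick(self, service: str) -> Address:
        instances = self._fetch(service)
        if not instances:
            raise LookupError(f"no healthy instances for {service}")
        return instances[next(self._counter) % len(instances)]
```

Swapping the strategy (random, least-connections, zone-aware) only changes `pick`, which is the control the client-side model gives you. In the server-side model this logic lives in the load balancer instead.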
| Mechanism | Latency | Stale data risk | Kubernetes-native | Multi-datacenter | Best for |
|---|---|---|---|---|---|
| DNS | ~1ms | Medium (TTL) | Yes | Limited | Simple setups, K8s |
| Consul | ~5ms | Low (10s heartbeat) | External | Yes | Multi-DC, health checks |
| Eureka | ~5ms | Low-Medium | External | No | AWS/Spring ecosystem |
| K8s built-in | ~1ms | Very low | Native | No | Single cluster deployments |
Most registries check every 10–30s and remove an instance after 3 consecutive failures. That puts the worst case at roughly 30–90s (3 × the check interval) before a dead instance stops receiving traffic. Use a smaller interval (5s) for latency-sensitive services, at the cost of higher registry load.
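The trade-off between detection time and registry load is simple arithmetic. The sketch below assumes the instance dies right after a passing check and ignores per-check timeouts; both function names are illustrative.

```python
def worst_case_detection_seconds(check_interval_s: float, failure_threshold: int) -> float:
    """Time until removal if the instance dies just after a passing check."""
    return check_interval_s * failure_threshold


def checks_per_minute(instance_count: int, check_interval_s: float) -> float:
    """Health-check load on the registry across all instances."""
    return instance_count * 60 / check_interval_s


print(worst_case_detection_seconds(10, 3))  # 30
print(worst_case_detection_seconds(30, 3))  # 90
print(worst_case_detection_seconds(5, 3))   # 15
print(checks_per_minute(100, 10))           # 600.0
print(checks_per_minute(100, 5))            # 1200.0
```

Halving the interval halves the detection time and doubles the number of checks the registry has to process.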
When a downstream service starts failing, a circuit breaker trips OPEN — fast-failing all requests without hitting the downstream. After a cooldown window, it enters HALF-OPEN: one probe request is allowed through. Success closes it; failure reopens it immediately.
Without HALF-OPEN, a circuit that goes OPEN stays open forever (requires manual reset) or toggles wildly between OPEN and CLOSED. HALF-OPEN is the "cautious probe" — it lets you verify the downstream recovered without risking a flood of requests.
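The three states can be sketched as a small state machine. This is a single-threaded illustration with assumed defaults (3 failures to trip, 30s cooldown); `CircuitBreaker` and `CircuitOpenError` are hypothetical names, and a production breaker would also need locking so that only one probe runs under concurrency.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker fast-fails without calling the downstream."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.state = "CLOSED"
        self._threshold = failure_threshold
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = 0.0

    def call(self, fn: Callable[[], T]) -> T:
        if self.state == "OPEN":
            if self._clock() - self._opened_at < self._cooldown:
                raise CircuitOpenError("fast-fail: downstream not called")
            self.state = "HALF_OPEN"  # cooldown elapsed: let one probe through
        try:
            result = fn()
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self) -> None:
        self._failures += 1
        # A failed probe reopens immediately; otherwise trip at the threshold.
        if self.state == "HALF_OPEN" or self._failures >= self._threshold:
            self.state = "OPEN"
            self._opened_at = self._clock()

    def _on_success(self) -> None:
        self._failures = 0
        self.state = "CLOSED"
```

Callers wrap each downstream request as `breaker.call(lambda: do_request())` and treat `CircuitOpenError` as a signal to use a fallback instead of waiting on a timeout.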
An API Gateway is the single entry point for all external traffic. It handles cross-cutting concerns so individual services don't have to. Common choices include AWS API Gateway, Kong, Nginx, Envoy, or a custom-built proxy.
/api/v1/users/* → user-service, /api/v1/orders/* → order-service
Inject headers (X-User-ID, X-Request-ID), strip sensitive headers, transform payloads for backward compatibility
Trace IDs (X-B3-TraceId), latency metrics per endpoint
Rate limiting + JWT auth + upstream routing pattern:
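A minimal Python sketch of a gateway request path that combines JWT auth, rate limiting, and upstream routing, using only the standard library. Everything named here is an assumption for illustration: the `Gateway` class, the HS256-only verifier, the per-user token bucket keyed on the `sub` claim, and the auth-then-limit ordering. It does not check `exp` or other registered claims, so it is not a substitute for a real JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Callable, Dict, Optional, Tuple

# Prefix routes taken from the routing rules above.
ROUTES = {"/api/v1/users/": "user-service", "/api/v1/orders/": "order-service"}


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Test helper: issues an HS256 token for the given claims."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes) -> Optional[dict]:
    """Returns the claims if the signature is valid, else None (no exp check)."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        signed = f"{header_b64}.{payload_b64}".encode()
        expected = hmac.new(secret, signed, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
            return None
        return json.loads(_b64url_decode(payload_b64))
    except (ValueError, AttributeError):
        return None


class TokenBucket:
    def __init__(self, rate_per_s: float, burst: int, clock: Callable[[], float]) -> None:
        self._rate, self._burst, self._clock = rate_per_s, burst, clock
        self._tokens = float(burst)
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self._tokens = min(self._burst, self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


class Gateway:
    def __init__(self, secret: bytes, rate_per_s: float = 1.0, burst: int = 2,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._secret, self._rate, self._burst, self._clock = secret, rate_per_s, burst, clock
        self._buckets: Dict[str, TokenBucket] = {}

    def handle(self, path: str, headers: Dict[str, str]) -> Tuple[int, str]:
        """Returns (status, upstream name or error message)."""
        auth = headers.get("Authorization", "")
        claims = verify_hs256(auth[len("Bearer "):], self._secret) if auth.startswith("Bearer ") else None
        if not isinstance(claims, dict) or "sub" not in claims:
            return 401, "unauthorized"
        user = str(claims["sub"])
        if user not in self._buckets:
            self._buckets[user] = TokenBucket(self._rate, self._burst, self._clock)
        if not self._buckets[user].allow():
            return 429, "rate limit exceeded"
        for prefix, upstream in ROUTES.items():
            if path.startswith(prefix):
                return 200, upstream  # a real gateway would proxy the request here
        return 404, "no route"
```

Authenticating before rate limiting lets the limit be applied per user; limiting by client IP first is a common alternative when unauthenticated floods are the main concern.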
The most common interview mistake: conflating these two. API Gateway handles north-south traffic (external users → your services). Service Mesh handles east-west traffic (service → service internal calls). Different scopes, different deployment models.
| Feature | API Gateway | Service Mesh (Istio/Linkerd) |
|---|---|---|
| Traffic direction | North-south (external → internal) | East-west (service → service) |
| Deployment model | Centralized proxy (single point) | Sidecar proxy per service pod |
| mTLS | Optional (terminate at gateway) | Automatic, zero-config |
| Service discovery | Manual routing rules | Automatic (via control plane) |
| Observability | Request-level metrics at edge | Service-to-service metrics, traces |
| Circuit breaking | Per-upstream config | Automatic per destination rule |
| Overhead | Low (one hop) | Higher (~1-3ms per hop, sidecar CPU) |
| Examples | Kong, Nginx, AWS API GW, Envoy | Istio, Linkerd, Consul Connect |
Use both: API Gateway for external auth, rate limiting, and routing. Service Mesh for internal mTLS, east-west observability, and automatic circuit breaking between services. The sidecar overhead is worth it at 20+ services with compliance requirements.
The right answer depends on your team's complexity, scale, and operational maturity. Here are the decision boundaries that appear in system design interviews.
API Gateway examples: Kong, AWS API GW, Nginx, Apigee
Service Mesh examples: Istio, Linkerd, Consul Connect
Best for: hybrid cloud, multi-region setups
Use: CoreDNS, K8s Services, Ingress controller