Elastic Load Balancing Deep Dive
AWS Elastic Load Balancing (ELB) is the traffic front door for nearly every production system on AWS. It absorbs client connections, performs health checks, and distributes requests across a fleet of targets — all without you managing a single load balancer instance. At big-tech scale, ELB is not just a convenience: it is the component that enables zero-downtime deployments, absorbs traffic spikes, and provides the first line of TLS termination. Understanding its internals is essential for any serious AWS practitioner.
ALB vs NLB: Choosing the Right Tool
AWS offers two primary load balancer types. The Application Load Balancer (ALB) operates at OSI Layer 7 (HTTP/HTTPS/gRPC). It reads the full HTTP request — path, headers, hostname, query strings — and routes based on content. The Network Load Balancer (NLB) operates at Layer 4 (TCP/UDP/TLS). It routes based on IP and port, with no awareness of application-layer content.
- Use ALB when you need host-based or path-based routing, WebSocket support, gRPC, sticky sessions, WAF integration, Cognito authentication, or Lambda targets. This is the default for web applications and microservices.
- Use NLB when you need ultra-low latency (microsecond-level connection passthrough), static IP addresses (NLB exposes one static IP per AZ, and you can assign an Elastic IP to each for internet-facing NLBs), TCP pass-through so that mutual TLS is negotiated end-to-end with the target, or non-HTTP protocols like MQTT, custom TCP, or high-volume UDP (e.g., DNS, gaming).
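Because an ALB terminates the client connection and opens its own connection to the target, read the X-Forwarded-For header on the target to get the real client IP.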
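As a minimal sketch of reading that header on a target, the function below takes the rightmost entry, which is the address appended by the load balancer itself. It assumes the header is in append mode and carries plain IP addresses without ports. The function name and the `trusted_hops` parameter are hypothetical, introduced here only for illustration.

```python
import ipaddress
from typing import Optional


def client_ip_from_xff(header_value: str, trusted_hops: int = 0) -> Optional[str]:
    """Return the client IP recorded in an X-Forwarded-For header, or None.

    Entries are comma-separated and each proxy appends the address it saw,
    so the rightmost entry is the one written by the load balancer.
    `trusted_hops` (hypothetical) is the number of additional trusted
    proxies, such as a CDN, sitting in front of the load balancer.
    """
    entries = [part.strip() for part in header_value.split(",") if part.strip()]
    index = len(entries) - 1 - trusted_hops
    if index < 0:
        return None
    candidate = entries[index]
    try:
        # Reject anything that is not a syntactically valid IPv4/IPv6 address.
        ipaddress.ip_address(candidate)
    except ValueError:
        return None
    return candidate
```

Entries to the left of the chosen one are supplied by the client or by untrusted intermediaries and can be spoofed, which is why the sketch counts from the right rather than taking the first value.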
Target Groups
A target group is the destination pool that a listener rule routes traffic to. Targets can be EC2 instances, ECS tasks (by IP), Lambda functions, or other load balancers (ALB-behind-NLB pattern). Each target group has an independent health check configuration: protocol, path, port, healthy threshold, unhealthy threshold, and interval.
Target group attributes that matter in production:
- Deregistration delay (default 300 s): how long ELB waits for in-flight requests to complete on a target that is being deregistered. No new requests are sent to the target during this window. During rolling deployments, lower this to 30–60 s to speed up draining if your requests are short-lived.
- Slow-start mode (ALB only): ramp a new target from 0% to its full share of traffic over 30–900 s, so that cold JVM or Node.js instances are not hit with full load the moment they pass health checks.
- Load balancing algorithm (ALB only): Round robin (default), Least outstanding requests (better for heterogeneous request durations), or Weighted random (the newest option, and the only one that supports Automatic Target Weights anomaly mitigation).
- Stickiness: duration-based (ELB-generated cookie) or application-based (your own cookie). Avoid stickiness unless the application truly requires it — it defeats the purpose of horizontal scaling.
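Two of the attributes above can be illustrated with a short Python sketch. It is a simplified model, not ELB's actual implementation: `slow_start_share` assumes a linear ramp over the configured duration, and `pick_least_outstanding` breaks ties by target name, which is an arbitrary choice made here. Both function names are hypothetical, and the two functions are independent illustrations rather than a combined policy.

```python
from typing import Dict


def slow_start_share(seconds_since_healthy: float, duration_s: int) -> float:
    """Fraction of its full traffic share a new target receives (linear ramp)."""
    if not 30 <= duration_s <= 900:
        raise ValueError("slow-start duration must be between 30 and 900 seconds")
    if seconds_since_healthy <= 0:
        return 0.0
    return min(1.0, seconds_since_healthy / duration_s)


def pick_least_outstanding(in_flight: Dict[str, int]) -> str:
    """Choose the target with the fewest in-flight requests.

    Ties are broken by target name, purely to keep the sketch deterministic.
    """
    if not in_flight:
        raise ValueError("no targets registered")
    return min(sorted(in_flight), key=lambda target: in_flight[target])
```

For example, `slow_start_share(30, 60)` gives 0.5, meaning a target halfway through a 60 s ramp would receive half of its eventual share under this model.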
Listeners and Rules
A listener is a port-and-protocol endpoint on the load balancer (e.g., HTTPS:443). It evaluates an ordered list of rules. Each rule has conditions (host header, path pattern, HTTP method, source IP, query string, HTTP headers) and an action (forward to target group, redirect, return a fixed response, authenticate via Cognito/OIDC). The default rule catches everything not matched by earlier rules.
A typical ALB rule setup for a microservices API gateway pattern:
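As a rough illustration, ordered rule evaluation can be modelled in Python as shown here. Every hostname, path, priority, and target group name in it is hypothetical. Pattern matching uses `fnmatchcase`, which approximates ALB wildcards (`*` and `?`) but also interprets `[...]` sequences, so it is a sketch of the evaluation order rather than an exact matcher.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    priority: int                 # lower numbers are evaluated first
    target_group: str             # hypothetical target group name
    host: Optional[str] = None    # host-header pattern; None matches any host
    path: Optional[str] = None    # path pattern; None matches any path


# Hypothetical rules for an API split across services.
RULES = [
    Rule(10, "tg-orders", host="api.example.com", path="/orders/*"),
    Rule(20, "tg-users", host="api.example.com", path="/users/*"),
    Rule(30, "tg-admin", host="admin.example.com"),
]
DEFAULT_TARGET_GROUP = "tg-web"   # stands in for the listener's default rule


def route(host: str, path: str, rules: List[Rule] = RULES) -> str:
    """Return the target group of the first matching rule, else the default."""
    for rule in sorted(rules, key=lambda r: r.priority):
        # Host names compare case-insensitively; paths are case-sensitive.
        host_ok = rule.host is None or fnmatchcase(host.lower(), rule.host.lower())
        path_ok = rule.path is None or fnmatchcase(path, rule.path)
        if host_ok and path_ok:
            return rule.target_group
    return DEFAULT_TARGET_GROUP
```

Because evaluation stops at the first match, more specific rules need lower priority numbers than broader ones that would otherwise shadow them.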
For HTTPS listeners you must attach an ACM (AWS Certificate Manager) certificate. ALB supports multiple certificates on one listener via SNI: the listener selects the certificate matching the server name the client sends in the TLS handshake (not the HTTP Host header, which is still encrypted at that point), so a single ALB can serve dozens of domains.
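The idea behind SNI-based selection can be sketched as follows. This is a deliberately simplified model: it matches on the server name the client sends via SNI, prefers an exact name over a wildcard, and falls back to the listener's default certificate. The real selection logic considers more criteria, and the function names and certificate names here are hypothetical.

```python
from typing import List, Optional


def cert_matches(cert_name: str, server_name: str) -> bool:
    """True if a certificate name covers the SNI server name.

    A leading "*." wildcard covers exactly one extra label, so
    "*.example.com" covers "api.example.com" but not "example.com"
    or "a.b.example.com".
    """
    cert_name, server_name = cert_name.lower(), server_name.lower()
    if cert_name.startswith("*."):
        suffix = cert_name[1:]                    # e.g. ".example.com"
        if not server_name.endswith(suffix):
            return False
        prefix = server_name[: -len(suffix)]
        return bool(prefix) and "." not in prefix
    return cert_name == server_name


def select_certificate(server_name: Optional[str],
                       cert_names: List[str],
                       default: str) -> str:
    """Pick a certificate for a handshake; fall back to the default."""
    if server_name:
        matches = [c for c in cert_names if cert_matches(c, server_name)]
        exact = [c for c in matches if not c.startswith("*.")]
        if exact:
            return exact[0]
        if matches:
            return matches[0]
    # No SNI sent, or nothing matched: serve the default certificate.
    return default
```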
TLS Termination
ELB terminates TLS at the load balancer by default. The certificate lives in ACM; you never manage private keys on instances. Connections from ALB to targets travel over your VPC private network. For compliance environments (PCI-DSS, HIPAA) that require encryption all the way to the target, you can:
- Install a certificate on the target and configure the target group protocol as HTTPS (ALB re-encrypts), or
- Use an NLB with a TCP listener so the TLS session passes through untouched to the target (end-to-end encryption, the NLB does not decrypt). An NLB TLS listener, by contrast, terminates TLS at the load balancer.
TLS security policy selection matters. Always prefer ELBSecurityPolicy-TLS13-1-2-2021-06 (TLS 1.3 + TLS 1.2, strong ciphers only) for internet-facing ALBs. Avoid the older ELBSecurityPolicy-2016-08 which permits TLS 1.0/1.1 — PCI DSS and SOC 2 auditors will flag it.
Health Checks and Failure Modes
ELB health checks are the mechanism that keeps traffic away from broken targets. A target is marked unhealthy after unhealthyThreshold consecutive failures and healthy again only after healthyThreshold consecutive successes. Common production pitfalls:
- Health check path returning 200 while the app is broken: a shallow /ping that always returns 200 will keep an unhealthy target in rotation. Use a deep health check endpoint (/healthz) that verifies DB connectivity, cache reachability, and any critical dependency.
- Security group blocking ELB health checks: ALB health checks originate from the ALB nodes themselves within your VPC. The target's security group must allow inbound traffic on the health-check port from the ALB security group (not from the internet).
- Deregistration delay too high: the default 300 s means a rolling deployment waits 5 minutes per batch just draining connections. Tune it to match your p99 request duration.
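The threshold behaviour described at the start of this section can be sketched as a small state machine. It is a minimal model under stated assumptions: the class name is hypothetical, the initial state is a constructor argument, and a single contradicting result resets the streak.

```python
class TargetHealth:
    """Tracks one target's health from a stream of check results."""

    def __init__(self, healthy_threshold: int, unhealthy_threshold: int,
                 healthy: bool = False):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = healthy
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, check_passed: bool) -> bool:
        """Feed one health check result and return the resulting state."""
        if check_passed == self.healthy:
            # Result agrees with the current state: any pending streak is broken.
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if check_passed else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = check_passed
            self._streak = 0
        return self.healthy
```

With an unhealthy threshold of 2 and a 30 s interval, for instance, a target that starts failing stays in rotation for at least two check intervals before traffic is withdrawn, which is worth factoring into threshold choices.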
On ALBs, set drop_invalid_header_fields = true in Terraform to help prevent HTTP desync (request-smuggling) attacks.
Cross-Zone Load Balancing
Without cross-zone load balancing, each load balancer node (one per AZ) distributes requests only to targets registered in its own AZ. With cross-zone load balancing enabled (the default for ALB, optional for NLB), each node distributes requests evenly across all registered targets in all AZs. This eliminates the need to maintain an equal number of targets per AZ, which is critical when Auto Scaling groups span AZs and instance counts are uneven. NLB charges for cross-AZ data transfer when cross-zone load balancing is enabled; ALB does not.
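The effect on uneven fleets is easy to see with a little arithmetic. The sketch below assumes clients are spread evenly across the load balancer nodes (one per AZ) and that every AZ has at least one target. The function name and AZ labels are hypothetical.

```python
from typing import Dict


def per_target_share(targets_per_az: Dict[str, int], cross_zone: bool) -> Dict[str, float]:
    """Fraction of total traffic that each single target in an AZ receives."""
    if any(count <= 0 for count in targets_per_az.values()):
        raise ValueError("each AZ needs at least one target")
    if cross_zone:
        # Every node spreads requests over the whole fleet.
        total = sum(targets_per_az.values())
        return {az: 1 / total for az in targets_per_az}
    # Each node receives an equal slice and splits it among its local targets.
    node_share = 1 / len(targets_per_az)
    return {az: node_share / count for az, count in targets_per_az.items()}
```

With 2 targets in one AZ and 8 in another, each target in the smaller AZ handles 25% of total traffic without cross-zone load balancing (versus 6.25% for each target in the larger AZ), and every target handles 10% with it enabled.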
Mastering ELB — especially the interplay between listener rules, target group health checks, deregistration delays, and TLS policy selection — is what separates engineers who "just get it working" from engineers who build systems that survive real production failures.