Security Groups & NACLs
AWS gives you two complementary firewall primitives whose scopes overlap: Security Groups (SGs) and Network Access Control Lists (NACLs). Misunderstanding the exact difference (stateful vs stateless, where each is enforced, and how they compose) is a frequent source of confusion in production AWS incidents. Get this model right and you will rarely have to wonder why a port is unexpectedly open or why an allow rule seems to do nothing.
Stateful vs Stateless Filtering
The single most important distinction is connection tracking.
Security Groups are stateful. When you allow inbound TCP/443, the SG automatically permits the corresponding return traffic (the response packets of that connection) even if there is no matching outbound rule. AWS tracks the connection at the instance's network interface, outside the guest operating system, so nothing inside the instance has to be configured for this. In practice, you write fewer rules, and they express intent rather than packet-level mechanics.
NACLs are stateless. They evaluate every packet independently with no memory of prior packets. If you allow inbound TCP/443, you must also explicitly allow outbound TCP in the ephemeral port range (1024–65535) for the responses to leave the subnet. The exact ephemeral range depends on the client's operating system; 1024–65535 is the wide range that covers the common cases. Forgetting the outbound rule is one of the most common NACL misconfigurations in production.
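The difference can be sketched with a toy model. This is a simplification for illustration, not how AWS implements either layer: the class names, the `Packet` fields, and client port 50000 are all assumptions, and the stateful filter is modeled with no outbound rules at all to isolate the effect of connection tracking.

```python
from dataclasses import dataclass, field

EPHEMERAL = range(1024, 65536)  # the ephemeral port range quoted above


@dataclass(frozen=True)
class Packet:
    direction: str  # "in" or "out", relative to the protected resource
    src_port: int
    dst_port: int


@dataclass
class StatelessFilter:
    """NACL-like: each packet is checked against explicit rules, no memory."""
    inbound_dst_ports: set
    outbound_dst_ports: set

    def allows(self, pkt: Packet) -> bool:
        if pkt.direction == "in":
            return pkt.dst_port in self.inbound_dst_ports
        return pkt.dst_port in self.outbound_dst_ports


@dataclass
class StatefulFilter:
    """SG-like: replies to an allowed inbound connection pass automatically."""
    inbound_dst_ports: set
    tracked: set = field(default_factory=set)  # (client_port, server_port)

    def allows(self, pkt: Packet) -> bool:
        if pkt.direction == "in":
            if pkt.dst_port in self.inbound_dst_ports:
                self.tracked.add((pkt.src_port, pkt.dst_port))
                return True
            return False
        # Outbound: allowed only as the reply to a tracked connection.
        return (pkt.dst_port, pkt.src_port) in self.tracked


request = Packet("in", src_port=50000, dst_port=443)
reply = Packet("out", src_port=443, dst_port=50000)

sg_like = StatefulFilter(inbound_dst_ports={443})
nacl_missing_egress = StatelessFilter({443}, set())
nacl_with_egress = StatelessFilter({443}, set(EPHEMERAL))
```

In this model the request passes all three filters, but the reply is dropped by `nacl_missing_egress` because no outbound rule covers port 50000, while `sg_like` lets it through with no outbound rule at all.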
Layered Security Architecture
The diagram below shows how the two layers stack for a typical three-tier application inside a VPC.
Security Group Rules in Depth
SGs are allow-only — there is no explicit deny. All inbound traffic is denied by default; all outbound traffic is allowed by default. Rules can reference other SG IDs instead of CIDR blocks, which is the idiomatic AWS pattern for east-west service-to-service communication.
When you reference sg-app in sg-db's inbound rules, any new instance that joins sg-app is automatically allowed, so you never chase subnet ranges as you scale. This pattern is standard at production scale.
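As a rough sketch, a group-to-group rule could be built like this in Python. The dictionary follows the `IpPermissions` shape that boto3's `authorize_security_group_ingress` accepts; the helper name, both group IDs, and port 5432 (PostgreSQL) are hypothetical placeholders. Nothing is sent to AWS here.

```python
def ingress_from_group(source_group_id: str, port: int, description: str = "") -> dict:
    """Build one IpPermissions entry allowing TCP `port` from members of another SG."""
    pair = {"GroupId": source_group_id}
    if description:
        pair["Description"] = description
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # A group reference takes the place of IpRanges/CidrIp, so the rule
        # follows group membership instead of IP addresses.
        "UserIdGroupPairs": [pair],
    }


# Hypothetical IDs standing in for sg-app and sg-db.
SG_APP = "sg-0aaaaaaaaaaaaaaaa"
SG_DB = "sg-0bbbbbbbbbbbbbbbb"

request = {
    "GroupId": SG_DB,
    "IpPermissions": [ingress_from_group(SG_APP, 5432, "app tier to database")],
}

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("ec2").authorize_security_group_ingress(**request)
```

Because the rule contains no CIDR, it does not need to change when the app tier gains or loses instances.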
NACL Rules in Depth
NACLs process rules in ascending number order and stop at the first match (like ACLs on a traditional router). There is always an implicit * DENY ALL at the bottom. Rule numbers go up to 32766; the convention is to space them in multiples of 100 so you can insert rules later. A NACL can be associated with multiple subnets, but each subnet is associated with exactly one NACL at a time.
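First-match evaluation can be sketched in a few lines. This is an illustrative model, not an AWS API: it covers one direction and TCP only, and the `NaclRule` type, the rule numbers, and the documentation-range CIDRs are assumptions chosen for the example.

```python
import ipaddress
from typing import NamedTuple, Optional, Tuple


class NaclRule(NamedTuple):
    number: int
    cidr: str
    port_from: int
    port_to: int
    action: str  # "allow" or "deny"


def evaluate(rules, src_ip: str, dst_port: int) -> Tuple[Optional[int], str]:
    """Return (matching rule number, action); (None, "deny") is the implicit * rule."""
    addr = ipaddress.ip_address(src_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_cidr = addr in ipaddress.ip_network(rule.cidr)
        in_ports = rule.port_from <= dst_port <= rule.port_to
        if in_cidr and in_ports:
            return rule.number, rule.action  # first match wins, stop here
    return None, "deny"


rules = [
    NaclRule(200, "0.0.0.0/0", 443, 443, "allow"),
    # Lower number, so it is evaluated before the broad allow above.
    NaclRule(100, "203.0.113.0/24", 0, 65535, "deny"),
]

blocked = evaluate(rules, "203.0.113.7", 443)    # matches rule 100 first
allowed = evaluate(rules, "198.51.100.9", 443)   # falls through to rule 200
implicit = evaluate(rules, "198.51.100.9", 22)   # no match, implicit deny
```

The deny only takes effect because its number is lower than the allow's. Swap the two numbers and the blocked range would be admitted on port 443.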
When to Use Each Layer
In a well-designed AWS account, both layers run simultaneously but serve different purposes:
- Security Groups: your primary, fine-grained control. Use SG-ID references for all east-west traffic. Apply least-privilege egress (restrict outbound by port and destination). Audit with the AWS Config rules restricted-ssh and vpc-sg-open-only-to-authorized-ports.
- NACLs: a coarse subnet-level backstop. Use them to enforce hard organizational policies: block an entire CIDR that has been compromised, enforce that the DB subnet never talks to the internet, or explicitly deny a rogue range even if an SG rule were mistakenly added. Many teams run NACLs in "allow all" (default) and only tighten them for compliance or incident response.
Evaluating Rules: The Traffic Flow
For a request traveling from the internet toward an RDS instance, the evaluation order up to the app tier is:
- IGW forwards the packet into the VPC.
- NACL-Public (inbound) — is TCP/443 allowed inbound to the public subnet? Yes → continue.
- sg-alb (inbound) — is TCP/443 allowed for this ENI? Yes → ALB receives the packet.
- ALB terminates TLS, opens a new TCP connection to the app tier.
- NACL-Public (outbound) — is TCP/8080 allowed outbound from the public subnet? (stateless — must be explicit) → Yes → continue.
- NACL-Private (inbound) — is TCP/8080 from 10.0.1.0/24 allowed inbound? Yes → continue.
- sg-app (inbound) — is TCP/8080 from sg-alb allowed? Yes → app receives the request.
Each direction at each boundary is an independent check. Overlooking one of these steps is why "I opened the port but it still does not work" is such a common complaint on AWS.
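When debugging, it can help to write the path down as an ordered checklist and look for the first check that fails. The sketch below hard-codes the hops listed above; the function names and the boolean parameter are hypothetical, and in practice each answer would come from inspecting the actual NACL and SG rules.

```python
def first_blocker(checks):
    """Return the name of the first failing check, or None if every check passes."""
    for name, allowed in checks:
        if not allowed:
            return name
    return None


def path_checks(nacl_public_out_8080: bool):
    """Ordered checks for internet -> ALB -> app, mirroring the steps above."""
    return [
        ("NACL-Public inbound TCP/443", True),
        ("sg-alb inbound TCP/443", True),
        ("NACL-Public outbound TCP/8080", nacl_public_out_8080),
        ("NACL-Private inbound TCP/8080 from 10.0.1.0/24", True),
        ("sg-app inbound TCP/8080 from sg-alb", True),
    ]


healthy = first_blocker(path_checks(nacl_public_out_8080=True))
broken = first_blocker(path_checks(nacl_public_out_8080=False))
```

Here `healthy` is `None`, while `broken` names the public subnet's outbound NACL rule, the stateless hop that is easy to forget.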
Production Best Practices
- Name every SG with a clear convention: sg-{env}-{tier}, e.g. sg-prod-app. Unnamed SGs are a compliance nightmare.
- Never use 0.0.0.0/0 in inbound SG rules except for public-facing load balancers (port 443/80 only).
- Enable VPC Flow Logs (traffic type: ALL) on every production VPC. Flow logs record whether each flow was accepted or rejected, though not which specific SG or NACL rule made the decision, and are essential for forensics. Ship them to CloudWatch Logs or S3; logs stored in S3 can then be queried with Athena.
- Use AWS Network Firewall or Gateway Load Balancer with a third-party IDS when you need deep-packet inspection or geo-blocking beyond what NACLs support.
- Periodically run aws ec2 describe-security-groups --filters Name=ip-permission.cidr,Values=0.0.0.0/0 in every region to enumerate overly permissive SGs.
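The output of that command can be post-processed to separate expected web rules from risky ones. The following is a minimal sketch that works on a dictionary shaped like the `describe-security-groups` JSON response. The function name, the sample data, and the policy that only TCP 80 and 443 may be world-open are assumptions for illustration. It checks IPv4 ranges only and does not verify that a group is actually attached to a load balancer.

```python
ANYWHERE = "0.0.0.0/0"
PUBLIC_WEB_PORTS = {80, 443}


def open_to_world(response: dict) -> list:
    """List (group_id, protocol, from_port, to_port) for risky world-open inbound rules."""
    findings = []
    for group in response.get("SecurityGroups", []):
        for perm in group.get("IpPermissions", []):
            cidrs = [r.get("CidrIp") for r in perm.get("IpRanges", [])]
            if ANYWHERE not in cidrs:
                continue
            proto = perm.get("IpProtocol")
            lo, hi = perm.get("FromPort"), perm.get("ToPort")
            # All-protocol rules ("-1") carry no port range and are always flagged.
            web_only = proto == "tcp" and lo is not None and lo == hi and lo in PUBLIC_WEB_PORTS
            if not web_only:
                findings.append((group["GroupId"], proto, lo, hi))
    return findings


# Hypothetical sample in the shape of the CLI's JSON output.
sample = {
    "SecurityGroups": [
        {"GroupId": "sg-0aaaaaaaaaaaaaaaa", "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        ]},
        {"GroupId": "sg-0bbbbbbbbbbbbbbbb", "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
            {"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
             "IpRanges": [{"CidrIp": "10.0.2.0/24"}]},
        ]},
    ]
}

findings = open_to_world(sample)
```

On the sample data, only the SSH rule on the second group is reported; the HTTPS rule is treated as an expected public endpoint and the database rule is not world-open.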