Advanced Terraform & IaC Patterns

Testing Terraform

Infrastructure-as-code that is never tested is a liability disguised as productivity. A broken Terraform module merged to main and applied to production has caused countless outages — misconfigured security groups that open ports to 0.0.0.0/0, VPCs without DNS resolution, RDS instances with no backup retention. The Terraform testing ecosystem now gives you a layered defence: static validation, plan-level assertions, native terraform test, and policy gates in CI — each catching a different class of failure at the earliest possible moment.

Layer 1 — Static Validation

Before any state is touched, two built-in commands eliminate whole categories of errors in seconds:

  • terraform validate: type-checks your HCL, resolves references, and flags unknown attributes or missing required arguments. It needs an initialised working directory (terraform init -backend=false is enough) but no cloud credentials and no access to remote state, which makes it a good fit for a pre-commit hook.
  • terraform fmt -check: enforces canonical formatting. Inconsistent formatting in HCL is not just cosmetic; it produces noisy diffs that slow reviews and invite merge conflicts. Fail the pipeline if fmt disagrees.

```shell
# pre-commit: runs in milliseconds, needs no cloud creds
terraform fmt -check -recursive
terraform validate
```

Run validate in each module's own directory as well as in the root that calls it. A module can be valid in isolation yet fail validation once a caller passes an argument of the wrong type or omits a required variable.

Layer 2 — Plan-Level Assertions

A plan converts your intent into a concrete diff against real state. Reading plans programmatically surfaces surprises before apply:

  • Output a machine-readable plan with terraform plan -out=tfplan && terraform show -json tfplan > plan.json and pipe it through tools like conftest or a simple jq script.
  • Assert invariants: no resource of type aws_s3_bucket should have acl = "public-read"; no aws_security_group_rule should permit cidr_blocks = ["0.0.0.0/0"] on port 22.
  • At Google and Meta, automated plan analysis runs on every PR, with reviewers receiving a human-readable diff summary generated from the JSON. No reviewer touches infra PRs without seeing the plan artifact.

```shell
# Export a structured plan and check it with jq
terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json

# Assert: no public S3 ACLs.
# -e makes jq exit non-zero when the result is false, so the CI step fails.
jq -e '[.resource_changes[]
        | select(.type == "aws_s3_bucket")
        | select(.change.after.acl == "public-read")]
       | length == 0' tfplan.json
```
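When the invariants outgrow a one-line jq filter, the same checks can live in a small script. The following is a minimal Python sketch, not a definitive implementation. It assumes a plan exported with terraform show -json; SAMPLE_PLAN is a hypothetical, trimmed-down stand-in for that file, and summarize_actions, find_violations, and load_plan are names introduced here for illustration. It covers both invariants listed above and also counts the planned actions, which is one possible starting point for a reviewer-facing summary.

```python
import json
from collections import Counter

# Hypothetical, trimmed-down stand-in for the output of `terraform show -json`.
SAMPLE_PLAN = {
    "resource_changes": [
        {
            "address": "aws_s3_bucket.logs",
            "type": "aws_s3_bucket",
            "change": {"actions": ["create"], "after": {"acl": "public-read"}},
        },
        {
            "address": "aws_security_group_rule.ssh",
            "type": "aws_security_group_rule",
            "change": {
                "actions": ["create"],
                "after": {"cidr_blocks": ["0.0.0.0/0"], "from_port": 22, "to_port": 22},
            },
        },
        {
            "address": "aws_s3_bucket.old",
            "type": "aws_s3_bucket",
            "change": {"actions": ["delete"], "after": None},
        },
    ]
}


def summarize_actions(plan):
    """Count planned actions per kind, for example {'create': 2, 'delete': 1}."""
    counts = Counter()
    for rc in plan.get("resource_changes", []):
        # A replacement is reported as two actions, so join them into one key.
        counts["+".join(rc["change"]["actions"])] += 1
    return dict(counts)


def find_violations(plan):
    """Return one message per resource change that breaks an invariant."""
    violations = []
    for rc in plan.get("resource_changes", []):
        after = rc["change"].get("after") or {}  # "after" is null for deletions
        if rc["type"] == "aws_s3_bucket" and after.get("acl") == "public-read":
            violations.append(f"{rc['address']}: public-read ACL")
        if rc["type"] == "aws_security_group_rule":
            world_open = "0.0.0.0/0" in (after.get("cidr_blocks") or [])
            low, high = after.get("from_port"), after.get("to_port")
            covers_ssh = low is not None and high is not None and low <= 22 <= high
            if world_open and covers_ssh:
                violations.append(f"{rc['address']}: port 22 open to 0.0.0.0/0")
    return violations


def load_plan(path):
    """Parse a plan JSON file such as tfplan.json."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    plan = SAMPLE_PLAN  # swap in load_plan("tfplan.json") for a real run
    print("Planned actions:", summarize_actions(plan))
    for problem in find_violations(plan):
        print("VIOLATION:", problem)
```

In a pipeline, the script would exit non-zero whenever find_violations returns a non-empty list; the sketch only reports so that it stays free of side effects.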

Layer 3 — terraform test (Native Unit & Integration Tests)

Introduced in Terraform 1.6, the terraform test command loads *.tftest.hcl files alongside your module, provisions real infrastructure (the default for each run), evaluates assertions, and then destroys everything it created, all in a single command. This is the closest equivalent to unit tests for infrastructure.

[Figure: terraform test lifecycle. Stages shown: load *.tftest.hcl + variables; run block: apply (real infra provisioned), assert blocks (pass / fail); destroy (always runs).]
The terraform test lifecycle: load, apply to real infra, assert, then always destroy.

A test file lives next to your module and declares run blocks. Each run can apply or plan, then evaluate assert conditions using Terraform expressions:

```hcl
# tests/s3_bucket.tftest.hcl
variables {
  bucket_name = "my-test-bucket-tftest-20240101"
  environment = "test"
}

run "bucket_exists_and_is_private" {
  command = apply # default; use "plan" for faster, cheaper checks

  assert {
    condition     = aws_s3_bucket.this.bucket == var.bucket_name
    error_message = "Bucket name mismatch."
  }

  assert {
    condition     = aws_s3_bucket_public_access_block.this.block_public_acls == true
    error_message = "Public ACLs must be blocked on all buckets."
  }
}

run "versioning_enabled" {
  command = plan

  assert {
    condition     = aws_s3_bucket_versioning.this.versioning_configuration[0].status == "Enabled"
    error_message = "Versioning must be enabled for compliance."
  }
}
```

Run the suite with terraform test from the module root. By default Terraform discovers *.tftest.hcl files in the root directory and in tests/. Use -filter=tests/s3_bucket.tftest.hcl to target a specific file during development.

Use command = plan for assertions that do not require real resources (type checks, computed name formats, variable constraints). Reserve command = apply for behaviours you can only verify after provisioning: outputs, data-source lookups, or cross-resource references whose values are known only after apply. This keeps the test suite fast: a suite of 20 plan-mode tests can run in under 30 seconds, while an apply suite can take minutes and incurs cloud costs.

Layer 4 — Policy Checks in CI

Policy-as-code tools enforce organisational guardrails that do not belong inside a Terraform module, because the module author and the policy author are usually different teams. HashiCorp Sentinel (Terraform Cloud/Enterprise) and the open-source Open Policy Agent (OPA) with conftest both evaluate the plan and emit pass/fail verdicts; conftest reads the JSON plan directly.

```rego
# policy/deny_public_s3.rego (OPA / conftest)
# Uses the contains/if rule syntax, which OPA 1.x requires by default.
package main

import rego.v1

deny contains msg if {
	r := input.resource_changes[_]
	r.type == "aws_s3_bucket"
	r.change.after.acl == "public-read"
	msg := sprintf("DENY: bucket %v has public-read ACL", [r.address])
}

deny contains msg if {
	r := input.resource_changes[_]
	r.type == "aws_security_group_rule"
	r.change.after.cidr_blocks[_] == "0.0.0.0/0"
	# Match any port range that includes 22, not only an exact 22-22 rule.
	r.change.after.from_port <= 22
	r.change.after.to_port >= 22
	msg := sprintf("DENY: %v opens SSH to the internet", [r.address])
}
```

```yaml
# .github/workflows/terraform-ci.yml (excerpted)
jobs:
  plan-and-policy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Terraform fmt check
        run: terraform fmt -check -recursive

      - name: Terraform validate
        run: |
          terraform init -backend=false
          terraform validate

      - name: Terraform plan
        run: |
          # plan needs the real backend, so initialise again with it enabled;
          # credential setup is omitted from this excerpt
          terraform init -input=false
          terraform plan -input=false -out=tfplan.binary
          terraform show -json tfplan.binary > tfplan.json

      - name: OPA / conftest policy gate
        run: conftest test tfplan.json --policy policy/

      - name: terraform test (unit)
        run: terraform test -filter=tests/fast_plan_tests.tftest.hcl
```

Never run terraform test with command = apply against a shared staging environment. These tests provision and destroy real resources — including ones that might match existing names. Isolate test runs in a dedicated throwaway AWS account or project. At scale, teams use Terratest or the native test framework with per-run randomised resource name suffixes (${random_pet.suffix.id}) to guarantee no collision.
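An alternative to generating the suffix inside Terraform is to generate it in the wrapper that launches the tests. The sketch below is a hedged Python illustration of that swapped-in approach, not the random_pet technique named above. It assumes the module exposes a bucket_name variable and that the test file does not pin it, since values set in a test file's variables block are documented to take precedence over command-line input. unique_name and terraform_test_argv are hypothetical helpers.

```python
import secrets


def unique_name(prefix="tftest", random_bytes=4):
    """Return a lowercase name such as 'tftest-1a2b3c4d' with a fresh random suffix."""
    return f"{prefix}-{secrets.token_hex(random_bytes)}"


def terraform_test_argv(bucket_name, test_file=None):
    """Build the argument list for one isolated `terraform test` run."""
    argv = ["terraform", "test", "-var", f"bucket_name={bucket_name}"]
    if test_file is not None:
        argv.append(f"-filter={test_file}")
    return argv


if __name__ == "__main__":
    name = unique_name()
    # Printing rather than executing keeps the sketch free of side effects;
    # a CI wrapper could pass the list to subprocess.run instead.
    print(" ".join(terraform_test_argv(name, "tests/s3_bucket.tftest.hcl")))
```

Generating the name outside Terraform keeps the module itself free of test-only resources, at the cost of moving the uniqueness guarantee into the CI wrapper.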

Putting It All Together — The CI Gate

A mature Terraform CI pipeline is a strict sequence of gates, each cheaper and faster than the one that follows it. Code reaches human review only after passing the pull-request gates (1 to 4); the most expensive gate runs after merge:

  1. fmt + validate — milliseconds, no credentials required.
  2. tflint / checkov / trivy — static analysis for security misconfigurations and deprecated APIs (30–60 seconds).
  3. Plan + policy check — requires cloud read credentials; blocks on any policy violation.
  4. terraform test (plan-mode) — fast assertions on computed values.
  5. terraform test (apply-mode) — full integration tests in isolated accounts; runs on merge to main, not every PR.
Google's internal IaC teams gate on all five steps before any change reaches a production root module. The investment pays back within weeks: a single prevented outage typically saves more engineer-hours than building the entire test suite.
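The fail-fast ordering of the pull-request gates can be sketched as a small runner that stops at the first non-zero exit code. This is an illustrative Python sketch under stated assumptions: Gate, run_gate, run_gates, and PR_GATES are hypothetical names, the commands are taken from the CI excerpt in this lesson, and the bare tflint invocation is an assumption about how the linter is called in a given repository.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Gate:
    name: str
    argv: Sequence[str]
    stdout_path: Optional[str] = None  # if set, stdout is written to this file


# Cheapest gates first, mirroring steps 1 to 4 of the list above.
PR_GATES = [
    Gate("fmt", ["terraform", "fmt", "-check", "-recursive"]),
    Gate("validate", ["terraform", "validate"]),
    Gate("tflint", ["tflint"]),  # assumed invocation
    Gate("plan", ["terraform", "plan", "-out=tfplan.binary"]),
    Gate("export plan", ["terraform", "show", "-json", "tfplan.binary"], "tfplan.json"),
    Gate("policy", ["conftest", "test", "tfplan.json", "--policy", "policy/"]),
    Gate("plan-mode tests", ["terraform", "test", "-filter=tests/fast_plan_tests.tftest.hcl"]),
]


def run_gate(gate):
    """Run one gate and return its exit code (127 if the tool is not installed)."""
    try:
        if gate.stdout_path is None:
            return subprocess.run(gate.argv).returncode
        with open(gate.stdout_path, "w", encoding="utf-8") as out:
            return subprocess.run(gate.argv, stdout=out).returncode
    except FileNotFoundError:
        return 127


def run_gates(gates):
    """Run gates in order; return the name of the first failing gate, or None."""
    for gate in gates:
        print(f"==> {gate.name}", flush=True)
        if run_gate(gate) != 0:
            return gate.name  # later, more expensive gates are skipped
    return None


def main():
    """Entry point for CI: returns 0 when every gate passes, 1 otherwise."""
    failed = run_gates(PR_GATES)
    if failed is not None:
        print(f"gate failed: {failed}", file=sys.stderr)
        return 1
    return 0
```

A CI job would call sys.exit(main()) from the repository root. The apply-mode suite is left out of PR_GATES on purpose, since it runs on merge to main rather than on every pull request.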