Terragrunt & DRY Pipelines
Terraform is a powerful IaC engine, but it has a structural blind spot: it offers no native mechanism for keeping the configuration that calls your modules DRY. As soon as you manage three environments (dev, staging, prod) across two regions you find yourself copy-pasting the same backend block, the same provider version, and the same module source into dozens of leaf main.tf files. Terragrunt is the thin orchestration wrapper that solves exactly this: it keeps root config in one place, wires up remote state automatically, expresses inter-stack dependencies declaratively, and lets you run run-all apply to converge an entire environment in topological order. At FAANG scale it is the difference between a 5-file infra monorepo and a 3,000-file copy-paste disaster.
What Terragrunt Actually Is
Terragrunt is a Go binary that wraps terraform. Every Terragrunt command (terragrunt plan, terragrunt apply, terragrunt run-all apply) resolves the configuration, prepares a working directory (a copy under .terragrunt-cache when the module source is remote), writes backend and provider configuration into it, then delegates to terraform. Your team does not write Terraform differently; they just stop writing boilerplate. Terragrunt reads terragrunt.hcl files that use HCL2 syntax and a set of Terragrunt-specific constructs: the remote_state, dependency, generate, and include blocks, plus the inputs attribute.
The Canonical Repo Layout
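The standard Terragrunt repo separates the what (Terraform modules) from the where and how (Terragrunt live configs). A typical three-environment AWS platform layout has one tree holding the reusable Terraform modules and one live tree with a root terragrunt.hcl, one directory per environment (dev, staging, prod), and one subdirectory per stack inside each environment, such as dev/rds or prod/eks.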
The magic lives in the root terragrunt.hcl. Every leaf stack includes it, which means backend configuration and required providers are written exactly once.
Root terragrunt.hcl — the Single Source of Truth
The key function is path_relative_to_include(). For the stack at prod/eks/terragrunt.hcl it returns prod/eks, so the S3 key becomes prod/eks/terraform.tfstate — unique per stack, zero manual naming, zero risk of collision.
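A root config along these lines could produce that per-stack key. This is a minimal sketch, not a definitive implementation: the bucket name, lock table, and region are hypothetical placeholders, and a real repo would normally derive them from shared config files rather than hard-coding them.

```hcl
# Root terragrunt.hcl (sketch). Bucket, table, and region values are hypothetical.
locals {
  aws_region = "us-east-1" # assumption: single hard-coded region for illustration
}

remote_state {
  backend = "s3"

  # Ask Terragrunt to write backend.tf into each stack's working directory.
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    bucket         = "example-org-terraform-state" # hypothetical bucket
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = local.aws_region
    encrypt        = true
    dynamodb_table = "example-terraform-locks" # hypothetical lock table
  }
}

# Generate provider.tf once, for every stack that includes this file.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "${local.aws_region}"
}
EOF
}
```

Because the key is computed from the including stack's relative path, no leaf ever names its own state file.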
Leaf Stack terragrunt.hcl — Module Calls Without Boilerplate
There is no backend.tf, no provider.tf, no versions.tf. Terragrunt generates all three at run time from the root config. The leaf file contains only what is unique to this stack: the module source, the version pin, and the inputs.
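A leaf file in this style might look like the following sketch. The module repository URL, the ref, and the input names are hypothetical; only the include, terraform, and inputs constructs are Terragrunt's own.

```hcl
# prod/eks/terragrunt.hcl (sketch). Source URL, ref, and inputs are hypothetical.

# Pull in the root config: backend and provider generation come from there.
include "root" {
  path = find_in_parent_folders()
}

# The module source with its version pin.
terraform {
  source = "git::https://example.com/example-org/modules.git//eks?ref=v1.4.0"
}

# Values unique to this stack, passed to the module as input variables.
inputs = {
  cluster_name = "prod-eks"
  node_count   = 3
}
```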
Dependency Wiring and the run-all Command
The dependency block is Terragrunt's most powerful feature. It reads the state of another stack (config_path) and exposes its outputs as a structured object. This replaces ad-hoc terraform_remote_state data sources and makes the dependency graph explicit and machine-readable.
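As an illustration, a stack that needs networking outputs could declare the dependency like this. The ../vpc path, the output names, and the mock values are assumptions made for the sketch.

```hcl
# Sketch of dependency wiring. Paths, output names, and mock values are hypothetical.
dependency "vpc" {
  config_path = "../vpc"

  # Placeholder outputs used when the upstream stack has not been applied yet.
  mock_outputs = {
    vpc_id             = "vpc-00000000"
    private_subnet_ids = ["subnet-00000000"]
  }

  # Only allow the mocks for commands that do not change infrastructure.
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  vpc_id     = dependency.vpc.outputs.vpc_id
  subnet_ids = dependency.vpc.outputs.private_subnet_ids
}
```

Restricting the mocks to validate and plan keeps placeholder values from ever reaching an apply.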
With dependencies declared, you can converge an entire environment with a single command, terragrunt run-all apply, run from that environment's directory.
run-all builds a directed acyclic graph (DAG) from all dependency blocks it finds in the target directory tree. Independent stacks run in parallel; dependent ones wait. This typically cuts environment-wide apply time by 60–80% compared to sequential execution.
DRY Config with account.hcl and environment.hcl
A production Terragrunt repo typically has two or three levels of shared config files that Terragrunt reads with read_terragrunt_config():
- account.hcl: at the environment directory level. Holds the AWS account ID, region, and environment name for that subtree.
- region.hcl: at a region-level directory if you run multiple regions. Holds the region string.
- Root terragrunt.hcl: reads both, generates provider and backend for every child automatically.
This means adding a fourth environment (e.g., perf) requires creating one directory, one account.hcl with three values, and copying the leaf terragrunt.hcl files. No backend blocks, no provider blocks, no versions files to touch.
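A sketch of how the shared files might fit together, assuming a hypothetical prod account: the account ID, region, and local names below are placeholders, not values from any real setup.

```hcl
# prod/account.hcl (sketch, hypothetical values)
locals {
  account_id  = "111111111111"
  aws_region  = "us-east-1"
  environment = "prod"
}
```

The root config can then look the file up relative to whichever leaf stack is being run:

```hcl
# Excerpt from the root terragrunt.hcl (sketch)
locals {
  # Walk up from the current stack until an account.hcl is found, then parse it.
  account = read_terragrunt_config(find_in_parent_folders("account.hcl"))

  account_id  = local.account.locals.account_id
  aws_region  = local.account.locals.aws_region
  environment = local.account.locals.environment
}
```

With this in place, a new environment only supplies its own account.hcl; the root locals resolve differently for each subtree.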
Never run run-all apply against production without a prior run-all plan reviewed in CI. The DAG execution is fast precisely because it is parallel, so a bad change can race to completion across multiple stacks before you can interrupt it. Production applies should always be gated on a human-approved plan stored as a CI artifact.
Terragrunt in CI/CD Pipelines
The recommended pipeline pattern uses run-all plan on PR open and run-all apply on merge to main, scoped to the changed environment directory.
Pin tf_version and tg_version in CI. Terragrunt releases are frequent and occasionally introduce behavioural changes. Uncontrolled version drift across developers and CI is a notorious source of "works on my machine" plan divergence. Keep a .terraform-version file and a .terragrunt-version file in the repo root and read them in your pipeline.
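As a complementary guard, Terragrunt can also check versions itself through two constraint attributes in terragrunt.hcl. The version ranges below are hypothetical; choose ones that match what your team has actually tested.

```hcl
# Root terragrunt.hcl excerpt (sketch). The version ranges are hypothetical.

# Fail fast if the local terraform binary is outside the tested range.
terraform_version_constraint = ">= 1.5.0, < 1.6.0"

# Fail fast if the terragrunt binary itself has drifted.
terragrunt_version_constraint = ">= 0.50.0, < 0.51.0"
```

This does not install anything; it only stops a run early when a developer or CI runner is on an unexpected version.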
Common Production Failure Modes
- Stale mock outputs in plan. If a dependency's mock outputs do not match the real outputs, plans look clean but applies fail. Audit mocks whenever the upstream module changes its output shape.
- Lock contention on run-all apply. Parallel stacks that share a DynamoDB lock table can hit throttling under heavy parallelism. Set --terragrunt-parallelism 4 to bound concurrency in environments with many stacks.
- State path collision after directory rename. If you rename dev/rds/ to dev/aurora/, Terragrunt generates a new S3 key. The old state is orphaned; the new path starts empty and tries to create everything again. Always migrate the state to the new key first, for example by copying the state object in S3, before the directory rename lands. Note that terraform state mv renames resource addresses inside a state and does not relocate the state file itself.
- Generated files committed to git. Terragrunt writes provider.tf and backend.tf into working directories. Add .terragrunt-cache/ and any generated *.tf files to .gitignore, because committing them breaks the DRY model.
Mastering Terragrunt turns a brittle collection of environment-specific Terraform directories into a coherent, auditable, and fast infrastructure platform. The investment in the root config pays for itself the day the second engineer joins.