Container & IaC Scanning
A Dockerfile is not just a build script; it is a security boundary specification. Every FROM line inherits an attack surface. Every RUN apt-get install bakes in whatever package version is current at build time, and that version drifts into CVE territory the day you stop rebuilding. Infrastructure-as-Code files (Terraform, Kubernetes manifests, Helm charts, CloudFormation) carry a parallel risk: a misconfigured privileged: true field or a publicly exposed S3 bucket defined in HCL is just as dangerous as a vulnerable library. This lesson teaches you to catch both classes of defect in CI, before the image or the resource ever reaches a real environment.
Image CVE Scanning: How It Actually Works
Container image scanners extract the software inventory from a layered filesystem (OS packages from /var/lib/dpkg/status or /var/lib/rpm; language packages from package-lock.json, go.sum, Pipfile.lock, and so on) and match that inventory against vulnerability databases: NVD, GitHub Advisory Database, Red Hat advisories, and distro-specific feeds. The two dominant open-source tools are Trivy (by Aqua Security) and Grype (by Anchore). Both can scan an image by name, from a tarball, or directly from a filesystem, which matters because you want to scan in CI before the image is pushed, not after.
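The extract-then-match loop described above can be sketched in a few lines of Python. This is a simplified illustration, not how Trivy or Grype are implemented: the Advisory record is a hypothetical stand-in for a real feed schema, the sample package versions and CVE IDs are placeholders, and matching is by exact version string, whereas real scanners compare distro-specific version ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """Hypothetical, simplified advisory record (not a real feed schema)."""
    cve_id: str
    package: str
    vulnerable_versions: frozenset


# Illustrative excerpt in the format of /var/lib/dpkg/status (placeholder values).
SAMPLE_STATUS = """\
Package: openssl
Status: install ok installed
Version: 3.0.2-0ubuntu1.10
Description: Secure Sockets Layer toolkit
 This continuation line is ignored by the parser.

Package: zlib1g
Status: install ok installed
Version: 1:1.2.11.dfsg-2ubuntu9.2
"""


def parse_dpkg_status(text):
    """Return {package: version} from dpkg status text (blank-line separated stanzas)."""
    inventory = {}
    for stanza in text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Continuation lines start with whitespace; this sketch skips them.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if "Package" in fields and "Version" in fields:
            inventory[fields["Package"]] = fields["Version"]
    return inventory


def match_advisories(inventory, advisories):
    """Return (cve_id, package, version) for each installed package an advisory flags."""
    findings = []
    for adv in advisories:
        version = inventory.get(adv.package)
        if version is not None and version in adv.vulnerable_versions:
            findings.append((adv.cve_id, adv.package, version))
    return findings
```

Calling match_advisories(parse_dpkg_status(SAMPLE_STATUS), advisories) with a list of Advisory records returns only the findings for packages that are actually installed at a flagged version.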
Trivy is a de facto standard in modern pipelines. It is a single binary with no daemon, it handles container images, filesystems, git repos, and IaC files in one tool, and it emits SARIF output that integrates with GitHub Advanced Security code scanning. GitLab Security Dashboards consume GitLab's own report format, for which Trivy provides an output template.
--exit-code 1 is what makes the scanner a gate rather than a reporter. Without it, Trivy will list every CVE and exit 0 — your pipeline looks green while shipping a critical RCE. Always set an explicit exit code; set the severity threshold to match your organization's SLA (typically HIGH,CRITICAL for blocking, MEDIUM for warning-only).
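To make the gate-versus-reporter distinction concrete, here is a minimal Python sketch of the decision that --exit-code 1 combined with a severity filter amounts to. It assumes a Trivy JSON report with a top-level Results list whose entries carry a Vulnerabilities list with Severity fields; treat those field names as an assumption to check against your Trivy version. The BLOCKING set is an example threshold, not a recommendation.

```python
import json

# Example blocking threshold; set this to match your organization's SLA.
BLOCKING = frozenset({"HIGH", "CRITICAL"})


def blocking_findings(report_json, blocking=BLOCKING):
    """Return the vulnerability entries whose severity is in the blocking set."""
    report = json.loads(report_json)
    hits = []
    # "Results" and "Vulnerabilities" may be missing or null when nothing was found.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in blocking:
                hits.append(vuln)
    return hits


def gate_exit_code(report_json, blocking=BLOCKING):
    """Return 1 if any blocking finding exists, else 0 (a reporter would always return 0)."""
    return 1 if blocking_findings(report_json, blocking) else 0
```

A CI step could pass the result to sys.exit so that the job fails exactly when a blocking finding is present, while lower severities are only reported.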
Base Image Strategy: The Biggest Lever
The fastest way to eliminate CVEs is to shrink the base image. A typical ubuntu:22.04-based image ships 200–400 OS packages. A gcr.io/distroless/base-debian12 image ships roughly 20. A scratch-based image holding a statically linked Go binary ships zero OS packages. Google, which maintains the distroless images, treats minimal runtime images as a production best practice, not because they are trendy but because the attack surface reduction is dramatic and measurable.
The standard multi-stage Dockerfile pattern achieves this without sacrificing developer ergonomics: build in a full SDK image, copy only the compiled artifact into a minimal runtime image.
The tag in FROM golang:1.22-alpine is mutable: it can be silently repointed to a different (potentially vulnerable) image. Use FROM golang:1.22-alpine@sha256:<digest> in production Dockerfiles. Dependabot and Renovate can both open pull requests that update digest-pinned base images.
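Digest pinning is easy to enforce mechanically. The following Python sketch lists base images in a Dockerfile that lack an @sha256: digest. It is an illustrative lint rather than an existing tool: it skips scratch and references to earlier build stages, and it will also flag images supplied through build arguments such as ${BASE}, which may or may not be what you want.

```python
import re

# Matches "FROM [--platform=...] <image>" and captures the image reference.
FROM_RE = re.compile(r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE)
ALIAS_RE = re.compile(r"\s+AS\s+(\S+)\s*$", re.IGNORECASE)


def unpinned_base_images(dockerfile_text):
    """Return base image references that are not pinned by digest."""
    stage_names = set()
    unpinned = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        image = match.group(1)
        # A FROM that names an earlier stage is not an external image.
        is_stage_ref = image.lower() in stage_names
        alias = ALIAS_RE.search(line)
        if alias:
            stage_names.add(alias.group(1).lower())
        if is_stage_ref or image.lower() == "scratch":
            continue
        if "@sha256:" not in image:
            unpinned.append(image)
    return unpinned
```

Failing the build when the returned list is non-empty turns the pinning advice into an enforced rule.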
IaC Scanning: Catching Misconfigs Before Apply
Terraform, Kubernetes manifests, Helm charts, and CloudFormation templates encode your security posture as code. A privileged: true in a DaemonSet, an S3 bucket with block_public_acls = false, or a security group rule with cidr_blocks = ["0.0.0.0/0"] on port 22 is a misconfiguration that will reach production the moment someone runs terraform apply or kubectl apply — unless you scan it first.
The two leading IaC scanners are Checkov (by Bridgecrew/Prisma Cloud) and Trivy's built-in config scanner. Both support Terraform HCL, Kubernetes YAML, Helm, Dockerfile, CloudFormation, and ARM templates. For fine-grained, policy-as-code control, OPA Conftest lets you write custom Rego policies against any structured config file.
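As a rough picture of what such a scanner checks, the Python sketch below inspects a Terraform plan rendered as JSON (the output of terraform show -json on a saved plan) for two of the misconfigurations named above. It assumes the plan's resource_changes entries expose the planned values under change.after, and it covers only inline ingress rules on aws_security_group and the aws_s3_bucket_public_access_block resource. Real scanners ship hundreds of rules and also handle cases ignored here, such as all-ports rules expressed with protocol -1.

```python
import json


def find_misconfigs(plan_json):
    """Return human-readable findings for a Terraform plan in JSON form."""
    findings = []
    plan = json.loads(plan_json)
    for rc in plan.get("resource_changes") or []:
        after = (rc.get("change") or {}).get("after") or {}
        address = rc.get("address", "<unknown>")
        rtype = rc.get("type")
        if rtype == "aws_security_group":
            for rule in after.get("ingress") or []:
                from_port = rule.get("from_port")
                to_port = rule.get("to_port")
                if from_port is None or to_port is None:
                    continue
                open_to_world = "0.0.0.0/0" in (rule.get("cidr_blocks") or [])
                if open_to_world and from_port <= 22 <= to_port:
                    findings.append(f"{address}: port 22 open to 0.0.0.0/0")
        elif rtype == "aws_s3_bucket_public_access_block":
            # Only an explicit false is flagged; unknown (null) values are skipped.
            if after.get("block_public_acls") is False:
                findings.append(f"{address}: block_public_acls is false")
    return findings
```

Running a check like this against the plan, and failing the job on any finding, is what lets the gate fire before terraform apply is ever run.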
The Scanning Architecture in CI
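At production scale, image and IaC scanning jobs run in parallel with other CI stages to avoid adding latency to the critical path. In a GitHub Actions pipeline, that means defining each scan as its own job, since jobs run in parallel unless one declares a dependency on another with needs.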
Common Critical Findings and How to Fix Them
Understanding the most frequent high-severity findings helps you prioritize remediation. These are the patterns that consistently appear across large engineering organizations:
- Running as root: the default in almost every Dockerfile. Fix: add USER 1001:1001 or use USER nonroot from distroless. Kubernetes: enforce with a PodSecurity admission policy (runAsNonRoot: true).
- Writable root filesystem: allows an attacker who achieves code execution to persist changes. Fix: readOnlyRootFilesystem: true in the container's securityContext, with explicit emptyDir mounts for writable paths the app needs.
- Privileged container: equivalent to root on the host node. This should never appear outside very specific CNI/storage plugins. Zero tolerance: privileged: false must be policy, not a reminder.
- Missing resource limits: not a CVE but a Checkov HIGH, because an unbounded container can OOM-kill its neighbors. Always set resources.limits.cpu and resources.limits.memory.
- Outdated OS packages in base image: the most common CVE source. Fix: rebuild from a current base weekly via automated Dependabot/Renovate PRs, or use a distroless image to eliminate the OS layer entirely.
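The first four findings in this list are plain field checks on a container spec, which the Python sketch below illustrates. It takes a container entry from a pod spec as a dictionary (as you would get after loading the manifest) plus the optional pod-level securityContext, and it assumes the container-level runAsNonRoot setting takes precedence over the pod-level one. The function name and message wording are illustrative; outdated OS packages are not covered because they require an image scan rather than a manifest check.

```python
def check_container(container, pod_security_context=None):
    """Return a list of findings for one container entry of a pod spec."""
    findings = []
    sc = container.get("securityContext") or {}
    pod_sc = pod_security_context or {}

    # Container-level setting wins; otherwise fall back to the pod-level one.
    run_as_non_root = sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot", False))
    if not run_as_non_root:
        findings.append("may run as root (runAsNonRoot is not true)")

    if not sc.get("readOnlyRootFilesystem", False):
        findings.append("writable root filesystem")

    if sc.get("privileged", False):
        findings.append("privileged container")

    limits = (container.get("resources") or {}).get("limits") or {}
    for key in ("cpu", "memory"):
        if key not in limits:
            findings.append(f"missing resources.limits.{key}")
    return findings
```

An empty result means the container passes all four checks; anything else can be printed and used to fail the pipeline.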
Every .trivyignore entry must have an expiry date (exp:YYYY-MM-DD) and a linked ticket. Without an expiry, suppressed CVEs accumulate silently. A common incident pattern: a team suppresses a CVE pending a base image upgrade, the upgrade is deprioritized, the CVE is exploited six months later, and the post-mortem discovers the scanner had been reporting it green the whole time.
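A rule like this only holds if something checks it. The Python sketch below lints the text of a .trivyignore file, assuming one vulnerability ID per line followed by an optional exp:YYYY-MM-DD token and lines starting with # as comments. The ticket convention is hypothetical: the sketch requires the comment line directly above each entry to contain the marker "ticket:", which you would replace with whatever your tracker uses.

```python
from datetime import date


def lint_trivyignore(text, today):
    """Return a list of problems found in .trivyignore content."""
    problems = []
    previous_comment = ""
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            previous_comment = ""
            continue
        if line.startswith("#"):
            previous_comment = line
            continue

        parts = line.split()
        vuln_id = parts[0]
        expiry = next((p[len("exp:"):] for p in parts[1:] if p.startswith("exp:")), None)
        if expiry is None:
            problems.append(f"line {lineno}: {vuln_id} has no exp: date")
        else:
            try:
                if date.fromisoformat(expiry) < today:
                    problems.append(f"line {lineno}: {vuln_id} suppression expired on {expiry}")
            except ValueError:
                problems.append(f"line {lineno}: {vuln_id} has a malformed exp: value")

        # Hypothetical convention: the comment directly above links the ticket.
        if "ticket:" not in previous_comment:
            problems.append(f"line {lineno}: {vuln_id} has no linked ticket")
        previous_comment = ""
    return problems
```

Run as a CI step with today set to date.today(), this reports expired suppressions instead of letting them keep the scanner green.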
Scanning in Admission Control: The Last Wall
CI scanning gates are necessary but not sufficient. Engineers can push images via CLI, CI pipelines can be bypassed, and third-party images may enter your cluster from Helm charts. The defense-in-depth approach adds admission-time scanning via Kyverno or OPA Gatekeeper policies that check image provenance and scan status before pods are scheduled.
A Kyverno policy can, for example, block images that have not been scanned by Trivy in the last 24 hours, using an attestation annotation set by CI.
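The decision such a policy encodes is simple enough to sketch outside the cluster. The Python below is not Kyverno syntax; it only illustrates the freshness check, under the assumption that CI writes an ISO 8601 timestamp with a UTC offset into an annotation. The annotation key example.com/trivy-scanned-at is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical annotation key that CI would set after a passing scan.
SCAN_ANNOTATION = "example.com/trivy-scanned-at"
MAX_AGE = timedelta(hours=24)


def admit(annotations, now):
    """Return (allowed, reason) for a workload carrying the given annotations."""
    stamp = annotations.get(SCAN_ANNOTATION)
    if stamp is None:
        return False, "no scan attestation"
    try:
        scanned_at = datetime.fromisoformat(stamp)
    except ValueError:
        return False, "unparseable scan timestamp"
    if scanned_at.tzinfo is None:
        return False, "scan timestamp has no UTC offset"
    age = now - scanned_at
    if age < timedelta(0):
        return False, "scan timestamp is in the future"
    if age > MAX_AGE:
        return False, "scan is older than 24 hours"
    return True, "ok"
```

Here now is expected to be a timezone-aware datetime, for example datetime.now(timezone.utc). Denying on a missing or malformed annotation is the important design choice: the check fails closed, so an image that bypassed CI is rejected rather than waved through.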
At scale, teams integrate with a container registry that supports continuous background scanning — Docker Hub, ECR, GCR, and Artifact Registry all offer this. The registry scans every image on push and re-scans daily against fresh CVE feeds, so even images that passed CI yesterday are flagged if a new critical advisory drops overnight. The runtime posture stays current without requiring a new build.
The full scanning posture — CI gate blocking on build, IaC gate blocking on plan, admission control blocking on schedule, and registry scanning flagging on new CVEs — closes the window from image creation to deployment to runtime without any single point of failure. Each layer catches what the previous layer missed.