Advanced Docker & Container Security

Image Scanning & Vulnerabilities

18 min Lesson 4 of 28


Every container image you ship is a software supply-chain artifact. It contains an OS layer, system libraries, language runtimes, and your application dependencies — any of which may carry known security vulnerabilities (CVEs). At big-tech scale, security teams routinely block deployments when critical CVEs are present, and on-call engineers get paged when a zero-day appears in a base image that is already running in production.

This lesson covers the two scanners most widely used in professional pipelines — Trivy and Grype — how to read and triage their output, how to decide which CVEs actually matter, and the base-image update strategy that keeps your fleet clean without breaking production.

Why Image Scanning Belongs in CI, Not Just Pre-Deployment

The instinct is to scan once before you push. The reality is that a CVE published on any given Tuesday can affect images you built and shipped months ago. Big-tech security posture requires:

Shift-left scanning — fail the build in CI when new Critical/High CVEs appear in your image.
Continuous registry scanning — nightly re-scans of every tag in production, because CVE databases are updated daily.
Automated rebases — triggered when the upstream base image publishes a patched digest.

Scanning only at build time without continuous re-scanning is security theatre. Your image is clean on Monday and vulnerable on Thursday when the NVD publishes a new advisory for OpenSSL.

Trivy — Industry Standard Scanner

Trivy (by Aqua Security) is the de facto standard in most pipelines. It scans OS packages (Alpine apk, Debian apt, RHEL rpm), language ecosystem files (package-lock.json, go.sum, requirements.txt, Gemfile.lock, pom.xml), and can also lint Dockerfiles and Kubernetes manifests for misconfigurations.

# Install Trivy (macOS)
brew install trivy

# Install Trivy (Linux — one-liner via official script)
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh -s -- -b /usr/local/bin

# Scan a local image — table output, all severities
trivy image myapp:latest

# Scan a remote registry image
trivy image nginx:1.27-alpine

# Only report Critical and High, exit non-zero if any found (CI gate)
trivy image --severity CRITICAL,HIGH --exit-code 1 myapp:latest

# Scan a tarball (useful for air-gapped CI runners)
docker save -o myapp.tar myapp:latest
trivy image --input myapp.tar

# Output as SARIF for GitHub Advanced Security
trivy image --format sarif --output trivy-results.sarif myapp:latest

# Scan filesystem (great for scanning source code without building)
trivy fs --severity CRITICAL,HIGH ./

Grype — The Anchore Alternative

Grype (by Anchore) is Trivy's main rival. Both build their vulnerability databases from largely overlapping sources (NVD, GitHub Security Advisories, OS-vendor advisories), so their results are similar but not identical. Grype pairs with Syft, which generates a reusable SBOM (Software Bill of Materials) that any compatible tool can scan, a key requirement in regulated environments and US federal procurement (NIST SSDF, EO 14028).

# Install Grype
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin

# Basic scan
grype myapp:latest

# Exit non-zero if any finding is High or above (all severities are still listed)
grype --fail-on high myapp:latest

# Generate SBOM with Syft, then scan the SBOM with Grype (decoupled approach)
syft myapp:latest -o spdx-json > sbom.json
grype sbom:sbom.json --fail-on critical

# Grype also reads tarballs created by docker save
grype docker-archive:myapp.tar

SBOM-first scanning is the emerging standard. Generate the SBOM once at build time (attach it to the image with syft attest or store it in your registry), then re-scan that SBOM nightly without rebuilding the image. This is dramatically cheaper than re-pulling and re-scanning multi-GB images every night.
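To make concrete why re-scanning an SBOM is cheap, the sketch below reads an SPDX JSON document and reduces it to the (name, version) inventory a scanner matches against advisories. It assumes the standard SPDX 2.x JSON layout with a top-level packages array; the helper name sbom_inventory and the sample package versions are illustrative, not output from a real image.

```python
import json


def sbom_inventory(spdx_json: str) -> list:
    """Return sorted, de-duplicated (name, version) pairs from an SPDX JSON SBOM."""
    document = json.loads(spdx_json)
    pairs = {
        (pkg["name"], pkg.get("versionInfo", ""))
        for pkg in document.get("packages", [])
    }
    return sorted(pairs)


# Illustrative SBOM fragment; a real one would come from `syft ... -o spdx-json`.
SAMPLE_SBOM = json.dumps({
    "spdxVersion": "SPDX-2.3",
    "packages": [
        {"name": "openssl", "versionInfo": "3.3.1-r0"},
        {"name": "busybox", "versionInfo": "1.36.1-r29"},
        {"name": "openssl", "versionInfo": "3.3.1-r0"},  # duplicate entry
    ],
})

if __name__ == "__main__":
    for name, version in sbom_inventory(SAMPLE_SBOM):
        print(name, version)
```

The nightly job only needs this small inventory plus a fresh vulnerability database, which is why it can skip pulling image layers entirely.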

Reading Scanner Output — CVE Triage

A raw Trivy scan of a real production image will surface dozens to hundreds of findings. Not all are equal. The professional triage workflow uses four filters:

Severity — CRITICAL and HIGH are mandatory; MEDIUM are tracked; LOW and UNKNOWN are almost never actioned unless the library is directly invoked by your code.
Fixed version available — if no fix exists, you cannot patch it by upgrading; you can only mitigate by removing the component or accept the risk. CVEs with no fix available are triaged as "monitor" rather than "block".
Reachability — is the vulnerable code path actually invoked by your application? Trivy and Grype match installed package versions against advisory data; they do not check whether the vulnerable function is ever called. A vulnerable XML parser in a package you use only for JSON processing is low actual risk.
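CVSS vs EPSS — CVSS measures how severe a vulnerability is and maps scores to labels (Critical = 9.0–10.0). EPSS (Exploit Prediction Scoring System) estimates the probability of exploitation in the wild in the next 30 days. A CVSS 9.8 with EPSS 0.001 is usually lower priority than a CVSS 7.5 with EPSS 0.45. Recent Grype releases include EPSS data in their output. Trivy's --vex flag is a different mechanism: it applies VEX (Vulnerability Exploitability eXchange) statements to filter findings, so EPSS scores for Trivy results typically have to be joined in from an external source such as the FIRST EPSS feed.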

CVE triage decision flow: severity, fix availability, and EPSS exploit probability together determine the response urgency.
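The triage filters can be sketched as a small function over a Trivy JSON report (trivy image --format json). This is a minimal illustration, not a recommended policy: the field names follow Trivy's JSON output as commonly documented, the EPSS scores are assumed to arrive as a separate CVE-to-probability mapping, the 0.10 threshold is hypothetical, and the CVE IDs are placeholders. Reachability is deliberately left to human review.

```python
from dataclasses import dataclass
from typing import Dict, List

BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}
EPSS_URGENT = 0.10  # hypothetical threshold; tune to your own risk appetite


@dataclass
class Finding:
    cve_id: str
    package: str
    severity: str
    fixed_version: str  # empty string when no fix has been published


def load_findings(report: dict) -> List[Finding]:
    """Flatten Results[].Vulnerabilities[] from a Trivy JSON report."""
    findings = []
    for result in report.get("Results", []):
        # "Vulnerabilities" is absent or null for targets with no findings.
        for vuln in result.get("Vulnerabilities") or []:
            findings.append(Finding(
                cve_id=vuln["VulnerabilityID"],
                package=vuln["PkgName"],
                severity=vuln.get("Severity", "UNKNOWN"),
                fixed_version=vuln.get("FixedVersion", ""),
            ))
    return findings


def triage(finding: Finding, epss: Dict[str, float]) -> str:
    """Return 'block', 'monitor', 'track', or 'ignore' for one finding."""
    score = epss.get(finding.cve_id, 0.0)
    if finding.severity in BLOCKING_SEVERITIES:
        # Without a published fix, upgrading cannot help: watch, do not block.
        return "block" if finding.fixed_version else "monitor"
    if finding.severity == "MEDIUM":
        # Promote fixable mediums that are likely to be exploited.
        if score >= EPSS_URGENT and finding.fixed_version:
            return "block"
        return "track"
    return "ignore"


# Placeholder report and scores for illustration only.
SAMPLE_REPORT = {
    "Results": [
        {
            "Target": "myapp:latest",
            "Vulnerabilities": [
                {"VulnerabilityID": "CVE-0000-0001", "PkgName": "libexample",
                 "Severity": "CRITICAL", "FixedVersion": "1.2.4"},
                {"VulnerabilityID": "CVE-0000-0002", "PkgName": "libother",
                 "Severity": "HIGH"},
                {"VulnerabilityID": "CVE-0000-0003", "PkgName": "libthird",
                 "Severity": "MEDIUM", "FixedVersion": "2.0.1"},
            ],
        },
        {"Target": "app/package-lock.json"},
    ]
}
SAMPLE_EPSS = {"CVE-0000-0001": 0.001, "CVE-0000-0003": 0.45}

if __name__ == "__main__":
    for item in load_findings(SAMPLE_REPORT):
        print(item.cve_id, item.severity, triage(item, SAMPLE_EPSS))
```

Keeping the decision in one pure function makes the policy reviewable in a pull request, which is usually easier to audit than severity flags scattered across pipeline files.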

Integrating Trivy into a GitHub Actions Pipeline

The most common production pattern is a single job that builds the image, scans it, and fails if Critical/High vulnerabilities are found. Attaching the SARIF report to GitHub Advanced Security gives your security team a searchable, unified CVE dashboard across every repo.

name: Build & Scan

on:
  push:
    branches: [main]
  pull_request:

jobs:
  build-and-scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write   # required for SARIF upload

    steps:
      - uses: actions/checkout@v4

      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .

      # Trivy official action — caches DB between runs
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}
          format: sarif
          output: trivy-results.sarif
          severity: CRITICAL,HIGH
          exit-code: 1          # fails the job on findings

      - name: Upload SARIF to GitHub Security tab
        if: always()            # upload even if scan found vulnerabilities
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: trivy-results.sarif

Pin the Trivy action to a SHA rather than @master in production: aquasecurity/trivy-action@a20de5420d57c4102486cdd9349b532d1d638e94. Supply-chain attacks on GitHub Actions are real — a compromised action with write permissions could exfiltrate registry credentials from your runner environment.
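A pinning rule is easier to keep when something checks it. The sketch below scans workflow text for uses: references that are not pinned to a full 40-character commit SHA. It is a line-based heuristic rather than a YAML parser, and the function name unpinned_actions is an assumption made for this example.

```python
import re

# Matches "uses: owner/repo[/path]@ref" and captures the action and the ref.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([^\s@#]+)@([^\s#]+)")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list:
    """Return (line_number, action, ref) for each uses: not pinned to a full SHA."""
    hits = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if match and not FULL_SHA_RE.match(match.group(2)):
            hits.append((number, match.group(1), match.group(2)))
    return hits


WORKFLOW = """\
steps:
  - uses: actions/checkout@v4
  - name: Run Trivy vulnerability scanner
    uses: aquasecurity/trivy-action@master
  - uses: aquasecurity/trivy-action@a20de5420d57c4102486cdd9349b532d1d638e94
"""

if __name__ == "__main__":
    for number, action, ref in unpinned_actions(WORKFLOW):
        print(f"line {number}: {action}@{ref} is not pinned to a commit SHA")
```

Run against the files in .github/workflows/ as a CI step, a check like this would flag both the @v4 tag and the @master branch reference while accepting the SHA-pinned line.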

Base Image Update Strategy

Most CVEs in a container image come from the OS layer — not your application code. The fix is to rebuild on top of a patched base image. At big-tech companies, this process is automated:

Dependabot / Renovate for Docker — open a PR automatically when the upstream base image digest changes. Renovate is more configurable and widely preferred.
Nightly rebuild jobs — a scheduled pipeline that rebuilds every image from its base tag with docker build --pull --no-cache, so the current base is fetched instead of a cached layer. If alpine:3.20 received a patch, a nightly rebuild gets you the fix the next morning.
Digest pinning + automated updates — pin to FROM alpine:3.20@sha256:abc123 so builds are reproducible, but let Renovate update that SHA when a new digest is published.

# Example Renovate config in renovate.json: auto-update Docker digests
{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": ["config:recommended"],
  "packageRules": [
    {
      "matchManagers": ["dockerfile"],
      "matchUpdateTypes": ["digest", "patch", "minor"],
      "automerge": true,
      "automergeType": "pr"
    }
  ]
}

# Pin base image to digest in Dockerfile
FROM python:3.12-slim@sha256:4afe3b39b7f1bc61f946ba769ebd5ce48e61d0bda340b2daef34437d32b5f61f

Never auto-merge base image updates without a required passing scan job. A new base image digest can introduce new CVEs while fixing old ones. Renovate's automerge waits for the branch's status checks to pass by default (do not set ignoreTests), and making build-and-scan a required check in your branch protection rule ensures the scan must pass before the PR merges. Skipping this gate lets a "security patch" ship a regression straight to production.
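Digest pinning only helps if every FROM line follows it. As a rough sketch, the check below lists base images in a Dockerfile that lack an @sha256 digest. It handles multi-stage references and --platform flags but does not resolve ARG substitutions; the function name unpinned_base_images is hypothetical.

```python
import re

DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_base_images(dockerfile_text: str) -> list:
    """Return image references in FROM lines that lack an @sha256 digest."""
    stage_names = set()
    unpinned = []
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if not parts or parts[0].upper() != "FROM":
            continue
        # Skip flags such as --platform=linux/amd64 that precede the image.
        args = [p for p in parts[1:] if not p.startswith("--")]
        if not args:
            continue
        image = args[0]
        # "FROM build" refers to an earlier stage; scratch has no digest.
        is_stage = image in stage_names
        if not is_stage and image != "scratch" and not DIGEST_RE.search(image):
            unpinned.append(image)
        # Record "AS name" so later FROM lines can reference this stage.
        if len(args) >= 3 and args[1].upper() == "AS":
            stage_names.add(args[2])
    return unpinned


DOCKERFILE = """\
FROM python:3.12-slim@sha256:4afe3b39b7f1bc61f946ba769ebd5ce48e61d0bda340b2daef34437d32b5f61f AS build
FROM alpine:3.20
FROM build
"""

if __name__ == "__main__":
    for image in unpinned_base_images(DOCKERFILE):
        print(f"not pinned to a digest: {image}")
```

In this sample only alpine:3.20 is reported: the first stage is pinned, and the last FROM refers back to the build stage.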

Vulnerability Exceptions (VEX / .trivyignore)

In a mature pipeline, some CVEs are legitimately acceptable — the library is present but the vulnerable code path is unreachable, or a fix does not yet exist and the risk is formally accepted. Recording these decisions as code prevents them from being silently forgotten and re-opened on every scan run.

# .trivyignore — suppress specific CVEs with documented justification
# Format: CVE-ID [exp:YYYY-MM-DD]   (comments go on their own # lines)

# CVE-2023-44487 (HTTP/2 Rapid Reset) — mitigated by upstream ALB WAF rule
CVE-2023-44487

# CVE-2024-21626 (runc container escape) — fixed in our pinned runc version
# Trivy false-positive due to OS pkg version string mismatch; expires 2025-01-01
CVE-2024-21626 exp:2025-01-01

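# Grype equivalent: .grype.yaml
# (Grype ignore rules do not appear to have a built-in expiry field,
#  so record the review date in the reason or in the linked ticket.)
# ignore:
#   - vulnerability: CVE-2023-44487
#     reason: "Mitigated at ALB layer; review by 2025-06-01"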

Treat every suppressed CVE as technical debt. Set an expiry date, link to the JIRA/GitHub issue where the waiver was formally approved, and assign an owner. Security teams at mature companies audit suppression lists quarterly — undocumented, expired suppressions are a compliance finding, not just a best-practice gap.
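The quarterly audit can be partly automated. The following sketch groups .trivyignore entries into expired, missing-expiry, and active, assuming the plain-text format shown above (one ID per line with an optional exp:YYYY-MM-DD). The function name audit_trivyignore and the report keys are choices made for this example.

```python
import re
from datetime import date

ENTRY_RE = re.compile(r"^(?P<id>\S+)(?:\s+exp:(?P<exp>\d{4}-\d{2}-\d{2}))?\s*$")


def audit_trivyignore(text: str, today: date) -> dict:
    """Group suppressed IDs into 'expired', 'no_expiry', and 'active'."""
    report = {"expired": [], "no_expiry": [], "active": []}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no suppression
        match = ENTRY_RE.match(line)
        if not match:
            continue  # unrecognised line; a stricter audit could flag it
        expiry = match.group("exp")
        if expiry is None:
            report["no_expiry"].append(match.group("id"))
        elif date.fromisoformat(expiry) < today:
            report["expired"].append(match.group("id"))
        else:
            report["active"].append(match.group("id"))
    return report


TRIVYIGNORE = """\
# CVE-2023-44487 (HTTP/2 Rapid Reset), mitigated by upstream ALB WAF rule
CVE-2023-44487

# CVE-2024-21626 (runc container escape), fixed in our pinned runc version
CVE-2024-21626 exp:2025-01-01
"""

if __name__ == "__main__":
    result = audit_trivyignore(TRIVYIGNORE, date.today())
    for bucket, ids in result.items():
        print(bucket, ids)
```

Failing CI when no_expiry or expired is non-empty turns the "treat it as technical debt" rule into something the pipeline enforces rather than something reviewers have to remember.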

Summary

Image scanning is not a checkbox — it is a continuous process. Use Trivy or Grype in CI as a hard gate on Critical/High findings with available fixes. Triage aggressively using EPSS alongside CVSS severity. Automate base image updates with Renovate, gated behind your scan job. Record accepted risk in version-controlled suppression files with expiry dates. This posture — scan, triage, automate, document — is what separates professional container security from security theatre.