Advanced Docker & Container Security

Multi-Stage Builds

Every production image you ship is also an attack surface. A container that carries a full compiler toolchain, test frameworks, and build caches alongside your application binary is not just wasteful — it is a liability. An attacker who gains code execution inside that container inherits every tool you left behind. Multi-stage builds solve this by separating the environment that compiles your software from the environment that runs it, so the final image contains only the artifacts that belong in production.

The Problem with Single-Stage Images

Before multi-stage builds (Docker 17.05, 2017), teams typically wrote one Dockerfile and either accepted a bloated image or maintained a fragile two-script dance: a build script on the host, then a copy into the image. Both approaches have well-known failure modes at scale:

Image bloat: A typical Go binary compiles to ~10 MB; the golang:1.22 base image is ~850 MB. Shipping that to 500 nodes on every deploy wastes bandwidth, slows pod startup, and inflates your container registry bill.
Leaked secrets: Build steps often need credentials such as package repository tokens, SSH keys, or NPM_TOKEN. A later RUN rm -rf ~/.ssh does not remove a secret from the image; the layer that added it is still shipped and can be unpacked from docker save output, and values passed as build arguments are visible in docker history --no-trunc.
Expanded CVE exposure: gcc, make, curl, git, and similar tools accumulate CVEs over time. Scanners such as Grype and Trivy will flag them even if they are never invoked at runtime.
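
As a hedged illustration of the three problems above, the following single-stage Dockerfile is a sketch of what not to ship. The id_rsa deploy key is hypothetical, and the ./cmd/api path simply mirrors the layout used later in this lesson.

```dockerfile
# Anti-pattern sketch: everything happens in one stage. Do not ship this.
FROM golang:1.22
WORKDIR /src

# Hypothetical deploy key, needed only to fetch private modules.
# This COPY writes the key into its own image layer.
COPY id_rsa /root/.ssh/id_rsa
COPY . .
RUN go build -o /usr/local/bin/api ./cmd/api

# Deletes the key from the final filesystem view only; the layer created
# by the COPY above still ships inside the image and can be unpacked.
RUN rm -rf /root/.ssh

# The compiler, git, curl, and the full source tree all remain in the image.
ENTRYPOINT ["/usr/local/bin/api"]
```

Unpacking the layer tarballs from docker save on this image would recover id_rsa, which is why secrets need to stay out of layers entirely rather than be deleted afterwards.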

How Multi-Stage Builds Work

Each FROM instruction in a Dockerfile starts a new, independent build stage. You can copy artifacts produced in one stage into a later stage using COPY --from=<stage>. Only the final stage (or the one selected with --target) ends up in the tagged image; intermediate stages are left out of it but stay in the build cache, so rebuilds remain fast.

Multi-stage build: the builder stage produces the binary; only the binary crosses into the lean runtime stage.

A Production-Grade Go Example

The following Dockerfile reflects patterns used in production Go services at scale. Every instruction has a deliberate reason.

# syntax=docker/dockerfile:1.7
# ──────────────────────────────────────────────
# Stage 0 — dependency cache
# ──────────────────────────────────────────────
FROM golang:1.22-alpine AS deps
WORKDIR /src
# Copy module files first; Docker caches this layer until go.mod/go.sum change.
COPY go.mod go.sum ./
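RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download -x

# ──────────────────────────────────────────────
# Stage 1 — build
# ──────────────────────────────────────────────
FROM deps AS builder
COPY . .
# Mount the module cache again (the downloads live there, not in a layer),
# plus Go's build cache so unchanged packages are not recompiled.
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
    go build -ldflags="-s -w" -trimpath -o /out/api ./cmd/api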

# ──────────────────────────────────────────────
# Stage 2 — runtime (scratch = nothing at all)
# ──────────────────────────────────────────────
FROM scratch AS runtime
# Pull in TLS CA certs from Alpine (scratch has none).
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
# Drop the compiled binary only.
COPY --from=builder /out/api /api
# Run as non-root UID; scratch has no /etc/passwd so we use numeric IDs.
USER 10001:10001
ENTRYPOINT ["/api"]

Key decisions to understand and defend in a code review:

CGO_ENABLED=0 produces a statically-linked binary with zero shared library dependencies, making FROM scratch viable.
-ldflags="-s -w" strips the symbol table and DWARF debug info, which typically shrinks the binary by 20–30%.
-trimpath removes local filesystem paths embedded in the binary, avoiding accidental host-path leaks in stack traces.
The deps stage is intentionally separate from the builder stage so that a code-only change does not re-download modules.
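--mount=type=cache is a BuildKit cache mount: whatever the command writes under the target path persists across builds on the same builder without ever appearing in a committed layer. The target has to be a directory the toolchain actually uses; in the official golang images that is /go/pkg/mod for downloaded modules and /root/.cache/go-build for compiled packages.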

Layer cache ordering is critical. Always copy dependency manifests (go.mod, package.json, requirements.txt) and install dependencies before copying source code. Because source code changes on every commit, placing COPY . . before go mod download would invalidate the dependency layer on every build and defeat the purpose of layer caching.
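
The same ordering applies to the other manifests named above. As a minimal sketch for a Python service, assuming a requirements.txt at the repository root and a hypothetical entry module named app:

```dockerfile
# syntax=docker/dockerfile:1.7
FROM python:3.12-slim AS builder
WORKDIR /app
# Install into a virtual environment so the whole dependency tree
# can be copied into the runtime stage as one directory.
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# Manifest first: this layer and the install below stay cached
# until requirements.txt itself changes.
COPY requirements.txt ./
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

FROM python:3.12-slim AS runtime
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# Source last: a code-only commit invalidates only this layer.
COPY . .
USER 10001:10001
# "app" is a hypothetical module name; substitute the real entry point.
CMD ["python", "-m", "app"]
```

Both stages use the same base image and the same /opt/venv path, which is what lets the virtual environment be copied across without being rebuilt.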

Node.js / TypeScript Example

Interpreted-language projects still benefit from multi-stage builds: you can transpile TypeScript, run npm ci with dev dependencies, and ship only the compiled JavaScript and production node_modules.

# syntax=docker/dockerfile:1.7
FROM node:22-alpine AS base
WORKDIR /app
COPY package.json package-lock.json tsconfig.json ./

FROM base AS dev-deps
RUN --mount=type=cache,target=/root/.npm \
    npm ci

FROM dev-deps AS build
COPY src ./src
RUN npm run build        # tsc outputs to /app/dist

FROM base AS prod-deps
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev

FROM node:22-alpine AS runtime
WORKDIR /app
ENV NODE_ENV=production
COPY --from=prod-deps /app/node_modules ./node_modules
COPY --from=build     /app/dist         ./dist
USER node
CMD ["node", "dist/server.js"]

Targeting Specific Stages

Multi-stage Dockerfiles double as a build matrix. You can build intermediate stages directly, which is useful for running tests inside the build environment without polluting the runtime image:

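# Build the stage that holds both the dev dependencies and the source
# (dev-deps alone has node_modules but no code to test), then run the unit tests.
# A non-zero exit from npm test fails the CI pipeline.
docker build --target build --tag myapp:test .
docker run --rm myapp:test npm test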

# Build the final production image only after tests pass.
docker build --tag myapp:latest .
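
An alternative to running tests with docker run is to give them a stage of their own. The sketch below is one possible layout, not the lesson's canonical Dockerfile: the test stage name is an assumption, and it presumes the test files live under src so that npm test can find them.

```dockerfile
# syntax=docker/dockerfile:1.7
FROM node:22-alpine AS build
WORKDIR /app
COPY package.json package-lock.json tsconfig.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci
COPY src ./src
RUN npm run build

# Hypothetical test stage: the build fails if npm test exits non-zero.
# Select it explicitly with: docker build --target test .
FROM build AS test
RUN npm test

FROM node:22-alpine AS runtime
WORKDIR /app
ENV NODE_ENV=production
COPY package.json package-lock.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev
COPY --from=build /app/dist ./dist
USER node
CMD ["node", "dist/server.js"]
```

Because runtime does not depend on test, BuildKit skips the test stage when building the default target, so CI has to request it with --target test before building the production image.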

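Build cache in CI. On GitHub Actions or GitLab CI, runners usually start with an empty local cache, so export the BuildKit layer cache to your registry and import it on the next run. This restores fast layer reuse without maintaining a separate cache volume, although the contents of --mount=type=cache directories are not included in the export and start empty on a fresh runner: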

docker buildx build --cache-from type=registry,ref=ghcr.io/org/app:cache --cache-to type=registry,ref=ghcr.io/org/app:cache,mode=max .

Production Failure Modes

Multi-stage builds surface a class of bugs that single-stage builds hide:

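Missing runtime libraries. If you did not use CGO_ENABLED=0 (or equivalent static linking), your binary may depend on the builder's libc (musl on Alpine, glibc on Debian-based images) or other shared objects that do not exist in scratch. The container fails to start with a misleading "not found" or "no such file or directory" error, because the missing piece is the dynamic loader rather than the binary itself. Fix: run ldd /out/binary in the builder stage to confirm the binary is static, or switch to a runtime base that ships the same libc as the builder (for example, a glibc-based distroless image paired with a Debian-based builder).
Missing timezone data. FROM scratch has no /usr/share/zoneinfo. If your application calls time.LoadLocation with a named zone, the call returns an error at runtime. Fix: COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo, after making sure the builder actually has it (Alpine-based images need apk add --no-cache tzdata first), or embed the database in the binary by importing time/tzdata.
Build args not crossing stage boundaries. An ARG declared in stage 0 is not automatically available in stage 2. Re-declare ARG after each FROM when you need it. Never pass secrets as ARG, because the values are recorded in the image history; use --mount=type=secret instead.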
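
The timezone and build-argument fixes above can be combined in one Dockerfile. This is a sketch under assumptions: the VERSION argument, the main.version variable it is linked into, and the netrc secret id are all hypothetical names chosen for illustration.

```dockerfile
# syntax=docker/dockerfile:1.7
FROM golang:1.22-alpine AS builder
# ARG is scoped to the stage that declares it.
ARG VERSION=dev
# Alpine does not ship zoneinfo by default; install it so it can be copied later.
RUN apk add --no-cache tzdata
WORKDIR /src
COPY go.mod go.sum ./
# The secret is mounted for this RUN only and is never written to a layer.
# If no secret is supplied at build time, the mount is simply skipped.
RUN --mount=type=secret,id=netrc,target=/root/.netrc \
    go mod download
COPY . .
RUN CGO_ENABLED=0 go build -trimpath \
    -ldflags="-s -w -X main.version=${VERSION}" \
    -o /out/api ./cmd/api

FROM scratch AS runtime
# Re-declare to bring VERSION into this stage; it is not inherited from builder.
ARG VERSION=dev
LABEL org.opencontainers.image.version="${VERSION}"
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo
COPY --from=builder /out/api /api
USER 10001:10001
ENTRYPOINT ["/api"]
```

A build of this sketch would pass both inputs on the command line, for example docker build --build-arg VERSION=1.4.2 --secret id=netrc,src=$HOME/.netrc . where the version string and the netrc path are placeholders.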

Do not use ADD to pull remote tarballs in multi-stage builds. ADD https://example.com/tool.tar.gz /opt/ does not unpack a remote archive and, unless you supply a checksum, does not verify what it downloaded. Use RUN curl | tar -xz inside a dedicated stage (with a --mount=type=cache if the download is large) and check a pinned checksum, or better yet, copy the tool from an image pinned to a specific digest using an explicit FROM for that tool.
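
One way to follow this advice is a dedicated fetch stage that verifies a pinned checksum before unpacking. The sketch below reuses the example.com URL from the text; TOOL_SHA256 is a hypothetical build argument that must be supplied with the archive's real SHA-256 digest, and /opt/tool is an assumed install path.

```dockerfile
# syntax=docker/dockerfile:1.7
FROM alpine:3.20 AS fetch
ARG TOOL_URL=https://example.com/tool.tar.gz
# Hypothetical: pass the real digest with --build-arg TOOL_SHA256=...
ARG TOOL_SHA256
# Download, verify against the pinned digest, then unpack.
# sha256sum -c exits non-zero on a mismatch, which aborts the build.
RUN apk add --no-cache curl \
 && curl -fsSL "${TOOL_URL}" -o /tmp/tool.tar.gz \
 && echo "${TOOL_SHA256}  /tmp/tool.tar.gz" | sha256sum -c - \
 && mkdir -p /opt/tool \
 && tar -xzf /tmp/tool.tar.gz -C /opt/tool

FROM alpine:3.20 AS runtime
# Only the verified, unpacked files cross into the final image;
# curl and the downloaded archive stay behind in the fetch stage.
COPY --from=fetch /opt/tool /opt/tool
```

Leaving TOOL_SHA256 without a default is deliberate in this sketch: a build that forgets to pin the digest fails at the checksum step instead of silently trusting the download.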

Measuring Impact

After building, always verify the improvement by comparing image sizes, scanning both images, and inspecting the layer history:

# Compare sizes
docker images --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}"

# Count CVEs in each image (requires grype)
grype myapp:single-stage --output table
grype myapp:multi-stage  --output table

# Inspect layer structure
docker history myapp:multi-stage --no-trunc

At large scale, a 10x reduction in image size translates directly to faster cold starts on Kubernetes (image pull is often the dominant factor in pod startup latency), lower registry storage costs, and a measurably smaller CVE backlog for your security team. Multi-stage builds are not optional hygiene; they are table stakes for any image that ships to production.