Project: A Hardened Production Image
The previous nine lessons each targeted one dimension of container security in isolation. This capstone project synthesises all of them into a single, end-to-end workflow: take a realistic Node.js API and drive it through multi-stage minimisation, non-root hardening, image scanning, and supply-chain signing. The result is a production image you could defend in a security review.
We will work with a concrete application structure and apply every hardening step in order, showing the exact commands, the measurable improvement after each step, and the integration points for a CI/CD pipeline.
Starting Point: The Naive Image
Most teams start here: a single-stage Dockerfile based on the official full Node image, running as root.
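A minimal sketch of such a naive Dockerfile follows. It assumes the API compiles to dist/index.js through an npm build script and listens on port 3000; adjust both to match your project.

```dockerfile
# Naive single-stage image: full Debian-based Node image, root user,
# devDependencies and build tooling all shipped to production.
FROM node:20

WORKDIR /app

# Copies the entire build context, including tests and local config.
COPY . .

# Installs dependencies and devDependencies alike.
RUN npm install

# Assumed build script that compiles the sources into dist/.
RUN npm run build

EXPOSE 3000

# No USER instruction, so the process runs as root (UID 0).
CMD ["node", "dist/index.js"]
```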
A quick assessment of this image reveals three immediate problems: it is ~1.1 GB, it runs as root (UID 0), and it includes every development dependency plus the full npm toolchain. Grype finds around 300 CVEs in the base image alone.
Step 1 — Multi-Stage Build to Minimise Attack Surface
The first transformation separates build-time tooling from runtime assets. Production node_modules only includes packages listed under dependencies in package.json, not devDependencies.
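One way to express this split is sketched below. The stage names (builder, runtime), the npm build script, the presence of a package-lock.json, and the dist/ output directory are assumptions about the project, not requirements of Docker.

```dockerfile
# Stage 1: build with the full dependency set, including devDependencies.
FROM node:20-alpine AS builder
WORKDIR /app
# Copy manifests first so the dependency layer is cached between builds.
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
# Assumed build script that compiles the sources into dist/.
RUN npm run build

# Stage 2: runtime image with production dependencies only.
FROM node:20-alpine AS runtime
WORKDIR /app
ENV NODE_ENV=production
COPY package.json package-lock.json ./
# --omit=dev skips everything listed under devDependencies.
RUN npm ci --omit=dev && npm cache clean --force
# Only the compiled output crosses the stage boundary.
COPY --from=builder /app/dist ./dist
EXPOSE 3000
CMD ["node", "dist/index.js"]
```

Nothing from the builder stage reaches the final image unless it is copied explicitly, which is why compilers, test frameworks, and source files disappear from the result.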
Image size drops from ~1.1 GB to ~180 MB. The CVE count drops by roughly 70% because the build toolchain is gone. The container still runs as root, which the next step fixes.
Step 2 — Non-Root User & Read-Only Filesystem
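The node:20-alpine image ships with a pre-created node user (UID 1000). All we need to do is give that user ownership of the application files and switch to it before the entrypoint.
When deploying to Kubernetes, pair this with a securityContext that locks down the pod further, for example by setting runAsNonRoot: true, readOnlyRootFilesystem: true, and allowPrivilegeEscalation: false, and by dropping all Linux capabilities.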
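A sketch of the ownership change and user switch, applied to a two-stage build, is shown below. The builder stage name, the build script, and the dist/ directory are the same assumptions as before.

```dockerfile
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:20-alpine AS runtime
WORKDIR /app
ENV NODE_ENV=production
# --chown sets ownership at copy time and avoids an extra chown layer.
COPY --chown=node:node package.json package-lock.json ./
# Still root here, so node_modules is root-owned but readable by node.
RUN npm ci --omit=dev && npm cache clean --force
COPY --chown=node:node --from=builder /app/dist ./dist
# From this line on, build instructions and the container process run as node.
USER node
EXPOSE 3000
CMD ["node", "dist/index.js"]
```

Placing USER after the dependency install keeps the install simple while still ensuring the running process never has UID 0.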
readOnlyRootFilesystem: true matters. A container with a writable root can be exploited to overwrite its own binary, install a backdoor, or write a cron job. Read-only root forces all writes to explicit emptyDir or persistent volume mounts, which ops teams can audit. An application that cannot handle a read-only root has a hidden assumption worth fixing at the code level.
Step 3 — Scan Before You Push
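Integrate Grype into your CI pipeline so a build with critical CVEs fails before the image ever reaches the registry. Running grype against the built image with --fail-on critical makes the command exit non-zero when any finding is rated critical, which fails the job.
In GitHub Actions, run the scan as a step after the image build and before the push.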
Step 4 — Sign the Image with Cosign (Sigstore)
Signing closes the supply-chain gap between CI and production. Without signing, a consumer has no way to detect that someone with registry access has pushed a different image under the same tag. With Cosign keyless signing (backed by Sigstore's Fulcio certificate authority and Rekor transparency log), every image carries a cryptographic receipt that ties it to the specific GitHub Actions workflow that produced it.
In Kubernetes, enforce this at admission with the Sigstore Policy Controller or Kyverno, so that unsigned images are rejected before they ever reach a node.
The Complete Hardening Pipeline
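The four stages wire together into a single CI/CD flow: a developer push triggers the multi-stage, non-root build; the image is scanned with Grype; the passing image is pushed and signed with Cosign; and the cluster verifies the signature at admission before the image runs in production.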
Verification Checklist
After building your hardened image, run through this checklist before tagging it as production-ready. At companies like Google and Netflix, this checklist is enforced automatically in CI — failing any check blocks the deployment.
The node:20-alpine tag is mutable: it can point to a different image without warning whenever the upstream maintainer pushes a patch. Pin to the immutable digest in your Dockerfile and update it deliberately via Dependabot or Renovate:
FROM node:20-alpine@sha256:a7f5...
Pinning also complements cosign verification: cosign signs the image's manifest digest, so any change to a layer, including the base layer, produces a different digest that the existing signature does not cover.
Production Failure Modes to Know
Even well-hardened images surface runtime surprises. These are the most common failures teams hit after switching to this setup:
- App writes to /app at runtime. Many frameworks write lock files, compiled templates, or upload staging directories under the working directory. With readOnlyRootFilesystem: true, these writes fail with a read-only file system error (EROFS). Fix: mount an emptyDir at the specific writable path and point the framework config to it, not the working directory.
- Missing CA certificates. Alpine ships ca-certificates, but if you start from scratch or a stripped distroless variant, TLS connections to external services fail with an error such as "certificate signed by unknown authority". Fix: COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/.
- Port binding below 1024 as non-root. Ports 80 and 443 require CAP_NET_BIND_SERVICE. Running as UID 1000 without that capability and trying to bind port 80 gives permission denied. Fix: bind on port 3000 (or any unprivileged port) and let the Kubernetes Service or load balancer handle port 80/443 externally.
- Signal handling. When PID 1 is node (not an init process), SIGTERM from kubectl rollout restart may not propagate correctly to child processes, and exited children are never reaped, leaving zombie processes. Fix: add dumb-init from Alpine and set it as the entrypoint wrapper, as in ["dumb-init", "node", "dist/index.js"].
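The signal-handling fix can be sketched as a runtime-stage fragment. It assumes an earlier build stage named builder that produced /app/dist, and it omits the dependency install for brevity, so it is not a complete Dockerfile on its own.

```dockerfile
FROM node:20-alpine AS runtime
# Install the init wrapper while still root.
RUN apk add --no-cache dumb-init
WORKDIR /app
# Assumes a stage named builder exists earlier in the same Dockerfile.
COPY --chown=node:node --from=builder /app/dist ./dist
USER node
EXPOSE 3000
# dumb-init runs as PID 1, forwards signals such as SIGTERM to node,
# and reaps orphaned child processes.
ENTRYPOINT ["dumb-init", "--"]
CMD ["node", "dist/index.js"]
```

Splitting ENTRYPOINT and CMD this way behaves like the single-array form ["dumb-init", "node", "dist/index.js"], while letting a different command be supplied at run time and still wrapped by the init process.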
Every one of these controls can be undone by a single commit that adds USER root to fix a permissions issue or upgrades a base image without re-scanning. Enforce these controls in CI, in admission webhooks, and in your runtime security tool (Falco, Sysdig). Treat a running container that spawns a shell or writes to unexpected paths as an incident, not a misconfiguration.
You now have a repeatable, auditable path from source code to a production image that is minimal (~180 MB vs ~1.1 GB), non-root (UID 1000), scanned (zero critical CVEs), signed (cosign + Rekor), and policy-enforced (Kyverno admission). Run this workflow on every merge to main and you have a continuous, measurable security posture rather than a point-in-time audit.