Docker & Containerization

Images, Layers & the Registry

18 min Lesson 2 of 30


Every container you ever run begins as an image. Understanding how images are built, stored, and distributed is not optional knowledge: it directly shapes build speed, deployment reliability, and your security posture in production. At Google, Facebook, and Amazon, teams invest heavily in image hygiene because a poorly understood image model causes real-world incidents: cache misses that triple build time, multi-gigabyte layers that add minutes to cold starts, and stale base images that ship unpatched CVEs to millions of users.

The Union Filesystem and the Layer Model

A Docker image is not a monolithic archive. It is an ordered stack of read-only layers, each holding the filesystem diff produced by one filesystem-changing Dockerfile instruction (RUN, COPY, or ADD); instructions such as ENV or CMD only update image metadata. The container runtime merges these layers at run time using a union filesystem (OverlayFS on modern Linux) into a single coherent view. On top of this read-only stack, the runtime adds one thin writable layer, the container layer, that holds all runtime mutations.
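The merge can be pictured with a toy model. The sketch below is only an illustration and not how OverlayFS works internally: it assumes each layer is a plain Python dict from path to content, and uses collections.ChainMap so that lookups fall through from the writable layer to the read-only layers beneath. All paths and contents are made up.

```python
from collections import ChainMap

# Read-only image layers, lowest first (hypothetical contents).
base_os = {"/etc/os-release": "debian 12", "/bin/sh": "dash"}
python_rt = {"/usr/bin/python3": "python 3.11"}
app_src = {"/app/main.py": "print('v1')"}
image = [base_os, python_rt, app_src]


def start_container(image_layers):
    """Return a merged view: one fresh writable dict above the image layers."""
    writable = {}
    # ChainMap searches its maps left to right, so the top layer goes first.
    return ChainMap(writable, *reversed(image_layers))


c1 = start_container(image)
c2 = start_container(image)

# Writes land only in the first map, i.e. the container layer of c1.
c1["/app/main.py"] = "print('patched')"
c1["/tmp/scratch"] = "runtime data"

print(c1["/app/main.py"])      # patched copy from c1's writable layer
print(c2["/app/main.py"])      # original, still served by the image layer
print(c1["/bin/sh"])           # falls through to the base layer
print("/tmp/scratch" in c2)    # False: the other container never sees it
```

Both containers share the same three dicts, which mirrors how many containers can run from one image without copying it.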

Why does this matter? Two images that share a common base — say, debian:12-slim — store that base only once on disk and in the registry. When you pull ten microservices that all derive from the same base, only the unique upper layers travel over the wire. This is how large fleets keep registry egress costs manageable and how developers get sub-second pulls for incremental changes.

[Figure: Docker image layer stack and shared base. Image A (app-server) and Image B (worker) each place an ephemeral, writable container layer (runtime changes only) on top of four image layers. Layer 1 (FROM debian:12-slim, base OS) and Layer 2 (RUN apt-get install python3) are shared, cached locally and in the registry, and pulled once. Layer 3 (RUN pip install -r requirements.txt for A, RUN pip install celery for B) and Layer 4 (COPY ./app /app for A, COPY ./worker /worker for B) are unique to each image.]
Two images share the same base OS and python3 layers. Only the unique upper layers are transferred on pull.
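The saving from shared layers is easy to put numbers on. The sketch below is a back-of-the-envelope model, with assumptions stated up front: layer names stand in for digests, and the sizes in MB are invented for illustration rather than measured.

```python
# (layer name, size in MB); names stand in for digests, sizes are invented.
image_a = [("debian:12-slim base", 75), ("apt python3", 40),
           ("pip requirements", 60), ("app source", 5)]
image_b = [("debian:12-slim base", 75), ("apt python3", 40),
           ("pip celery", 30), ("worker source", 3)]


def naive_size(*images):
    """Total size if every image carried its own copy of every layer."""
    return sum(size for layers in images for _, size in layers)


def stored_size(*images):
    """Total size when each distinct layer is kept exactly once."""
    unique = {}
    for layers in images:
        for name, size in layers:
            unique[name] = size
    return sum(unique.values())


def pull_cost(new_image, already_present):
    """MB transferred when pulling new_image onto a host holding other images."""
    have = {name for layers in already_present for name, _ in layers}
    return sum(size for name, size in new_image if name not in have)


print(naive_size(image_a, image_b))    # 328
print(stored_size(image_a, image_b))   # 213
print(pull_cost(image_b, [image_a]))   # 33
```

With these made-up sizes, pulling the worker image onto a host that already has the app-server image transfers 33 MB instead of 148 MB.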

Content-Addressable Storage: the Layer ID and the Digest

Each layer is identified by a SHA-256 digest. The image manifest, a JSON document, lists the digests of the compressed layer archives in order (the image configuration separately records a diff ID for each layer, which is the digest of its uncompressed tar). When you run docker pull, Docker first fetches the manifest, compares each layer against what it already has on disk, and downloads only the missing ones. This is more than an optimisation; it is also an integrity guarantee: a digest mismatch means tampering or corruption, and Docker refuses to use the layer.
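The pull logic described above can be sketched in a few lines. This is a simplified model, not Docker's code: BlobStore, pull, and the in-memory registry dict are hypothetical names, and the layer bytes are placeholders. Only hashlib.sha256 is a real library call.

```python
import hashlib


def digest(blob: bytes) -> str:
    """Content address in the 'sha256:<hex>' form used in image references."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


class BlobStore:
    """Toy local store keyed by digest (stand-in for the on-disk layer cache)."""

    def __init__(self):
        self._blobs = {}

    def has(self, dgst: str) -> bool:
        return dgst in self._blobs

    def add(self, expected: str, blob: bytes) -> None:
        actual = digest(blob)
        if actual != expected:
            # Corrupted or tampered content is never stored.
            raise ValueError(f"digest mismatch: expected {expected}, got {actual}")
        self._blobs[expected] = blob


def pull(manifest_layers, registry, store):
    """Fetch only the layers missing locally; return the digests fetched."""
    fetched = []
    for dgst in manifest_layers:
        if store.has(dgst):
            continue  # cache hit: nothing travels over the wire
        store.add(dgst, registry[dgst])  # verified before it is kept
        fetched.append(dgst)
    return fetched


layer1, layer2 = b"base os bytes", b"app bytes"
registry = {digest(layer1): layer1, digest(layer2): layer2}
manifest = [digest(layer1), digest(layer2)]

store = BlobStore()
print(len(pull(manifest, registry, store)))  # 2 on a cold cache
print(len(pull(manifest, registry, store)))  # 0 on a warm cache

tampered_registry = {digest(layer1): b"tampered"}
try:
    pull([digest(layer1)], tampered_registry, BlobStore())
except ValueError:
    print("layer rejected")
```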

The image itself also has a digest, computed from the manifest. This is what you see when you append @sha256:<hash> to an image reference. A tag — nginx:1.27 — is just a mutable pointer; the digest is immutable. In production deployments (Kubernetes, Argo CD), pin your images by digest, not by tag, to guarantee reproducibility across environments and prevent the "it works in staging" class of incidents where someone pushed a new tag between your test and production deploy.

Tag vs Digest: nginx:latest can refer to a completely different image tomorrow. nginx@sha256:a3ed... refers to exactly one immutable manifest, forever. CI/CD pipelines at scale typically pin digests in the deployment manifests they commit to Git, then let an automated PR bot (Dependabot, Renovate) propose upgrades.
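To make the difference concrete, here is a toy registry in which a tag is a dictionary entry that can be overwritten and a digest is derived from the manifest content. ToyRegistry and the shortened layer digests are hypothetical. Real registries hash the exact manifest bytes as uploaded; hashing canonical JSON here is a simplifying assumption.

```python
import hashlib
import json


def manifest_digest(manifest: dict) -> str:
    # Simplification: hash a canonical JSON encoding of the manifest.
    raw = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    return "sha256:" + hashlib.sha256(raw).hexdigest()


class ToyRegistry:
    def __init__(self):
        self.manifests = {}  # digest -> manifest, never rewritten
        self.tags = {}       # tag -> digest, a mutable pointer

    def push(self, tag: str, manifest: dict) -> str:
        dgst = manifest_digest(manifest)
        self.manifests[dgst] = manifest
        self.tags[tag] = dgst  # silently repoints an existing tag
        return dgst

    def resolve(self, ref: str) -> dict:
        dgst = ref if ref.startswith("sha256:") else self.tags[ref]
        return self.manifests[dgst]


reg = ToyRegistry()
v1 = {"layers": ["sha256:aaa", "sha256:bbb"]}
v2 = {"layers": ["sha256:aaa", "sha256:ccc"]}

pinned = reg.push("1.27", v1)   # what staging tested
reg.push("1.27", v2)            # someone re-pushes the same tag

print(reg.resolve("1.27") == v1)   # False: the tag moved
print(reg.resolve(pinned) == v1)   # True: the digest still names v1
```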

Pulling Images: What Actually Happens

Walk through a real pull to see all the moving parts:

    # Pull a specific tag and watch layer resolution
    docker pull nginx:1.27-alpine

    # Inspect the manifest digest Docker resolved
    docker inspect nginx:1.27-alpine --format '{{index .RepoDigests 0}}'
    # nginx@sha256:a1b2c3d4...

    # Pull by digest: guaranteed identical image on every host
    docker pull nginx@sha256:a1b2c3d4e5f6...

    # List local nginx images together with their digests
    docker image ls --digests nginx

    # List each layer's diff ID (SHA-256 of the uncompressed layer tar)
    docker image inspect nginx:1.27-alpine \
      --format '{{range .RootFS.Layers}}{{println .}}{{end}}'

The first line of pull output, Pulling from library/nginx, marks the manifest fetch. Each per-layer line that follows ends in either Already exists (cache hit) or Pull complete (downloaded). Monitor the ratio of cache hits to understand real registry egress in your fleet.
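One rough way to track that ratio is to count the final per-layer status lines in captured pull output. The sketch below assumes the classic line-oriented format shown in SAMPLE; the layer IDs are invented and the digest is left as a placeholder. Output can differ between Docker versions and image stores, so treat the parsing as an assumption to verify against your own logs.

```python
# Hypothetical captured output; layer IDs are made up.
SAMPLE = """\
1.27-alpine: Pulling from library/nginx
aaaaaaaaaaaa: Already exists
bbbbbbbbbbbb: Already exists
cccccccccccc: Pull complete
dddddddddddd: Pull complete
Digest: sha256:<manifest digest>
Status: Downloaded newer image for nginx:1.27-alpine
"""


def cache_stats(pull_output: str):
    """Return (hits, misses, hit_ratio) from per-layer status lines."""
    hits = misses = 0
    for line in pull_output.splitlines():
        # Split "<id>: <status>" once; lines without ": " yield an empty status.
        _, _, status = line.partition(": ")
        if status == "Already exists":
            hits += 1
        elif status == "Pull complete":
            misses += 1
    total = hits + misses
    return hits, misses, (hits / total if total else 0.0)


print(cache_stats(SAMPLE))  # (2, 2, 0.5)
```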

Pushing Images to a Registry

The workflow for pushing to Docker Hub, AWS ECR, or any OCI-compliant registry is the same: authenticate, tag with a fully-qualified name, then push. Only layers not already present on the server are uploaded — the same content-addressable deduplication applies on the push path.

    # Authenticate to Docker Hub
    docker login -u myusername

    # Authenticate to AWS ECR (token expires every 12 hours, so automate this in CI)
    aws ecr get-login-password --region us-east-1 \
      | docker login --username AWS --password-stdin \
        123456789012.dkr.ecr.us-east-1.amazonaws.com

    # Tag a local image for ECR with its semantic version
    docker tag myapp:build-abc123 \
      123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:1.4.2

    # Push: only new layers are uploaded
    docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:1.4.2

    # Also point the mutable rolling tag at the same image and push it
    docker tag 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:1.4.2 \
      123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:stable
    docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:stable
Tagging strategy that scales: Tag every CI build with the Git SHA (myapp:git-a1b2c3d) for traceability, a semantic version (myapp:1.4.2) for deployments, and a rolling convenience tag (myapp:stable) for canary environments. Never rely solely on latest in any automated system.
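A CI job can derive all three tags from values it already has. The helper below is a minimal sketch: build_tags and its parameters are hypothetical names, and the validation rules (a 7 to 40 character hex SHA, a plain MAJOR.MINOR.PATCH version) are assumptions you may need to loosen.

```python
import re

SEMVER = re.compile(r"\d+\.\d+\.\d+")
GIT_SHA = re.compile(r"[0-9a-f]{7,40}")


def build_tags(repo: str, git_sha: str, version: str, promote_stable: bool = False):
    """Return the image references one CI build should push."""
    if not GIT_SHA.fullmatch(git_sha):
        raise ValueError("git_sha must be 7 to 40 lowercase hex characters")
    if not SEMVER.fullmatch(version):
        raise ValueError("version must look like MAJOR.MINOR.PATCH")
    tags = [
        f"{repo}:git-{git_sha[:7]}",  # traceability back to the commit
        f"{repo}:{version}",          # what deployments reference
    ]
    if promote_stable:
        tags.append(f"{repo}:stable")  # rolling convenience tag
    return tags


print(build_tags("myapp", "a1b2c3d4e5f6", "1.4.2", promote_stable=True))
# ['myapp:git-a1b2c3d', 'myapp:1.4.2', 'myapp:stable']
```

Note that the helper never emits latest, in line with the advice above.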

Inspecting Layers with dive

The open-source tool dive gives you a layer-by-layer breakdown of what changed at each step, including wasted space from files added then deleted in separate RUN instructions. Teams that care about image size often run dive in CI mode in their pipelines and fail the build when wasted space exceeds a threshold. Run it locally before every image promotion to catch obvious bloat early.

    # Install dive (macOS)
    brew install dive

    # Analyse an image interactively
    dive myapp:1.4.2

    # Run in CI: fails if efficiency < 90%
    CI=true dive --ci myapp:1.4.2
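To see why a file deleted in a later RUN still costs space, consider the toy accounting below. It is a deliberately simplified metric and not dive's actual algorithm: each layer is a dict from path to size in bytes, None marks a deletion, and every path and size is invented.

```python
def wasted_bytes(layers):
    """Return (wasted, total_written) for layers listed lowest first.

    A byte counts as wasted if it was written into some layer but is not
    part of the final merged filesystem (deleted or overwritten later).
    """
    live = {}   # path -> size of the copy visible in the final image
    total = 0   # every byte ever written into any layer
    for layer in layers:
        for path, size in layer.items():
            if size is None:
                live.pop(path, None)  # hidden, but still stored below
            else:
                total += size
                live[path] = size
    return total - sum(live.values()), total


# Cleanup in a separate RUN: the cache file stays in the first layer.
separate = [
    {"/var/cache/apt/pkg.deb": 50_000_000, "/usr/bin/python3": 6_000_000},
    {"/var/cache/apt/pkg.deb": None},
    {"/app/main.py": 2_000},
]
# Install and cleanup in one RUN: the cache file never reaches a layer.
combined = [
    {"/usr/bin/python3": 6_000_000},
    {"/app/main.py": 2_000},
]

for name, layers in (("separate", separate), ("combined", combined)):
    wasted, total = wasted_bytes(layers)
    print(name, wasted, round(1 - wasted / total, 2))
```

In this model the separate-RUN image wastes 50 MB and would fall far below a 90% efficiency bar, while the combined version wastes nothing.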

Production Failure Modes to Know

Three layer-model bugs burn engineers repeatedly:

  1. Cache busting on the wrong layer. If you COPY your entire source tree before the RUN pip install step, any single-file change invalidates the dependency layer. Always copy lock files first, install deps, then copy source. This can change a 4-minute build to 12 seconds.
  2. Secrets baked into layers. A RUN step that writes a password to disk creates a layer containing that secret, and the command text itself is recorded in docker history. Deleting the file in a subsequent RUN does not remove it from the layer below; anyone who can pull the image can still extract it. Use build secrets (--secret), multi-stage builds, or external secret managers instead.
  3. Tag mutability causing deployment drift. If two replicas of a service are launched at different times and the tag was pushed between deploys, they run different binaries. Always pin digests in Kubernetes manifests that reach production.
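The first failure mode follows from how cache keys chain together. The sketch below is a simplified model of build caching, not BuildKit's implementation: each step's key is assumed to hash the parent key, the instruction text, and the content it copies in, so a change at one step invalidates every step after it. The Dockerfile instructions appear only as strings, and the source and requirements values are placeholders.

```python
import hashlib


def cache_keys(steps):
    """steps: list of (instruction, input_content); keys chain to the parent."""
    keys, parent = [], ""
    for instruction, content in steps:
        payload = f"{parent}|{instruction}|{content}".encode()
        parent = hashlib.sha256(payload).hexdigest()
        keys.append(parent)
    return keys


def rebuilt_steps(before, after):
    """Instructions whose cache key differs between two builds."""
    old, new = cache_keys(before), cache_keys(after)
    return [instr for (instr, _), o, n in zip(after, old, new) if o != n]


def slow_order(src, reqs):
    # Whole tree copied before the dependency install.
    return [("FROM debian:12-slim", ""),
            ("COPY . /app", src + reqs),
            ("RUN pip install -r /app/requirements.txt", "")]


def fast_order(src, reqs):
    # Lock file first, install, then the rest of the source.
    return [("FROM debian:12-slim", ""),
            ("COPY requirements.txt /app/", reqs),
            ("RUN pip install -r /app/requirements.txt", ""),
            ("COPY . /app", src + reqs)]


reqs = "flask==3.0.0"
print(rebuilt_steps(slow_order("v1", reqs), slow_order("v2", reqs)))
# ['COPY . /app', 'RUN pip install -r /app/requirements.txt']
print(rebuilt_steps(fast_order("v1", reqs), fast_order("v2", reqs)))
# ['COPY . /app']
```

With the fast ordering, a source-only change leaves the expensive pip install step cached.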
Never store credentials in image layers. Even if you add a RUN rm /secrets step immediately after, the secret is permanently preserved in the preceding layer and accessible to anyone with pull access to the image. Use Docker BuildKit mounted secrets (RUN --mount=type=secret,id=mytoken ...) instead, and do not pass secrets as build args: ARG values can be recovered from the image with docker history.
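The reason a later rm does not help can be shown with a small model of layered deletion. This is an illustrative sketch, not the on-disk format: layers are dicts, WHITEOUT is a hypothetical marker playing the role of the entry a later layer uses to hide a path, and the token value is a dummy.

```python
WHITEOUT = object()  # marker a later layer uses to hide a path


def merged_view(layers):
    """Filesystem a running container sees, layers listed lowest first."""
    view = {}
    for layer in layers:
        for path, content in layer.items():
            if content is WHITEOUT:
                view.pop(path, None)
            else:
                view[path] = content
    return view


def layers_containing(layers, path):
    """Indexes of layers that still physically hold the file."""
    return [i for i, layer in enumerate(layers)
            if path in layer and layer[path] is not WHITEOUT]


layers = [
    {"/bin/sh": "dash"},
    {"/secrets/token": "dummy-token"},   # RUN step that writes the secret
    {"/secrets/token": WHITEOUT},        # RUN rm /secrets/token
]

print("/secrets/token" in merged_view(layers))       # False: hidden at run time
print(layers_containing(layers, "/secrets/token"))   # [1]: still shipped
print(layers[1]["/secrets/token"])                   # dummy-token
```

The container never sees the file, yet the layer that holds it is pushed and pulled like any other.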

OCI Compliance and the Broader Ecosystem

Docker popularised the image format, but the Open Container Initiative (OCI) now owns the specifications: the Image Spec, the Distribution Spec (how registries work), and the Runtime Spec. OCI-compliant tools such as Podman, Buildah, containerd, and Kaniko can build, store, or run the same images. In Kubernetes, containerd is the most common container runtime (dockershim, the kubelet's built-in Docker Engine integration, was removed in Kubernetes 1.24), and it reads and runs the same OCI images your docker build produces. Understanding this prevents the false belief that "Docker images only work with Docker."