Kubernetes Fundamentals

Debugging Pods & Workloads

When something breaks in a Kubernetes cluster, the blast radius is often invisible at first. A Pod shows CrashLoopBackOff on the dashboard, traffic to a Service silently drops, or a Deployment stalls at 2 of 3 desired replicas and never converges. Unlike a process running on a server you can SSH into and observe directly, Kubernetes abstracts the runtime behind several layers — the API server, the kubelet, the container runtime, the overlay network. Effective debugging means knowing exactly which layer to interrogate and which kubectl command surfaces that layer.

This lesson gives you the full systematic toolkit used by SREs at companies operating large Kubernetes fleets: the four canonical debugging commands, how to read each output, the three most common pod failure states and their root causes, and the structured mental model that gets you from alert to fix in minutes rather than hours.

The Debugging Hierarchy

Always work from the broadest view down to the narrowest: cluster events → object description → container logs → live shell. Jumping straight to logs before checking events is the most common mistake — you miss the scheduler decision, the image pull failure, or the probe rejection that explains everything.

The four-command debugging hierarchy and the failure states each layer helps diagnose.

kubectl get events — The Cluster Audit Trail

Events are Kubernetes' own structured record of what happened to the objects in a namespace. Unlike container logs, which exist only once a container has started and written output, events are emitted by cluster components (the scheduler, the kubelet, the Deployment controller) and are retained for one hour by default. When a Pod fails before it produces any logs, events are often the only record of why.

# All events in current namespace, most recent last
kubectl get events --sort-by='.lastTimestamp'

# Filter to events for one specific Pod (field selector on involvedObject)
kubectl get events --field-selector involvedObject.name=<pod-name>

# Watch events live as they arrive (invaluable during a rolling deployment)
kubectl get events -w

# Show events across ALL namespaces (useful in a multi-tenant cluster)
kubectl get events -A --sort-by='.lastTimestamp' | tail -40

Key idea: Events have a Reason field (e.g. FailedScheduling, BackOff, Pulled, Started, Killing) and a Message field with human-readable detail. The Reason alone often tells you which layer is broken — scheduler, image pull, probe, or OOM killer.
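Because Reason and type are structured fields, the filtering can happen on the server rather than by eye. The following is a minimal sketch, not a standard tool: warning_events is a hypothetical helper name, and the column layout is just one reasonable choice.

```shell
# warning_events is a hypothetical helper name, not a kubectl built-in.
# Lists only Warning events in one namespace, oldest first, reduced to
# time / reason / object / message columns.
warning_events() {
  local ns="${1:-default}"
  kubectl get events -n "$ns" \
    --field-selector type=Warning \
    --sort-by='.lastTimestamp' \
    -o custom-columns='LAST:.lastTimestamp,REASON:.reason,OBJECT:.involvedObject.name,MESSAGE:.message'
}
```

Calling warning_events payments (the namespace name is only an example) gives a compact view in which the REASON column usually points at the failing layer.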

kubectl describe — Full Object State + Event History

kubectl describe pod <name> is the single richest debugging command. It renders the full object spec merged with live status fields and appends the event stream scoped to that object. Learn to read it in sections:

Status / Phase: Pending, Running, Succeeded, Failed, Unknown. This is the coarse signal.
Conditions: Boolean flags like PodScheduled, Initialized, ContainersReady, Ready. If PodScheduled=False, the problem is in scheduling, not in your image.
Containers → State: Waiting (with a Reason), Running, or Terminated (with an Exit Code). Exit code 137 means the process received SIGKILL (128 + 9), most often from the OOM killer, in which case Reason reads OOMKilled; 1 is a generic application error; 0 is a clean exit, which still counts as a failure for a container that is meant to run indefinitely.
Containers → Last State: The previous container run, which is critical for CrashLoopBackOff because it shows the exit code from the last crash.
Events: Scoped to this Pod; shows image pull progress, probe failures, and scheduling decisions.

# Full describe of a Pod
kubectl describe pod <pod-name>

# Describe the owning Deployment (shows rollout status, selector, strategy)
kubectl describe deployment <deploy-name>

# Describe a Node — see resource pressure, taints, conditions, allocated Pods
kubectl describe node <node-name>

# Quick pattern: grab the Events section only
kubectl describe pod <pod-name> | grep -A 30 "^Events:"
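The State and Last State fields can also be read directly from the Pod status instead of being picked out of the describe output. The sketch below assumes two hypothetical helper names, explain_exit_code and last_states. The first relies only on the common convention that an exit code above 128 means "killed by signal (code - 128)".

```shell
# Hypothetical helper: turn a container exit code into a first hypothesis.
# Codes above 128 conventionally mean "killed by signal (code - 128)",
# so 137 is signal 9 (SIGKILL) and 143 is signal 15 (SIGTERM).
explain_exit_code() {
  local code="$1"
  if [ "$code" -eq 0 ]; then
    echo "clean exit"
  elif [ "$code" -gt 128 ]; then
    echo "terminated by signal $((code - 128))"
  else
    echo "process exited with error status $code"
  fi
}

# Hypothetical helper: print name, restart count, last exit code and last
# termination reason for every container in a Pod (tab-separated).
# A reason of OOMKilled confirms that the kill came from the OOM killer.
last_states() {
  local pod="$1" ns="${2:-default}"
  kubectl get pod "$pod" -n "$ns" -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\t"}{.lastState.terminated.exitCode}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'
}
```

For example, explain_exit_code 137 prints "terminated by signal 9". Whether that signal came from the OOM killer is answered by the reason column of last_states, not by the exit code alone.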

kubectl logs — Reading Container Output

kubectl logs fetches stdout and stderr from the container runtime. When a Pod has crashed, pass --previous to read the logs of the dead container rather than the current (empty) one.

# Current logs for the first container in a Pod
kubectl logs <pod-name>

# Logs for a specific container in a multi-container Pod
kubectl logs <pod-name> -c <container-name>

# Logs of the PREVIOUS (crashed) container instance — essential for CrashLoopBackOff
kubectl logs <pod-name> --previous

# Stream logs live (equivalent to tail -f)
kubectl logs -f <pod-name>

# Last 100 lines with timestamps
kubectl logs --tail=100 --timestamps <pod-name>

# Logs from ALL Pods matched by a label selector (e.g. all replicas of a Deployment)
kubectl logs -l app=payment-service --prefix --tail=50
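During triage you often do not know yet whether the container has restarted. One way to handle both cases is sketched below. crash_logs is a hypothetical helper name, and the sketch assumes that kubectl logs --previous exits non-zero when there is no previous container instance.

```shell
# Hypothetical helper: prefer the logs of the previous (crashed) container
# instance and fall back to the current one if no previous instance exists.
crash_logs() {
  local pod="$1" ns="${2:-default}"
  kubectl logs "$pod" -n "$ns" --previous --tail=200 2>/dev/null \
    || kubectl logs "$pod" -n "$ns" --tail=200
}
```

The --tail=200 limit is arbitrary; raise it if the relevant stack trace is longer.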

Production tip: centralised logging. kubectl logs reads from the node's local container log files (/var/log/pods/). The kubelet rotates those files by size, and they are removed when the Pod is deleted or evicted or when the node goes away. In production, always ship logs to a centralised store (Loki, OpenSearch, Datadog); kubectl logs is for fast triage, not the primary log archive.

kubectl exec — Live Shell Inside a Running Container

kubectl exec opens a process inside an already-running container. Use it to inspect the filesystem, test network reachability from inside the Pod's network namespace, or validate environment variables and mounted secrets that your app will actually see.

# Interactive shell (bash or sh if bash is not in the image)
kubectl exec -it <pod-name> -- bash
kubectl exec -it <pod-name> -- sh

# Run a one-shot command (no TTY needed)
kubectl exec <pod-name> -- env | grep DATABASE
kubectl exec <pod-name> -- cat /etc/config/app.yaml
kubectl exec <pod-name> -- wget -qO- http://localhost:8080/healthz

# Test DNS resolution from inside the Pod
kubectl exec <pod-name> -- nslookup kubernetes.default.svc.cluster.local

# Test connectivity to another Service
kubectl exec <pod-name> -- curl -s http://payment-service.payments.svc.cluster.local/health

# In a specific container of a multi-container Pod
kubectl exec -it <pod-name> -c sidecar -- sh

Production pitfall: distroless / scratch images. Many security-hardened production images (distroless, scratch-based) contain no shell, no curl, no wget. kubectl exec -- bash will fail with "OCI runtime exec failed." Use an ephemeral debug container instead: kubectl debug -it <pod-name> --image=busybox --target=<container-name>. This adds an ephemeral container to the running Pod that shares the target container's process namespace, without restarting the Pod or its existing containers.
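The exec-then-debug decision can be wrapped in a small function. This is a sketch under stated assumptions: shell_into is a hypothetical helper name, busybox is just the debug image used above, and ephemeral containers must be available in your cluster.

```shell
# Hypothetical helper: try a plain exec first; if the image has no shell,
# attach an ephemeral busybox container that targets the same container.
shell_into() {
  local pod="$1" container="$2"
  if kubectl exec "$pod" -c "$container" -- sh -c 'exit 0' 2>/dev/null; then
    kubectl exec -it "$pod" -c "$container" -- sh
  else
    kubectl debug -it "$pod" --image=busybox --target="$container"
  fi
}
```

The probe command (sh -c 'exit 0') only checks that a shell exists in the image; it does not change anything inside the container.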

Common Pod Failure States

ImagePullBackOff (and ErrImagePull)

The kubelet cannot pull the container image from the registry. ErrImagePull is reported when a pull attempt fails; the kubelet then retries with an exponentially increasing delay, capped at 5 minutes, and reports ImagePullBackOff while it waits between attempts. Root causes, in order of frequency:

Tag does not exist: a typo in the image tag, or a CI pipeline that failed to push the new tag before the Deployment was updated.
Wrong or missing imagePullSecret: the registry requires authentication (ECR, GCR, GHCR, private Docker Hub) and the Pod spec does not reference a valid secret, or the secret is in the wrong namespace.
Registry rate limit: Docker Hub caps unauthenticated pulls per source IP (historically 100 pulls per 6 hours; check Docker's current limits), and a cluster whose nodes all pull the same image without credentials can exhaust that quota quickly.
Network restriction or firewall: the node cannot reach the registry endpoint (common in air-gapped or VPC-restricted environments).

# 1. Check the exact error message in Pod events
kubectl describe pod <pod-name> | grep -A 5 "Failed to pull\|ImagePull\|BackOff"

# 2. Verify the image reference is correct
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[*].image}'

# 3. Check the imagePullSecret is present and in the right namespace
kubectl get secret <pull-secret-name> -n <namespace> -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d | python3 -m json.tool

# 4. Manually test the pull from a node (if you have node SSH access)
crictl pull <image-reference>
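Step 3 above assumes you already know the secret's name. A sketch that starts from the Pod instead is shown below; check_pull_secrets is a hypothetical helper name. It reads the secret names from the Pod spec and checks that each one exists in the Pod's namespace.

```shell
# Hypothetical helper: list the pull secrets a Pod references and report
# whether each one exists in the Pod's own namespace.
check_pull_secrets() {
  local pod="$1" ns="${2:-default}" names name
  names="$(kubectl get pod "$pod" -n "$ns" -o jsonpath='{.spec.imagePullSecrets[*].name}')"
  if [ -z "$names" ]; then
    echo "no imagePullSecrets on pod $pod"
    return 0
  fi
  for name in $names; do
    if kubectl get secret "$name" -n "$ns" >/dev/null 2>&1; then
      echo "ok: $name"
    else
      echo "missing: $name"
    fi
  done
}
```

A "missing" line usually means the secret was created in a different namespace than the Pod. An "ok" line only proves the secret exists, not that its credentials are valid.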

CrashLoopBackOff

The container starts, runs briefly, then exits with a non-zero code (or sometimes zero). Kubernetes restarts it. The restart loop continues with exponential back-off (10 s, 20 s, 40 s … capped at 5 min). After enough crashes the state stabilises as CrashLoopBackOff, which is Kubernetes communicating: "I keep trying, and it keeps failing." Root causes:

Application startup error: the process cannot connect to a database, reads a required environment variable that is missing or has the wrong value, or fails a startup validation check.
Liveness probe misconfiguration: the probe is too aggressive (e.g. initialDelaySeconds: 0 on a slow-starting Java service with no startupProbe). Kubernetes kills the container before it finishes starting, triggering a loop.
OOM kill on startup: the memory limit is too low for the process to initialise. Check for exit code 137 with Reason OOMKilled in Last State.
Missing configuration: a ConfigMap or Secret the app depends on is not mounted, or was mounted at a path the app does not expect.

# The essential CrashLoopBackOff workflow:

# Step 1: confirm the state and check restart count
kubectl get pod <pod-name>
# NAME          READY   STATUS             RESTARTS   AGE
# api-7d9f4b    0/1     CrashLoopBackOff   8          12m

# Step 2: read the crash logs from the PREVIOUS container instance
kubectl logs <pod-name> --previous

# Step 3: check the exit code from the last crash
kubectl describe pod <pod-name> | grep -A 10 "Last State:"
# Last State:     Terminated
#   Reason:       Error
#   Exit Code:    1
#   Started:      ...
#   Finished:     ...

# Step 4: if logs are empty (process crashed before writing output),
#         check whether env vars / secrets are correctly mounted
kubectl exec <pod-name> -- env       # might fail if container is already dead
kubectl get pod <pod-name> -o yaml | grep -A 20 "env:\|envFrom:\|volumeMounts:"
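The restart back-off described for CrashLoopBackOff (10 s, 20 s, 40 s, capped at 5 min) explains why a Pod with eight restarts can look idle for minutes at a time. The sketch below only illustrates the doubling-with-cap arithmetic. backoff_schedule is a hypothetical helper name, and the base and cap are passed in rather than read from the cluster.

```shell
# Hypothetical helper: print the delay (in seconds) before each of the next
# N restarts, doubling from a base value and never exceeding a cap.
backoff_schedule() {
  local delay="$1" cap="$2" attempts="$3" i=0
  while [ "$i" -lt "$attempts" ]; do
    echo "$delay"
    delay=$((delay * 2))
    if [ "$delay" -gt "$cap" ]; then
      delay="$cap"
    fi
    i=$((i + 1))
  done
}

# Example: base 10 s, cap 300 s, seven restarts
# backoff_schedule 10 300 7   # prints 10 20 40 80 160 300 300, one per line
```

By the sixth restart the delay has reached the cap, so a newly deployed fix can take up to five minutes to be retried unless you delete the Pod and let its controller create a fresh one.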

Pending — The Pod That Never Starts

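A Pod stuck in Pending has been accepted by the API server but its containers have not started. Most often the scheduler has not been able to place it, and the event reason is then FailedScheduling. (A Pod that is already scheduled but still pulling its image also reports Pending.) The message will tell you exactly which constraint failed: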

Insufficient CPU / memory — no node has enough unallocated resources. Solution: scale the node group, reduce requests, or check for forgotten Pods consuming resources.
No nodes match node selector or affinity — a nodeSelector requires a label that no node has.
PersistentVolumeClaim not bound — the Pod requires a PVC that is in Pending state (no matching PV, or the StorageClass is wrong).
Taint not tolerated — all available nodes have a taint the Pod does not tolerate.

# See why a Pod is Pending
kubectl describe pod <pending-pod-name> | grep -A 10 "Events:"
# Events:
#   Warning  FailedScheduling  2m  default-scheduler
#             0/3 nodes are available: 3 Insufficient cpu.

# See how much CPU/memory is allocated across nodes
kubectl describe nodes | grep -E "Name:|  cpu|  memory" | grep -v "^--"

# Or use the top command (requires metrics-server)
kubectl top nodes
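When several Pods are Pending at once, it helps to see them together with the scheduler's explanations and the state of their volume claims. A minimal sketch follows; pending_report is a hypothetical helper name, and the three queries are one possible selection.

```shell
# Hypothetical helper: cluster-wide view of Pending Pods, the scheduler
# events that explain them, and PersistentVolumeClaims (look for Pending).
pending_report() {
  kubectl get pods -A --field-selector status.phase=Pending -o wide
  kubectl get events -A --field-selector reason=FailedScheduling --sort-by='.lastTimestamp'
  kubectl get pvc -A
}
```

The first two queries filter on the server side, so they stay fast even in namespaces with thousands of objects.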

Production tip: kubectl get pod -o wide. Always pass -o wide when reviewing Pods in bulk. It adds the Pod IP, the Node, the Nominated Node (set when the scheduler has earmarked a node for a Pending Pod), and the Readiness Gates columns. In a multi-node cluster, seeing all replicas of a service land on the same node suggests a missing anti-affinity rule or topology spread constraint.

Reading the Full Picture: a Structured Runbook

When an alert fires or a user reports a 503, use this sequence every time — no improvising:

1. Run kubectl get pods -n <ns> to identify which Pods are not Running 1/1.
2. Run kubectl get events -n <ns> --sort-by='.lastTimestamp' | tail -30 for the cluster-level signal.
3. Run kubectl describe pod <name> and read Conditions, Container State, Last State, and Events.
4. Run kubectl logs <name> --previous if the container crashed; kubectl logs -f <name> if it is running but misbehaving.
5. Run kubectl exec -it <name> -- sh to validate connectivity and config from inside the Pod's network namespace.
6. If the Pod is healthy but the Service is dropping traffic, inspect the Endpoints: kubectl get endpoints <service-name>. An empty list means no Ready Pod matches the Service's label selector: either the labels do not match, or the matching Pods are failing their readiness probes.

Every step surfaces a different layer. Skipping any one of them risks chasing the wrong hypothesis for an hour.
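The non-interactive steps of the sequence above can be captured in one function, so the order is the same every time. This is a sketch, not an official tool: triage is a hypothetical helper name, and the tail limits are arbitrary.

```shell
# Hypothetical helper: run the read-only runbook steps in order for one Pod.
# The interactive exec step and the Endpoints check are left to the operator.
triage() {
  local pod="$1" ns="${2:-default}"
  echo "== pods =="
  kubectl get pods -n "$ns" -o wide
  echo "== events =="
  kubectl get events -n "$ns" --sort-by='.lastTimestamp' | tail -30
  echo "== describe =="
  kubectl describe pod "$pod" -n "$ns"
  echo "== logs =="
  # Prefer the crashed instance's logs; fall back to the current container.
  kubectl logs "$pod" -n "$ns" --previous --tail=100 2>/dev/null \
    || kubectl logs "$pod" -n "$ns" --tail=100
}
```

Redirecting the output to a file (for example triage api-7d9f4b payments > triage.txt, where both names are placeholders) also gives you an artefact to attach to the incident ticket.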