Cloud & Kubernetes Security Hardening

Runtime Security

18 min Lesson 7 of 28

Static controls — image scanning, Pod Security Standards, network policies — stop known-bad configurations before a workload starts. Runtime security is the discipline that answers the harder question: what is a container actually doing once it is running? A supply-chain attack, a zero-day exploit, or a malicious insider can bypass every pre-flight check and land a threat inside a legitimate container. Runtime security is your last line of detection before exfiltration.

The three pillars of cloud-native runtime security are syscall-level detection (via Falco or eBPF-based engines), seccomp syscall profiles (deny at the kernel level), and container drift detection (alert when a running container no longer matches its immutable image). This lesson covers all three, including the failure modes that bite production teams.

How Falco Works: Syscall Visibility from the Kernel

Falco (a CNCF graduated project) collects events in the kernel and evaluates them in user space. It uses either a kernel module or an eBPF probe to capture the syscalls made by processes in every container on the node. Those raw events are fed into a rule engine that evaluates them against a set of declarative rules. When a rule fires, Falco emits a structured alert to stdout, a file, syslog, or an HTTP webhook; Falcosidekick can then forward it to a SIEM such as Splunk or Elastic.

[Figure: Falco runtime detection data flow. Linux kernel syscall interface (open, execve, connect, write, setuid, …) → Falco eBPF probe (kernel module / CO-RE) → Falco rule engine (conditions + macros + lists) → stdout / log (Fluentd → SIEM), webhook (Slack / PagerDuty), Falcosidekick (fan-out router) → response plugin (kill / quarantine Pod)]
Falco intercepts syscalls via an eBPF probe, evaluates them in the rule engine, and fans alerts out to logging pipelines, webhooks, and automated response plugins.

Falco ships with a default ruleset covering the most critical categories: shell spawned inside a container, sensitive file reads (/etc/shadow, kubeconfig), unexpected network outbound connections, privilege escalation syscalls (setuid, ptrace), and modifications to container filesystems after startup. Production teams layer custom rules on top.
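As a rough illustration of how a custom rule layers on top of the defaults, the sketch below uses the three building blocks of the rule engine: a list, a macro, and a rule. It assumes the default ruleset is loaded first, so that the spawned_process and container macros already exist. The list name, macro name, rule name, and image repository are hypothetical.

```yaml
# /etc/falco/rules.d/custom-shell.yaml (hypothetical file name)

# A list is a named set of values that conditions can reference.
- list: approved_debug_images          # hypothetical list
  items: [myorg/debug-tools]

# A macro is a reusable condition fragment.
- macro: in_approved_debug_image       # hypothetical macro
  condition: (container.image.repository in (approved_debug_images))

# The rule combines default macros (spawned_process, container)
# with the custom macro above.
- rule: Shell outside approved debug image
  desc: A shell was started in a container whose image is not on the debug allowlist
  condition: >
    spawned_process and container
    and proc.name in (bash, sh, zsh)
    and not in_approved_debug_image
  output: >
    Shell started in container
    (user=%user.name cmd=%proc.cmdline
    container=%container.name image=%container.image.repository)
  priority: WARNING
  tags: [container, shell]
```

Keeping the allowlist in a list rather than inline in the condition means it can be extended later without editing the rule itself.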

# Install Falco via Helm (eBPF driver, no kernel module needed)
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm upgrade --install falco falcosecurity/falco \
  --namespace falco --create-namespace \
  --set driver.kind=ebpf \
  --set falcosidekick.enabled=true \
  --set falcosidekick.webui.enabled=true \
  --set "falcosidekick.config.slack.webhookurl=https://hooks.slack.com/YOUR_HOOK" \
  --set falco.grpc.enabled=true \
  --set falco.grpc_output.enabled=true

# Custom rule: alert when any process writes to /usr/bin inside a running container
# /etc/falco/rules.d/custom-drift.yaml
# open_write (a default macro) matches files opened for writing, so fd.directory is populated;
# spawned_process matches execve events, which carry no file descriptor to compare against.
- rule: Write to binary dir in container
  desc: Detects a process writing into /usr/bin, /usr/local/bin, or /usr/sbin inside a container
  condition: >
    open_write and container
    and fd.directory in (/usr/bin, /usr/local/bin, /usr/sbin)
  output: >
    Binary directory modified in container
    (user=%user.name cmd=%proc.cmdline file=%fd.name
    container=%container.name image=%container.image.repository)
  priority: CRITICAL
  tags: [drift, container, filesystem]
eBPF vs kernel module: For production Kubernetes clusters, prefer the eBPF driver (driver.kind=ebpf) or the modern CO-RE eBPF driver (driver.kind=modern_ebpf in current chart versions; check the values accepted by the chart version you deploy). The kernel module must be rebuilt for every kernel upgrade, and a bug in it can destabilize the node. eBPF programs are checked by the kernel verifier before they load, which makes them the safer choice. AWS EKS and GKE both support CO-RE eBPF without node customization.

Seccomp Profiles: Denying at the Kernel Boundary

Detection tells you when something bad happened. Seccomp (Secure Computing Mode) prevents it by restricting which syscalls a container is even allowed to make. A container running a Node.js API does not need ptrace, mount, or kexec_load. A seccomp profile is a JSON allowlist (or blocklist) enforced by the kernel — attempting a blocked syscall returns EPERM or kills the process.

Kubernetes supports two ways to apply seccomp: the built-in RuntimeDefault profile (the container runtime's own default, which blocks ~44 syscalls that should never be needed) and custom Localhost profiles stored on the node.

# Enable RuntimeDefault seccomp for a Pod (lowest-friction win)
apiVersion: v1
kind: Pod
metadata:
  name: api-server
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault   # uses containerd / CRI-O default profile
  containers:
    - name: app
      image: myorg/api:v2.1.0
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        runAsNonRoot: true
        runAsUser: 10001
        capabilities:
          drop: ["ALL"]
---
# Custom Localhost profile, pinned to exactly the syscalls your app needs
# Save as /var/lib/kubelet/seccomp/profiles/node-api.json on each node
# Then reference it:
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: profiles/node-api.json

Generating a tight custom profile from scratch is tedious. The practical workflow at scale: run the workload with type: Unconfined and Falco logging syscalls, or use inspektor-gadget (ig advise seccomp-profile) to record the actual syscall profile of a running container and generate a JSON policy automatically. Then promote to Localhost in staging, validate, and ship to production.

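Enable RuntimeDefault cluster-wide: From Kubernetes 1.27, where the feature is generally available, you can set --seccomp-default on each node's kubelet to apply RuntimeDefault to every Pod that does not specify a seccomp profile. This is a low-friction cluster-wide win that blocks dozens of dangerous syscalls without touching any workload manifest. Roll it out to a few nodes first, because a workload that depends on a blocked syscall will start failing once the default applies.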
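The same default can be set in the kubelet configuration file instead of as a command-line flag. A minimal fragment, assuming the kubelet on each node reads a KubeletConfiguration file (merge the field into the existing file rather than replacing it):

```yaml
# Fragment of the kubelet configuration file on a node.
# The file's path and its other fields depend on how the node was provisioned.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Apply the RuntimeDefault seccomp profile to every Pod
# that does not set seccompProfile itself.
seccompDefault: true
```

The setting is per node, so coverage is only cluster-wide once every node pool carries it.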

Container Drift Detection

An immutable container image is a security contract: what you scanned and signed at build time is exactly what runs in production. Container drift breaks that contract when a process installs a tool, downloads a binary, or modifies a config file inside the running container's writable layer. Even a legitimate developer running kubectl exec ... apt-get install curl for debugging creates drift, and that drift persists until the container is restarted.

Drift detection approaches range from lightweight to comprehensive:

  • Falco rules (layer 1): Detect writes to binary directories, package manager executions, or new binary downloads via syscall events — fires within milliseconds.
  • Read-only root filesystems (layer 2): Set readOnlyRootFilesystem: true in the container's securityContext. Any write attempt returns EROFS. Combined with emptyDir or tmpfs mounts for paths that genuinely need writes (logs, caches), this eliminates an entire class of drift.
  • Image digest pinning (layer 3): Reference images by SHA256 digest (myorg/api@sha256:abc123...) rather than a mutable tag. A drifted tag can silently pull a different image on Pod restart; a digest cannot.
  • Admission-time attestation (layer 4): Tools like Sigstore/Cosign + Kyverno or OPA enforce that only images signed by your CI pipeline are admitted. Drift via image substitution is blocked entirely.
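Layers 2 and 3 can be combined in a single manifest. The following is a sketch only: the Pod name and mount paths are hypothetical, and <digest> is a placeholder for the real image digest produced by your build.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api-readonly                      # hypothetical name
spec:
  containers:
    - name: app
      # Pinned by digest; replace <digest> with the value from your registry.
      image: myorg/api@sha256:<digest>
      securityContext:
        readOnlyRootFilesystem: true      # writes to the image layer fail with EROFS
        allowPrivilegeEscalation: false
      volumeMounts:
        - name: tmp
          mountPath: /tmp                 # scratch space the app may write to
        - name: cache
          mountPath: /var/cache/app       # hypothetical cache path
  volumes:
    - name: tmp
      emptyDir:
        medium: Memory                    # tmpfs-backed, gone when the Pod is deleted
        sizeLimit: 64Mi
    - name: cache
      emptyDir: {}
```

Only the two mounted paths are writable, so anything written elsewhere is either blocked outright or visible as an anomaly.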
The kubectl exec audit gap: Most security teams monitor image deployments but forget that kubectl exec into a running Pod sidesteps image scanning and any admission policy that only inspects Pod creation. A developer who installs a package inside a container has added unscanned, unsigned code to a production workload. Ensure your Kubernetes audit policy (--audit-policy-file) logs exec events at the Request level and pipes them into your SIEM. Falco also has a default rule for this: Terminal shell in container.
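A minimal audit policy along these lines might look as follows. It is a sketch rather than a complete policy: the catch-all Metadata rule at the end is an assumption about what the rest of the cluster should log, and including attach and portforward alongside exec is a suggestion, not something the exec gap strictly requires.

```yaml
# File passed to the API server via --audit-policy-file
apiVersion: audit.k8s.io/v1
kind: Policy
# Skip the RequestReceived stage to avoid duplicate entries per request.
omitStages:
  - RequestReceived
rules:
  # Log interactive access to running Pods at the Request level.
  - level: Request
    resources:
      - group: ""                         # core API group
        resources: ["pods/exec", "pods/attach", "pods/portforward"]
  # Everything else: metadata only (assumed baseline).
  - level: Metadata
```

Rules are evaluated in order and the first match wins, so the exec rule has to come before the catch-all.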

Tying It Together: Detection-to-Response in Production

Detection without automated response just produces alert fatigue. The production pattern at scale is a tiered response loop: Falco fires a webhook to Falcosidekick, which routes CRITICAL events to a response bot. The bot calls the Kubernetes API to cordon the node, isolate the Pod with a network policy, capture a forensic snapshot (container state, open file descriptors via /proc), and page the on-call team. Lower-severity events go to the SIEM for analyst review.
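The isolation step can be sketched as a pre-created deny-all NetworkPolicy that takes effect as soon as the response bot labels a Pod. The namespace and the label key below are hypothetical, and the cluster's CNI plugin must enforce NetworkPolicy for this to do anything.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quarantine
  namespace: production                   # hypothetical namespace
spec:
  # Selects any Pod the response bot has labeled quarantine=true.
  podSelector:
    matchLabels:
      quarantine: "true"                  # hypothetical label
  # Both directions are listed with no allow rules,
  # so all ingress and egress for selected Pods is denied.
  policyTypes:
    - Ingress
    - Egress
```

NetworkPolicy rules are additive, so this isolates the Pod only if no other policy still selects it and allows traffic. A response bot would typically also remove the Pod's application labels, which has the side effect of detaching it from its Service and controller while keeping it alive for forensics.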

Invest in tuning rule noise before you wire up automated response. A rule with a 5% false-positive rate that kills production Pods will erode trust faster than the threat it was meant to stop. Start in alert-only mode, build a two-week baseline, suppress known-benign patterns with Falco's exceptions field, then switch on response automation.
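As an illustration of the exceptions field, the sketch below defines a rule and carves out one known-benign pattern found during the baseline period. The rule name, exception name, and image repository are hypothetical, and the rule assumes the default spawned_process and container macros are loaded.

```yaml
- rule: Package manager run in container
  desc: A package manager was executed inside a running container
  condition: >
    spawned_process and container
    and proc.name in (apt, apt-get, yum, dnf, apk, pip, npm)
  # Each exception is a tuple of fields, comparison operators, and values.
  # The rule does not fire when every field in a tuple matches.
  exceptions:
    - name: trusted_build_images          # hypothetical exception
      fields: [container.image.repository, proc.name]
      comps: [=, in]
      values:
        - [myorg/ci-builder, [pip, npm]]  # hypothetical CI image that installs deps at runtime
  output: >
    Package manager executed in container
    (user=%user.name cmd=%proc.cmdline
    container=%container.name image=%container.image.repository)
  priority: NOTICE
  tags: [drift, container]
```

Scoping the exception to both the image and the process name keeps it narrow: the same image running apt-get would still raise an alert.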
