Multi-Tenancy & Guardrails in Platforms
When a platform team serves dozens or hundreds of product teams from shared infrastructure, every design decision about tenant isolation, quotas, and secure defaults has blast-radius implications. A misconfigured namespace can starve a revenue-critical service of CPU; a missing NetworkPolicy lets a compromised pod exfiltrate secrets across team boundaries; an absent LimitRange allows a runaway container to OOM the entire node. At Google scale, Borg enforces per-cell quota through the Borg Master; at Stripe, every microservice inherits a Kubernetes Namespace with pre-baked RBAC, ResourceQuotas, and NetworkPolicies generated by their internal "service provisioner." This lesson covers how to build that discipline into your own platform.
Tenant Isolation Models
Platform tenants are typically mapped to one of three boundaries, each offering a different isolation/cost trade-off:
- Namespace-per-team: the most common Kubernetes model. Teams share a cluster but are separated by RBAC, NetworkPolicy, and ResourceQuota. Cost-efficient; isolation is logical, not physical. A node compromise crosses namespace lines.
- Node-pool-per-tenant: workloads for sensitive teams land on dedicated node pools via nodeSelector or nodeAffinity combined with Taint/Toleration. Noisy-neighbour risk is gone; node cost is higher. Common for PCI/HIPAA tenants.
- Cluster-per-tenant: full isolation. Used by hyperscalers offering managed Kubernetes (EKS, GKE) to external customers, or internally for compliance domains. vCluster and Loft bring this model back inside a single host cluster at near-namespace cost.
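To make the node-pool-per-tenant model concrete, here is a minimal sketch of how a provisioner might pin a pod spec to a tenant's dedicated pool. The label/taint key platform.example.com/tenant and the function name are hypothetical; it assumes the pool's nodes carry that key both as a label and as a NoSchedule taint.

```python
import copy

# Hypothetical key: assumed to be set on the tenant's nodes both as a label
# (matched by nodeSelector) and as a NoSchedule taint (matched by the toleration).
TENANT_KEY = "platform.example.com/tenant"


def pin_to_tenant_pool(pod_spec: dict, tenant: str) -> dict:
    """Return a copy of pod_spec that can only schedule onto the tenant's pool."""
    spec = copy.deepcopy(pod_spec)

    # nodeSelector steers the pod towards the dedicated nodes.
    spec.setdefault("nodeSelector", {})[TENANT_KEY] = tenant

    # The toleration lets the pod past the taint that keeps other tenants off.
    toleration = {
        "key": TENANT_KEY,
        "operator": "Equal",
        "value": tenant,
        "effect": "NoSchedule",
    }
    tolerations = spec.setdefault("tolerations", [])
    if toleration not in tolerations:
        tolerations.append(toleration)
    return spec


example = pin_to_tenant_pool({"containers": [{"name": "api"}]}, "payments")
print(example["nodeSelector"], example["tolerations"])
```

Both halves are needed: the taint keeps other tenants' pods off the pool, while the selector keeps this tenant's pods from drifting onto shared nodes.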
Kubernetes ResourceQuota and LimitRange
Every namespace a platform provisions should receive both a ResourceQuota (hard caps on the namespace's aggregate consumption) and a LimitRange (per-container defaults and maximums). Without a LimitRange, a pod that omits requests and limits is either rejected outright, if the namespace quota tracks CPU or memory, or, in a namespace with no compute quota, runs with unbounded CPU/memory. The unbounded case is the most common cause of noisy-neighbour incidents on shared clusters.
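As an illustration, the pair of objects could be rendered by a small provisioner function like the sketch below. The object names and every quantity are placeholder assumptions, not recommended values; the field names follow the core v1 ResourceQuota and LimitRange schemas.

```python
import json


def namespace_guardrails(namespace: str) -> list:
    """Build a ResourceQuota and a LimitRange for one tenant namespace."""
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "team-quota", "namespace": namespace},
        "spec": {
            # Aggregate caps across every pod in the namespace (placeholder sizes).
            "hard": {
                "requests.cpu": "20",
                "requests.memory": "40Gi",
                "limits.cpu": "40",
                "limits.memory": "80Gi",
                "pods": "100",
            }
        },
    }
    limit_range = {
        "apiVersion": "v1",
        "kind": "LimitRange",
        "metadata": {"name": "container-defaults", "namespace": namespace},
        "spec": {
            "limits": [
                {
                    "type": "Container",
                    # Applied when a container omits requests or limits.
                    "defaultRequest": {"cpu": "100m", "memory": "128Mi"},
                    "default": {"cpu": "500m", "memory": "512Mi"},
                    # Upper bound for any single container.
                    "max": {"cpu": "4", "memory": "8Gi"},
                }
            ]
        },
    }
    return [quota, limit_range]


print(json.dumps(namespace_guardrails("team-payments"), indent=2))
```

Because the LimitRange fills in missing requests and limits, pods that omit them still count against the quota instead of being rejected.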
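Enforce quota exhaustion alerting: a Prometheus rule on kube_resourcequota{type="used"} / ignoring(type) kube_resourcequota{type="hard"} > 0.85 fires a warning before the namespace is full and the API server starts rejecting new pods. The ignoring(type) modifier is required because the two series differ in their type label and would otherwise never match. Without this alert, engineers discover the quota limit only when a rollout stalls during an incident because the ReplicaSet can no longer create pods.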
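The arithmetic behind that rule is a join of "used" and "hard" samples on namespace and resource, followed by a threshold check. The sketch below reproduces it over plain tuples; the tuple layout is an assumption chosen to mirror the kube_resourcequota labels, not an actual Prometheus client API.

```python
def quota_alerts(samples, threshold=0.85):
    """Return (namespace, resource, ratio) for every quota above the threshold.

    samples: iterable of (namespace, resource, type, value) tuples, where type
    is "used" or "hard" and value is already in base units.
    """
    used, hard = {}, {}
    for namespace, resource, kind, value in samples:
        if kind == "used":
            used[(namespace, resource)] = value
        elif kind == "hard":
            hard[(namespace, resource)] = value

    alerts = []
    for key, limit in hard.items():
        # Skip zero limits and quotas that have no matching "used" sample.
        if limit > 0 and key in used:
            ratio = used[key] / limit
            if ratio > threshold:
                alerts.append((key[0], key[1], round(ratio, 3)))
    return sorted(alerts)


samples = [
    ("payments", "requests.cpu", "used", 18.0),
    ("payments", "requests.cpu", "hard", 20.0),
    ("payments", "requests.memory", "used", 10.0),
    ("payments", "requests.memory", "hard", 40.0),
]
print(quota_alerts(samples))
```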
Network Isolation with NetworkPolicy
By default, Kubernetes allows all pod-to-pod traffic. In a multi-tenant platform you must invert this: deny all by default, then add explicit allow rules. The baseline policy below is templated into every new namespace by the platform provisioner.
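A minimal sketch of such a baseline, written as functions a provisioner might call per namespace. The policy names are hypothetical, and the DNS rule assumes cluster DNS runs in kube-system; the kubernetes.io/metadata.name label it selects on is set automatically on namespaces by recent Kubernetes versions.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Deny all ingress and egress for every pod in the namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector matches every pod in the namespace.
            "podSelector": {},
            # Listing both types with no rules means nothing is allowed.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def allow_dns_egress(namespace: str) -> dict:
    """Re-open DNS lookups, which a blanket egress deny would otherwise break."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-dns-egress", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [
                {
                    # Assumption: cluster DNS lives in kube-system.
                    "to": [
                        {
                            "namespaceSelector": {
                                "matchLabels": {
                                    "kubernetes.io/metadata.name": "kube-system"
                                }
                            }
                        }
                    ],
                    "ports": [
                        {"protocol": "UDP", "port": 53},
                        {"protocol": "TCP", "port": 53},
                    ],
                }
            ],
        },
    }


for policy in (default_deny_policy("team-payments"), allow_dns_egress("team-payments")):
    print(json.dumps(policy, indent=2))
```

NetworkPolicies are additive, so each team-specific allow rule is a further policy layered on top of the deny-all baseline rather than an edit to it.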
Validate the policy with a netpol-tester pod that probes cross-namespace connectivity and asserts it is blocked.
Admission Control: Policy as Guardrails
ResourceQuota and NetworkPolicy are reactive — they constrain what already exists. Admission policies are proactive: they block or mutate workloads at creation time. A mature platform uses Kyverno or OPA/Gatekeeper to enforce the rules that would otherwise require human code review at scale.
Critical policies every platform should ship out of the box:
- Require non-root containers: deny any pod where securityContext.runAsNonRoot != true.
- Disallow privileged containers: deny securityContext.privileged: true and allowPrivilegeEscalation: true.
- Require image digest or allowed registry: deny the latest tag; only allow pulls from your internal registry or a curated allowlist.
- Require resource requests and limits: deny pods that would bypass your LimitRange defaults (belt-and-suspenders).
- Require team labels: deny any workload missing the app.kubernetes.io/team and app.kubernetes.io/service labels (enables cost attribution and on-call routing).
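The logic of these five policies can be sketched as a single validation function over a pod manifest. This is an illustration of the checks, not Kyverno or Gatekeeper syntax; the registry prefix and the function name are hypothetical.

```python
# Hypothetical allowlist; substitute your own registry prefixes.
ALLOWED_REGISTRIES = ("registry.internal.example.com/",)
REQUIRED_LABELS = ("app.kubernetes.io/team", "app.kubernetes.io/service")


def violations(pod: dict) -> list:
    """Return human-readable policy violations for a pod manifest."""
    found = []
    labels = pod.get("metadata", {}).get("labels", {})
    for label in REQUIRED_LABELS:
        if not labels.get(label):
            found.append(f"missing label {label}")

    spec = pod.get("spec", {})
    pod_sc = spec.get("securityContext", {})
    for c in spec.get("initContainers", []) + spec.get("containers", []):
        name = c.get("name", "?")
        sc = c.get("securityContext", {})

        # The container-level setting overrides the pod-level one.
        if sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot")) is not True:
            found.append(f"{name}: runAsNonRoot must be true")
        if sc.get("privileged") is True:
            found.append(f"{name}: privileged is not allowed")
        if sc.get("allowPrivilegeEscalation") is True:
            found.append(f"{name}: allowPrivilegeEscalation is not allowed")

        image = c.get("image", "")
        if not image.startswith(ALLOWED_REGISTRIES):
            found.append(f"{name}: image registry is not on the allowlist")
        if "@sha256:" not in image:
            # An image with no tag resolves to latest implicitly.
            last_segment = image.rsplit("/", 1)[-1]
            tag = last_segment.split(":", 1)[1] if ":" in last_segment else "latest"
            if tag == "latest":
                found.append(f"{name}: latest tag is not allowed")

        resources = c.get("resources", {})
        for section in ("requests", "limits"):
            for resource in ("cpu", "memory"):
                if resource not in resources.get(section, {}):
                    found.append(f"{name}: missing resources.{section}.{resource}")
    return found


bad_pod = {"spec": {"containers": [{"name": "web", "image": "nginx"}]}}
for problem in violations(bad_pod):
    print(problem)
```

Returning every violation at once, rather than stopping at the first, spares developers a fix-and-retry loop against the admission webhook.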
Secure Defaults: The Baseline Security Context
Rather than relying on developers to write correct security contexts, the platform should inject secure defaults into pod specs at admission time, for example with a Kyverno mutate rule. Kubernetes Pod Security Admission complements this: labelling a namespace with pod-security.kubernetes.io/enforce: restricted applies the restricted Pod Security Standard, which rejects non-compliant pods but does not mutate them.
At minimum, every workload baseline should include:
- runAsNonRoot: true
- readOnlyRootFilesystem: true
- allowPrivilegeEscalation: false
- seccompProfile.type: RuntimeDefault (uses the container runtime's default seccomp filter)
- capabilities.drop: [ALL], adding back only what is explicitly required (e.g., NET_BIND_SERVICE for port 80)
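The mutation itself amounts to filling in any field the developer left unset. The following sketch shows that "add if absent" behaviour in plain Python; it is an illustration of the idea under that assumption, not a Kyverno rule, and the function name is hypothetical.

```python
import copy

SCALAR_DEFAULTS = {
    "runAsNonRoot": True,
    "readOnlyRootFilesystem": True,
    "allowPrivilegeEscalation": False,
}


def apply_secure_defaults(pod_spec: dict) -> dict:
    """Return a copy of pod_spec with the baseline security context filled in.

    Values the developer set explicitly are left alone; rejecting unsafe
    explicit values is the job of a separate validation policy.
    """
    spec = copy.deepcopy(pod_spec)
    for container in spec.get("containers", []):
        sc = container.setdefault("securityContext", {})
        for key, value in SCALAR_DEFAULTS.items():
            sc.setdefault(key, value)
        sc.setdefault("seccompProfile", {"type": "RuntimeDefault"})
        # Keep any explicit capabilities.add, but make sure ALL is dropped first.
        sc.setdefault("capabilities", {}).setdefault("drop", ["ALL"])
    return spec


web = {
    "containers": [
        {
            "name": "web",
            "securityContext": {"capabilities": {"add": ["NET_BIND_SERVICE"]}},
        }
    ]
}
print(apply_secure_defaults(web)["containers"][0]["securityContext"])
```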
Cost Attribution and Chargeback
Multi-tenancy without cost visibility creates a tragedy of the commons: teams over-provision because they do not see the bill. The platform should emit per-namespace cost data. OpenCost (CNCF) or Kubecost can be deployed as a platform service to produce per-team spend reports ingested into your internal cost dashboard. The mandatory app.kubernetes.io/team label enforced by your Kyverno policy is the join key.
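To show how the label acts as a join key, here is a small sketch that rolls per-workload cost records up to teams. The record shape (a dict with labels and cost) is a made-up illustration and not the OpenCost or Kubecost API; a real pipeline would read their exported allocation data instead.

```python
from collections import defaultdict

TEAM_LABEL = "app.kubernetes.io/team"


def spend_by_team(workload_costs, team_label=TEAM_LABEL):
    """Sum cost per team; workloads without the label land in 'unattributed'."""
    totals = defaultdict(float)
    for record in workload_costs:
        team = record.get("labels", {}).get(team_label, "unattributed")
        totals[team] += record["cost"]
    return dict(totals)


# Hypothetical daily cost records, one per workload.
records = [
    {"labels": {TEAM_LABEL: "payments"}, "cost": 12.5},
    {"labels": {TEAM_LABEL: "payments"}, "cost": 7.5},
    {"labels": {TEAM_LABEL: "search"}, "cost": 4.0},
    {"labels": {}, "cost": 1.0},
]
print(spend_by_team(records))
```

A non-empty "unattributed" bucket is a useful signal that some workload slipped past the label policy.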
At Amazon, every service has a mandatory cost tag that gates deployment — a workload without a valid cost centre cannot be deployed to production. Your admission policy enforcing team labels is the Kubernetes equivalent of that gate.
vCluster for Stronger Isolation Without Full Cluster Cost
When a team needs cluster-level isolation (a separate API server, a separate RBAC universe, separate admission webhooks) but provisioning a full EKS/GKE cluster per team is prohibitively expensive, vCluster (by Loft Labs) virtualises a Kubernetes control plane inside a namespace of the host cluster. The tenant sees a real API server; the host cluster sees only pods. This is the pattern Datadog and several large SaaS platforms use for their own multi-tenant Kubernetes offerings.
Guardrail Governance: Audit, Override, and Escape Hatches
Rigid policies that block legitimate work generate shadow-IT: engineers work around the platform instead of with it. Every guardrail should have a documented, audited escape hatch. In Kyverno, policies can run in Audit mode (violations logged, not blocked) before promotion to Enforce. A platform namespace annotation — platform.io/policy-exception: approved-by=security-team,ticket=SEC-1234 — combined with a Kyverno PolicyException object provides an audited override without removing the policy cluster-wide.
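An override is only audited if the annotation is actually checked. A minimal sketch of parsing and validating the annotation format shown above follows; the function name and the choice to treat approved-by and ticket as mandatory are assumptions.

```python
EXCEPTION_ANNOTATION = "platform.io/policy-exception"
REQUIRED_FIELDS = {"approved-by", "ticket"}


def parse_policy_exception(annotations: dict):
    """Parse 'approved-by=...,ticket=...' into a dict.

    Returns None when the annotation is absent and raises ValueError when it
    is present but lacks an approver or a ticket reference.
    """
    raw = annotations.get(EXCEPTION_ANNOTATION)
    if raw is None:
        return None

    fields = {}
    for part in raw.split(","):
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()

    missing = sorted(f for f in REQUIRED_FIELDS if not fields.get(f))
    if missing:
        raise ValueError(f"policy exception is missing: {', '.join(missing)}")
    return fields


print(parse_policy_exception(
    {EXCEPTION_ANNOTATION: "approved-by=security-team,ticket=SEC-1234"}
))
```

Failing loudly on a malformed annotation keeps an incomplete exception from being silently treated as approved.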