Multi-Tenancy Patterns
Multi-tenancy is the practice of running workloads for multiple teams, products, or customers on shared Kubernetes infrastructure. Done right, it cuts cost and operational overhead dramatically. Done wrong, it creates blast-radius problems where a noisy neighbor crashes your SLA or a misconfigured pod escapes its boundary. This lesson examines the three isolation levels used in production, the namespace primitives that enforce them, and the hard lessons big-tech platform teams have learned running shared clusters at scale.
The Isolation Spectrum: Namespace vs. Cluster
Kubernetes multi-tenancy sits on a spectrum between two extremes, with a middle option between them:
- Soft multi-tenancy (namespace-per-tenant): Multiple tenants share one cluster, separated by namespaces, RBAC, NetworkPolicies, and resource quotas. The kernel, container runtime, and control plane are shared. This is the default model at most big-tech platform teams for internal tenants (product teams) who are trusted but must be isolated from each other for cost allocation, blast radius, and security policy.
- Hard multi-tenancy (cluster-per-tenant): Each tenant gets a dedicated cluster — or at minimum a dedicated node pool with strict node selectors. Used when tenants are external customers, compliance domains differ (PCI vs non-PCI), or blast radius from a kernel exploit is unacceptable. The cost is operational: N clusters multiplies your upgrade, certificate rotation, and monitoring burden by N.
- Virtual clusters (vcluster): A middle ground. A fully functional Kubernetes API server runs inside a namespace of a host cluster, with its own control plane and its own backing data store (etcd or an embedded database, depending on how it is configured), but shares the host's node pool. Workloads run as regular pods on host nodes. Tenants see a real cluster; the platform team manages one set of nodes.
Namespaces: The Foundation of Soft Multi-Tenancy
A Kubernetes namespace is a logical partition of the API server's object store. Most resources (Pods, Services, ConfigMaps, Secrets, Deployments, ServiceAccounts) are namespaced; a handful are cluster-scoped (Nodes, PersistentVolumes, ClusterRoles, StorageClasses). Namespaces on their own provide no network isolation and no resource isolation: they are purely a naming and RBAC scoping mechanism. The enforcement primitives are layered on top.
ResourceQuotas: Hard Limits on Namespace Consumption
A ResourceQuota object enforces aggregate limits on a namespace. Once a quota constrains a compute resource such as CPU or memory, every new Pod must declare explicit requests or limits for that resource; the ResourceQuota admission controller in the API server rejects any Pod that omits them. This is intentional: it prevents teams from accidentally starving the cluster by deploying workloads that escape quota accounting.
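As an illustration of the aggregate limits described above, here is a minimal ResourceQuota sketch. The namespace name team-a, the object name, and every number are hypothetical placeholders, not recommendations.

```yaml
# Hypothetical quota for an assumed tenant namespace "team-a".
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota        # assumed name
  namespace: team-a         # assumed tenant namespace
spec:
  hard:
    requests.cpu: "10"      # sum of CPU requests across all Pods
    requests.memory: 20Gi   # sum of memory requests across all Pods
    limits.cpu: "20"        # sum of CPU limits across all Pods
    limits.memory: 40Gi     # sum of memory limits across all Pods
    pods: "50"              # maximum number of Pods in the namespace
    persistentvolumeclaims: "10"
```

Because this quota constrains both requests and limits for CPU and memory, a Pod in team-a that omits any of those four values would be rejected at admission unless a LimitRange supplies defaults.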
LimitRanges: Per-Pod and Per-Container Defaults
Quotas enforce aggregate totals. LimitRange objects enforce per-Pod and per-container bounds: minimum, maximum, and default values applied automatically when a container does not specify its own. Without LimitRange defaults, a team that forgets to set requests/limits would fail at quota admission, causing confusing errors. With LimitRange defaults, they get sensible values injected automatically.
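A LimitRange along these lines could supply the defaults and bounds just described. This is a sketch: the namespace team-a, the object name, and the specific values are assumptions chosen for illustration.

```yaml
# Hypothetical per-container defaults and bounds for namespace "team-a".
apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-defaults     # assumed name
  namespace: team-a         # assumed tenant namespace
spec:
  limits:
    - type: Container
      defaultRequest:       # injected as requests when a container sets none
        cpu: 100m
        memory: 128Mi
      default:              # injected as limits when a container sets none
        cpu: 500m
        memory: 512Mi
      min:                  # containers asking for less are rejected
        cpu: 50m
        memory: 64Mi
      max:                  # containers asking for more are rejected
        cpu: "2"
        memory: 2Gi
```

Note that default sets limits while defaultRequest sets requests; the two are easy to confuse.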
Network Isolation with NetworkPolicies
By default, all pods in a cluster can reach all other pods, regardless of namespace. For multi-tenancy, you need to lock this down. The standard pattern is a default-deny policy per namespace, then explicit allow rules for permitted traffic. Without this, a compromised pod in one namespace can reach databases in every other namespace.
NetworkPolicies only take effect when the cluster's network plugin enforces them, so confirm NetworkPolicy enforcement and test with a connectivity probe after applying a deny policy.
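The default-deny pattern might look like the following sketch. The namespace team-a and the policy names are hypothetical, and the DNS rule assumes the cluster DNS runs in kube-system and listens on port 53; adjust both for your environment.

```yaml
# Policy 1: deny all ingress and egress for every Pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all          # assumed name
  namespace: team-a               # assumed tenant namespace
spec:
  podSelector: {}                 # empty selector = all Pods in the namespace
  policyTypes:
    - Ingress
    - Egress
---
# Policy 2: explicitly allow traffic within the namespace, plus DNS lookups.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-intra-namespace-and-dns   # assumed name
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector: {}         # any Pod in this same namespace
  egress:
    - to:
        - podSelector: {}         # any Pod in this same namespace
    - to:
        - namespaceSelector:      # assumption: cluster DNS lives in kube-system
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

NetworkPolicies are additive, so the second policy opens only the listed paths while everything else stays denied by the first. Without the DNS rule, a deny-all egress policy typically breaks service discovery for every Pod in the namespace.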
RBAC Scoping: One ServiceAccount per Team
Each tenant namespace should have its own ServiceAccount, a Role granting the minimum permissions that team needs, and a RoleBinding that ties the team's group (from your IdP — Okta, Google Workspace, Azure AD) to that Role. Never bind the cluster-admin ClusterRole to a team — they should not be able to read Secrets in other namespaces or mutate Nodes.
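A minimal sketch of that ServiceAccount, Role, and RoleBinding trio follows. The names, the group team-a-developers (which would come from your IdP), and the permission list are all hypothetical; the right set of resources and verbs depends on what the team actually needs.

```yaml
# ServiceAccount for the team's workloads or CI (assumed name).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: team-a-deployer
  namespace: team-a
---
# Role: namespace-scoped permissions only. It cannot grant access
# to other namespaces or to cluster-scoped objects such as Nodes.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: team-a-developer
  namespace: team-a
rules:
  - apiGroups: [""]               # core API group
    resources: ["pods", "services", "configmaps"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
# RoleBinding: ties the IdP group and the ServiceAccount to the Role.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-developer-binding
  namespace: team-a
subjects:
  - kind: Group
    name: team-a-developers       # hypothetical group name from the IdP
    apiGroup: rbac.authorization.k8s.io
  - kind: ServiceAccount
    name: team-a-deployer
    namespace: team-a
roleRef:
  kind: Role
  name: team-a-developer
  apiGroup: rbac.authorization.k8s.io
```

Secrets are deliberately left out of the Role here; grant access to them only where a team demonstrably needs it.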
Production Failure Modes
- Missing LimitRange and missing ResourceQuota: A Pod with no resource requests can be scheduled on any node, however loaded that node already is. On a node running 40 such pods, one Pod bursting to 2 CPU starves the other 39. Always have both a LimitRange default and a ResourceQuota in every tenant namespace.
- Over-provisioned quotas: Teams often request 10x what they need "just in case." The cluster then appears full even though average utilization is 20%. Enforce a quota review process: start conservative, increase on evidence of need, and use kubectl describe quota -n <ns> weekly to right-size.
- Lateral movement via shared ServiceAccount tokens: Pods in namespace A that can call the Kubernetes API and have a permissive ClusterRole can list Secrets in namespace B. Audit every ClusterRoleBinding monthly — they are the most common privilege-escalation path in shared clusters.
- NodePort abuse: A tenant who can create a Service of type NodePort can expose a port on every node in the cluster, bypassing NetworkPolicies. Use an OPA/Gatekeeper policy to block NodePort and LoadBalancer Services in tenant namespaces unless explicitly approved.
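The Gatekeeper policy mentioned in the NodePort item could be sketched as below. This is an assumption-laden example rather than a reference policy: the template and constraint names are invented, the Rego is a small adaptation of the commonly used block-NodePort pattern, and the tenant: "true" namespace label is a hypothetical convention for marking tenant namespaces.

```yaml
# ConstraintTemplate: defines the rule logic in Rego.
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sblockexternalservices      # assumed name (lowercase of the kind)
spec:
  crd:
    spec:
      names:
        kind: K8sBlockExternalServices
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sblockexternalservices

        # Service types that expose ports outside the cluster network.
        blocked_types := {"NodePort", "LoadBalancer"}

        violation[{"msg": msg}] {
          input.review.kind.kind == "Service"
          svc_type := input.review.object.spec.type
          blocked_types[svc_type]
          msg := sprintf("Service type %v is not allowed in tenant namespaces", [svc_type])
        }
---
# Constraint: applies the template to Services in labeled tenant namespaces.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sBlockExternalServices
metadata:
  name: block-external-services-for-tenants
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Service"]
    namespaceSelector:
      matchLabels:
        tenant: "true"                # hypothetical label on tenant namespaces
```

An explicit approval path could then be modeled by leaving the label off approved namespaces or by listing them under the constraint's excludedNamespaces field.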
Structuring Namespaces at Scale
At big-tech scale (hundreds of namespaces), a flat kubectl get namespaces list becomes unmanageable. The mature pattern is a namespace hierarchy enforced via a platform tool:
- Hierarchical Namespace Controller (HNC): Developed by Google, now a Kubernetes SIG project. Lets you define a tree of namespaces (org > division > team > environment) and propagate objects (RoleBindings, NetworkPolicies, LimitRanges) down the tree automatically. Changes at the division level fan out to all team namespaces beneath it.
- Labels as the source of truth: Every namespace carries standardized labels (team, env, cost-center, tier). Gatekeeper policies and Kyverno rules select namespaces by label, so your platform team can apply security policies to all env=production namespaces with a single ClusterPolicy.
- Pull-request provisioning: Teams do not use kubectl directly to create namespaces; they open a pull request.
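A namespace definition carrying the standardized labels, of the kind a team might submit in such a pull request, could look like this sketch. The namespace name and all label values are hypothetical.

```yaml
# Hypothetical tenant namespace with the standardized label set.
apiVersion: v1
kind: Namespace
metadata:
  name: payments-prod         # assumed naming scheme: <team>-<environment>
  labels:
    team: payments            # owning team
    env: production           # selected by policies targeting env=production
    cost-center: cc-1234      # placeholder cost-allocation code
    tier: critical            # placeholder service tier
```

Keeping the label keys identical across every namespace is what lets a single label selector in a policy cover all matching tenants.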