Open Policy Agent & Rego
Open Policy Agent (OPA) is the de facto standard for policy enforcement across the cloud-native stack. Originally built at Styra and donated to the CNCF, it reached the CNCF's Graduated maturity level in 2021 and is now embedded in Kubernetes admission control (Gatekeeper), service meshes (Istio AuthorizationPolicy integration), API gateways (Kong, Envoy ext_authz), Terraform plan validation (Conftest), and dozens of commercial products. The central idea is decoupled policy: instead of baking access logic into every service, you push all decisions to a single, independently deployable engine that speaks one language, Rego.
OPA Architecture
OPA runs as a sidecar process or standalone service. Consumers send a JSON query (what do you want to know?) together with a JSON input (the thing being evaluated). OPA evaluates the query against its loaded policy bundle and the optional data document (context such as user roles or allowlists), then returns a JSON decision. Nothing about that exchange is Kubernetes-specific.
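The exchange can be sketched with a small module. The package name, the input shape, and the layout of the data document below are hypothetical and chosen only for illustration; the syntax is the pre-1.0 Rego style used throughout this section.

```rego
package example.authz

# Assumed input (sent by the caller with each query):
#   {"user": "alice", "action": "read"}
# Assumed data document (loaded into OPA ahead of time):
#   {"example": {"roles": {"alice": ["reader"]}}}

# The decision is false unless a rule body below succeeds.
default allow = false

# allow is true when the request is a read and the user holds the
# "reader" role in the data document.
allow {
    input.action == "read"
    data.example.roles[input.user][_] == "reader"
}
```

A caller would query data.example.authz.allow and receive true or false as the JSON decision. Nothing in the module refers to Kubernetes.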
Three deployment patterns cover essentially all production use cases:
- Sidecar: OPA runs as a container alongside every application pod. Low latency (loopback), isolated blast radius, slightly higher resource overhead. Used by Envoy ext_authz, Linkerd, and Consul.
- Admission webhook (Gatekeeper): OPA runs in the cluster and is registered with the API server as a ValidatingAdmissionWebhook. Every write that matches the webhook configuration is synchronously evaluated before the resource is persisted. This is the primary Kubernetes enforcement point.
- Daemon / standalone: A central OPA service evaluates queries from many callers. Appropriate when you want a single auditable decision log. Must be treated as a critical-path dependency: if it goes down, every caller either fails open or closed depending on configuration.
Rego: The Policy Language
Rego is a declarative, logic-based language purpose-built for policy. It is not imperative: you do not write if/else trees and mutate state. Instead, you write logical relationships that the Rego evaluator resolves to a value. The model is closest to Datalog (a subset of Prolog), which makes Rego feel unfamiliar to engineers who have only written procedural code. The payoff is that evaluation is deterministic: the same policy, input, and data always produce the same decision (provided the policy avoids non-deterministic built-ins such as time.now_ns or http.send), and the evaluator can explain exactly why a decision was reached.
Rego Fundamentals: Rules, References, Comprehensions
A Rego file is a module with a package declaration. Rules are equations. The body of a rule is a set of expressions that must all be true (conjunctive logic — AND). Multiple rules with the same head are alternatives (disjunctive logic — OR). You query a value via dot-notation on the document tree (input.spec.containers[_]).
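As a minimal sketch of the AND/OR semantics described above (the package name and input fields are hypothetical):

```rego
package example.rules

# Assumed input:
#   {"method": "GET", "user": {"name": "alice", "groups": ["ops"]}}

default allow = false

# Every expression in a body must hold (AND): the request is a GET
# and the user belongs to the "ops" group.
allow {
    input.method == "GET"
    input.user.groups[_] == "ops"
}

# A second rule with the same head is an alternative (OR): the user
# named "admin" is allowed regardless of method.
allow {
    input.user.name == "admin"
}
```

allow is true if either body succeeds; if neither does, the default keeps it at false.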
Key Rego idioms that every engineer working with OPA must know:
- The wildcard iterator [_]: iterates over all elements of an array. input.spec.containers[_].image is true for any container whose image satisfies the rest of the rule.
- some x in collection: the modern idiomatic alternative (requires import future.keywords.in on OPA releases before 1.0). Prefer this over [_] in new policies for readability.
- Negation with not: true when the negated expression is undefined or false. not input.metadata.labels.team is true when the team label is absent, but also when it is explicitly set to false. A label set to null or to an empty string is still a defined value, so the negation does not fire. Understand this before writing security rules.
- Default rules: default allow = false provides a safe fallback for complete rules, so the rule evaluates to false rather than undefined when no body matches. Always declare defaults on security-relevant rules.
- Partial rules and comprehensions: violation[msg] { ... } builds a set of all matching messages. The Gatekeeper pattern relies on this: it collects all violations from a resource in one query rather than stopping at the first.
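The idioms above can be seen side by side in one small module. This is an illustrative sketch only: the package name, the "latest tag" check, and the rule names are assumptions, and the syntax is the pre-1.0 style (OPA 1.0 and later expect the if and contains keywords instead).

```rego
package example.idioms

import future.keywords.in

# Wildcard iterator: succeeds if any container image ends in ":latest".
uses_latest {
    endswith(input.spec.containers[_].image, ":latest")
}

# The same check written with `some ... in`.
uses_latest_in {
    some c in input.spec.containers
    endswith(c.image, ":latest")
}

# Negation: succeeds when the team label is undefined or false.
missing_team {
    not input.metadata.labels.team
}

# Default: deny is false, not undefined, when neither body below matches.
default deny = false

deny {
    uses_latest
}

deny {
    missing_team
}

# Partial set rule: collects one message per offending container.
violation[msg] {
    some c in input.spec.containers
    endswith(c.image, ":latest")
    msg := sprintf("container %v uses the latest tag", [c.name])
}
```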
Writing a Production-Quality Policy
The following policy enforces three common Kubernetes guardrails in a single module: required labels, prohibited image registries, and the absence of root user containers. This is representative of what a real platform team ships.
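A minimal sketch of such a module is shown here, under several assumptions: the input is an AdmissionReview for a Pod (so the object sits at input.request.object), the label names and registry prefixes are hypothetical, and the root check looks only at the container-level securityContext. The package name matches the query path used with opa eval later in this section.

```rego
package platform.admission.baseline

import future.keywords.in

# Hypothetical settings, for illustration only.
required_labels := {"team", "app"}

prohibited_registries := {"docker.io/", "untrusted.example.com/"}

# Assumption: input is an AdmissionReview whose object is a Pod.
pod := input.request.object

# Guardrail 1: every required label must be present.
violation[msg] {
    some label in required_labels
    not pod.metadata.labels[label]
    msg := sprintf("missing required label: %v", [label])
}

# Guardrail 2: no image may start with a prohibited registry prefix.
# This is a plain prefix match; images written without a registry
# (for example "nginx") are not caught by it.
violation[msg] {
    some c in pod.spec.containers
    some prefix in prohibited_registries
    startswith(c.image, prefix)
    msg := sprintf("container %v uses prohibited registry %v", [c.name, prefix])
}

# Guardrail 3: each container must declare runAsNonRoot: true.
# A pod-level securityContext is not considered in this sketch.
violation[msg] {
    some c in pod.spec.containers
    not runs_as_non_root(c)
    msg := sprintf("container %v must set securityContext.runAsNonRoot to true", [c.name])
}

runs_as_non_root(c) {
    c.securityContext.runAsNonRoot == true
}
```

Each guardrail is a separate violation rule, so a single query returns every problem with the resource at once.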
Testing Rego Policies with opa test
A Rego policy without tests is a liability. OPA ships a built-in test runner. Test files live alongside policy files and follow the naming convention *_test.rego. Tests are regular Rego rules whose names start with test_. Any test that evaluates to false or is undefined is a failure.
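As a hedged illustration, the sketch below puts a small hypothetical rule and its tests in one module for brevity; in practice the tests would sit in a separate *_test.rego file next to the policy. The with keyword substitutes a fixture for input during the test.

```rego
package example.labels

# Hypothetical rule under test: report resources without a team label.
violation[msg] {
    not input.metadata.labels.team
    msg := "missing required label: team"
}

# Passes when the expected message is in the violation set.
test_missing_team_label_is_reported {
    violation["missing required label: team"] with input as {"metadata": {"labels": {}}}
}

# Passes when a labelled resource produces no violations.
test_team_label_passes {
    count(violation) == 0 with input as {"metadata": {"labels": {"team": "payments"}}}
}
```

Running opa test against the directory that holds this file executes both rules and reports each as a pass or a failure.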
Tip: use opa eval interactively during policy development. Running opa eval -d policy/ -i input.json 'data.platform.admission.baseline.violation' shows exactly what your policy returns for a given input before you deploy it. The --explain full flag prints the full evaluation trace, which is invaluable when debugging a rule that fires unexpectedly or not at all. At scale, Styra DAS and the OPA VS Code extension add IDE-level coverage analysis.
OPA Bundle Distribution
In production, policies are not baked into the OPA binary. They are distributed as bundles (tar.gz archives of .rego files and data.json documents) pushed to an OCI registry or HTTPS endpoint. OPA polls the bundle endpoint at a configured interval and hot-reloads policies without a restart. This is what makes policy updates a CI operation: merge to main, CI builds and pushes the new bundle, and every OPA instance picks it up on its next poll, typically within seconds when the interval is short.
Warning: undefined is not false. OPA callers that treat undefined as allow (common in hand-rolled integrations) create a bypass: a malformed input that fails to match any rule will be silently permitted. Always use default allow = false or configure your integration to treat undefined decisions as deny. Gatekeeper handles this correctly; double-check any custom OPA REST API caller you write.
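The difference can be checked directly in a test. In this sketch the package name and the role check are hypothetical; the point is that the comparison allow == false only succeeds when allow is defined, so the test would fail if the default line were removed.

```rego
package example.decision

# With this default, allow is false for any input that matches no rule.
default allow = false

allow {
    input.user.role == "admin"
}

# A malformed input (no user field) must yield an explicit false.
# Without the default, allow would be undefined here, the comparison
# would also be undefined, and the test would fail.
test_malformed_input_is_explicit_false {
    allow == false with input as {"unexpected": true}
}

# The intended path still works.
test_admin_is_allowed {
    allow with input as {"user": {"role": "admin"}}
}
```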
OPA in the Kubernetes Admission Path
When you deploy Gatekeeper (covered in the next lesson), OPA is the evaluation engine behind the scenes. Understanding the raw OPA query/response cycle is essential for debugging Gatekeeper failures, writing custom constraint templates, and integrating OPA into non-Kubernetes systems. The Kubernetes API server sends an AdmissionReview JSON object to the webhook; Gatekeeper passes the request to the policy as input (under input.review, alongside the constraint's input.parameters); the policy must produce a violation set; Gatekeeper maps that set to an admission response. An opa eval call you run locally with a real AdmissionReview fixture, wrapped in the same input shape, closely reproduces what Gatekeeper will do in production.
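A sketch of a rule written in the shape Gatekeeper constraint templates use follows. Two things here are assumptions for illustration: the package name and the labels parameter. The conventions it relies on are that the admission request is available under input.review, constraint parameters under input.parameters, and that each violation is an object with a msg field.

```rego
package k8srequiredteamlabel

# Reports one violation per required label missing from the object.
violation[{"msg": msg}] {
    required := input.parameters.labels[_]
    not input.review.object.metadata.labels[required]
    msg := sprintf("missing required label: %v", [required])
}

# Local check with a fixture shaped like Gatekeeper's input.
test_missing_label_is_reported {
    results := violation with input as {
        "parameters": {"labels": ["team"]},
        "review": {"object": {"metadata": {"labels": {"app": "web"}}}},
    }
    count(results) == 1
}
```

The same fixture can be saved as a JSON file and passed to opa eval with -i to inspect the violation set by hand.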