Compliance & Policy as Code

Controls & Evidence

Auditors do not trust assertions — they trust evidence. When a SOC 2 auditor, a PCI QSA, or an internal security team asks "how do you know your systems are compliant?", the answer cannot be "we follow best practices." It must be a timestamped, tamper-evident artifact that proves a specific control operated correctly at a specific time. This lesson teaches you to think like that auditor, design control frameworks that are auditable by construction, and automate the evidence pipeline so that the work of compliance does not land entirely on your team the week before an audit.

What Is a Control?

A control is a safeguard or countermeasure that addresses a specific risk. Every major compliance framework (SOC 2, ISO 27001, PCI DSS, HIPAA, FedRAMP) is ultimately a catalog of controls, each with a control objective: a statement of the desired state the control is meant to maintain. Understanding this vocabulary is a prerequisite to automating compliance work.

Control objective — the outcome the control must achieve. Example: "All administrative access to production databases must require multi-factor authentication." This is the what.
Control activity — the specific action or configuration that achieves the objective. Example: "The RDS cluster is in a VPC with no public endpoint; access is through an IAM-authenticated bastion that enforces MFA via hardware token." This is the how.
Evidence — the artifact that proves the control operated as designed during a given period. Example: CloudTrail logs showing every DB connection originated from the bastion ARN, IAM Access Analyzer confirming no public access, and the bastion's MFA enforcement policy version history.

In practice, controls are classified by their nature. Preventive controls stop bad things from happening (an IAM or service control policy that denies s3:PutBucketPublicAccessBlock, so the public access block cannot be switched off). Detective controls notice when something went wrong (a CloudWatch alarm that fires when a security group rule is added with 0.0.0.0/0). Corrective controls repair the damage automatically (an AWS Config auto-remediation that removes a public bucket ACL shortly after detection). A mature compliance program layers all three.
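
The vocabulary above can be made concrete with a small data model. The following is a minimal sketch, not a prescribed schema: the Control class, the control IDs, and the missing_layers helper are all hypothetical names chosen for illustration. It records the objective, activity, type, and evidence for each control, then reports which of the three layers is absent.

```python
from dataclasses import dataclass, field
from enum import Enum


class ControlType(Enum):
    PREVENTIVE = "preventive"
    DETECTIVE = "detective"
    CORRECTIVE = "corrective"


@dataclass
class Control:
    control_id: str            # hypothetical identifier, e.g. "S3-PUB-01"
    objective: str             # the "what"
    activity: str              # the "how"
    control_type: ControlType
    evidence: list[str] = field(default_factory=list)  # artifact descriptions


def missing_layers(controls: list[Control]) -> set[ControlType]:
    """Return the control types that no control in the list provides."""
    present = {c.control_type for c in controls}
    return set(ControlType) - present


# Illustrative data only: two controls for the same objective.
s3_controls = [
    Control("S3-PUB-01", "No S3 bucket is publicly accessible",
            "Policy denies changes to the public access block",
            ControlType.PREVENTIVE, ["policy version history"]),
    Control("S3-PUB-02", "No S3 bucket is publicly accessible",
            "AWS Config auto-remediation removes public bucket ACLs",
            ControlType.CORRECTIVE, ["remediation execution log"]),
]

# The detective layer is missing for this objective.
print(missing_layers(s3_controls))
```

Running a check like this per objective is a cheap way to spot objectives that rely on a single layer.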

Control Taxonomy Matters for Auditors: When you present evidence, organize it by control type. Auditors verify that each control objective has at least one preventive or detective control backed by evidence spanning the full audit period (typically 12 months for SOC 2 Type II). A gap in evidence coverage — even a single week — can require re-testing or a qualified opinion.
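
Coverage gaps of this kind are easy to detect mechanically. The sketch below assumes each evidence artifact is tagged with the inclusive date range it covers; the coverage_gaps function and the sample dates are hypothetical, not taken from any audit tool.

```python
from datetime import date, timedelta


def coverage_gaps(periods, audit_start, audit_end):
    """Return (gap_start, gap_end) ranges inside the audit window that no
    evidence period covers. All bounds are inclusive dates."""
    gaps = []
    cursor = audit_start  # first day not yet known to be covered
    for start, end in sorted(periods):
        if cursor > audit_end or start > audit_end:
            break
        if end < cursor:
            continue  # this period is already fully covered
        if start > cursor:
            gaps.append((cursor, start - timedelta(days=1)))
        cursor = end + timedelta(days=1)
    if cursor <= audit_end:
        gaps.append((cursor, audit_end))
    return gaps


# Illustrative data: evidence exists for Q1 and from 8 April onward.
periods = [
    (date(2025, 1, 1), date(2025, 3, 31)),
    (date(2025, 4, 8), date(2025, 12, 31)),
]
for gap_start, gap_end in coverage_gaps(periods, date(2025, 1, 1), date(2025, 12, 31)):
    print(f"no evidence from {gap_start} to {gap_end}")
```

With this sample data the only gap reported is the first week of April, which is exactly the kind of hole an auditor would ask about.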

Control Objectives in Engineering Terms

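The gap between a compliance framework's abstract language and what an engineer actually configures is where most compliance debt accumulates. Let us walk through a concrete mapping for a common SOC 2 Common Criteria requirement: logical access is removed in a timely manner when it is no longer authorized. This lesson tracks it under the ID CC6.6; in the AICPA Trust Services Criteria, access removal falls under CC6.2 and CC6.3, so confirm the exact criterion mapping with your auditor.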

Objective: Deprovision access within 24 hours of termination.
Engineering control: SCIM provisioning between your HR system (Workday, BambooHR) and your IdP (Okta, Azure AD). Okta deactivates the account upon the HR system's offboarding webhook. Okta is federated to AWS, GCP, and all SaaS tools via SAML/OIDC — deactivating the Okta account cascades access removal everywhere.
Evidence artifacts: (1) Okta system log export showing the deactivation event and timestamp, (2) AWS CloudTrail showing the last console login before deactivation, (3) a monthly Okta inactive-user report showing no active accounts for terminated employees, (4) your SCIM configuration screenshot proving the integration is live.

This is the translation work. Every control in your framework needs this same decomposition — objective, engineering implementation, and a list of the evidence artifacts that prove it operated correctly during the audit period.
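
Once the objective is stated as a measurable threshold, testing it becomes a join between two event streams. The sketch below assumes you have already extracted termination times from the HR system and deactivation times from the IdP log into plain dictionaries; the function name, the user names, and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

SLA = timedelta(hours=24)  # the objective: deprovision within 24 hours


def deprovisioning_exceptions(terminations, deactivations, sla=SLA):
    """Compare HR termination times with IdP deactivation times.

    Both arguments map a user name to a timezone-aware datetime.
    Returns a list of (user, reason) for every user who breaks the SLA.
    """
    exceptions = []
    for user, terminated_at in sorted(terminations.items()):
        deactivated_at = deactivations.get(user)
        if deactivated_at is None:
            exceptions.append((user, "no deactivation event found"))
        elif deactivated_at - terminated_at > sla:
            delay = deactivated_at - terminated_at
            exceptions.append((user, f"deactivated after {delay}"))
    return exceptions


# Illustrative data only.
t0 = datetime(2025, 3, 3, 17, 0, tzinfo=timezone.utc)
terminations = {"alice": t0, "bob": t0, "carol": t0}
deactivations = {
    "alice": t0 + timedelta(hours=2),   # within the SLA
    "bob": t0 + timedelta(hours=30),    # too late
}                                       # carol was never deactivated

for user, reason in deprovisioning_exceptions(terminations, deactivations):
    print(user, reason)
```

An empty result for a given month is itself an evidence artifact: store it alongside the raw log export it was computed from.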

Automating Evidence Collection

Manual evidence collection is the hidden cost of compliance. Teams that collect evidence by hand (exporting CSVs, screenshotting dashboards, writing one-off scripts) spend the weeks before every audit in a fire drill. At scale, the alternative is a continuous evidence pipeline: infrastructure that generates compliance artifacts as a natural side effect of normal operations, stores them immutably, and feeds them into an audit-ready repository.

The components of such a pipeline are:

Event sources — CloudTrail, AWS Config, GCP Audit Logs, Kubernetes Audit Log, Okta System Log, GitHub Audit Log. Every privileged action produces a machine-readable event.
Central log sink — S3 + Athena, GCS + BigQuery, or a SIEM (Splunk, Elastic). Logs are written with object-lock (WORM — Write Once Read Many) so they cannot be altered retroactively.
Evidence queries — pre-written SQL or Athena queries that answer specific control questions on demand.
Scheduled evidence jobs — cron jobs (or Lambda functions triggered by EventBridge Scheduler) that run the queries and write the output as a dated artifact in an evidence bucket.
Evidence registry — a metadata file (often a YAML manifest in a Git repo) that maps each control to its evidence artifacts, owner, and last-verified timestamp.

The continuous evidence pipeline: event sources write to a tamper-proof (WORM) sink, a query engine produces dated evidence artifacts on schedule, and a control registry maps each artifact to its control objective.
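
The scheduled-job step can be sketched in a few lines. The example below writes to a local directory as a stand-in for the evidence bucket and records a SHA-256 digest for each artifact; write_evidence, verify_evidence, and the manifest fields are hypothetical names. A digest makes tampering detectable only if the manifest is stored separately from the artifact (for example in Git); it does not replace object lock on the bucket.

```python
import hashlib
import json
import tempfile
from datetime import date
from pathlib import Path


def write_evidence(root, control_id, description, rows, run_date):
    """Write query rows as a dated JSON artifact and return a manifest entry."""
    payload = json.dumps(rows, sort_keys=True, indent=2).encode("utf-8")
    path = Path(root) / control_id / f"{run_date:%Y-%m}-{description}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(payload)
    return {
        "control": control_id,
        "path": str(path),
        "generated": run_date.isoformat(),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


def verify_evidence(entry):
    """Recompute the digest of the stored artifact and compare to the manifest."""
    digest = hashlib.sha256(Path(entry["path"]).read_bytes()).hexdigest()
    return digest == entry["sha256"]


with tempfile.TemporaryDirectory() as root:
    # An empty result set is still evidence: the query ran and found nothing.
    entry = write_evidence(root, "cc6.1", "no-public-s3", [], date(2025, 4, 1))
    print(verify_evidence(entry))   # True

    # Simulate someone editing the artifact after the fact.
    Path(entry["path"]).write_text('[{"bucket": "edited"}]')
    print(verify_evidence(entry))   # False
```

In a real pipeline the rows would come from the evidence query and the write would target the WORM sink rather than a temporary directory.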

Practical: AWS Config + Athena Evidence Query

AWS Config continuously records the configuration state of every resource. Combined with Athena, you can query that history to answer specific control questions. The following setup ships evidence for a control like "no S3 bucket has public access enabled" covering any 90-day window — queryable on demand.

# 1. Enable AWS Config with an S3 delivery channel
aws configservice put-configuration-recorder \
  --configuration-recorder name=default,roleARN=arn:aws:iam::123456789012:role/AWSConfigRole \
  --recording-group allSupported=true,includeGlobalResourceTypes=true

# Quote the shorthand value so the shell leaves the braces alone
aws configservice put-delivery-channel \
  --delivery-channel 'name=default,s3BucketName=my-config-bucket,configSnapshotDeliveryProperties={deliveryFrequency=TwentyFour_Hours}'

aws configservice start-configuration-recorder --configuration-recorder-name default

-- 2. Query config history via Athena (run after setting up the Config table in Glue).
-- Table and column names depend on how that table is defined; adjust them to your schema.
-- Verify the JSON path against a real configuration item: Config may record the
-- public access block under supplementaryConfiguration rather than configuration.
-- The query finds any point in Q1 2025 when a bucket's public access block was
-- disabled or absent. Zero rows is the evidence of absence for the
-- "no public buckets" control objective.
SELECT
  accountId,
  awsRegion,
  resourceId AS bucket_name,
  configuration_item_capture_time,
  JSON_EXTRACT_SCALAR(configuration, '$.publicAccessBlockConfiguration.blockPublicAcls') AS block_acls,
  JSON_EXTRACT_SCALAR(configuration, '$.publicAccessBlockConfiguration.blockPublicPolicy') AS block_policy
FROM aws_config_configuration_items
WHERE resourceType = 'AWS::S3::Bucket'
  -- Half-open range so that all of 31 March is included
  AND configuration_item_capture_time >= TIMESTAMP '2025-01-01 00:00:00'
  AND configuration_item_capture_time <  TIMESTAMP '2025-04-01 00:00:00'
  AND (
    -- COALESCE treats a missing setting as 'false' so unconfigured buckets are reported
    COALESCE(JSON_EXTRACT_SCALAR(configuration, '$.publicAccessBlockConfiguration.blockPublicAcls'), 'false') = 'false'
    OR COALESCE(JSON_EXTRACT_SCALAR(configuration, '$.publicAccessBlockConfiguration.blockPublicPolicy'), 'false') = 'false'
  )
ORDER BY configuration_item_capture_time;

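If the query returns zero rows, you have machine-generated evidence that no recorded bucket configuration had these public access block settings disabled during that quarter. The query checks only blockPublicAcls and blockPublicPolicy; extend it to ignorePublicAcls and restrictPublicBuckets if your objective covers all four settings. Export the result as a CSV, store it in s3://evidence/soc2/cc6.1/2025-q1-no-public-s3.csv, and record that path in your control registry. That is one control evidenced with no manual effort at audit time.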

Store Evidence with a Consistent Naming Convention: Use a path scheme like s3://evidence/<framework>/<control-id>/<YYYY-MM>-<description>.<ext>. This makes it trivial to retrieve all evidence for a given control across all months, and auditors can self-serve without asking your team for files. Pair it with a Git-tracked controls.yaml manifest that links control ID to evidence S3 paths, owner, and last review date.
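
A naming convention only helps if every job applies it identically, so it is worth generating paths in code rather than typing them. The helper below is a sketch of the scheme described above; evidence_key is a hypothetical function, and the bucket name "evidence" is simply the one used in this lesson's examples.

```python
import re
from datetime import date


def evidence_key(framework, control_id, period, description, ext):
    """Build s3://evidence/<framework>/<control-id>/<YYYY-MM>-<description>.<ext>."""
    # Lowercase the description and collapse anything non-alphanumeric to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return (
        f"s3://evidence/{framework.lower()}/{control_id.lower()}/"
        f"{period:%Y-%m}-{slug}.{ext}"
    )


print(evidence_key("SOC2", "CC6.1", date(2025, 3, 1), "No public S3", "csv"))
# s3://evidence/soc2/cc6.1/2025-03-no-public-s3.csv
```

Because the control ID is a fixed path segment, listing everything under s3://evidence/soc2/cc6.1/ returns the full evidence history for that control.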

Kubernetes Audit Log Evidence

In a Kubernetes environment, the API server audit log is the equivalent of CloudTrail. Every kubectl exec, every secret read, and every privilege escalation attempt passes through the API server and can be recorded, provided the audit policy includes a rule for it. For compliance controls like "no privileged containers run in production" or "only authorized users can exec into pods," the audit log is your evidence source.

# Audit policy snippet — enable in kube-apiserver flags:
# --audit-policy-file=/etc/kubernetes/audit-policy.yaml
# --audit-log-path=/var/log/kubernetes/audit.log
# --audit-log-maxage=90
# --audit-log-maxbackup=10
# --audit-log-maxsize=500

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Log exec and port-forward at RequestResponse level. This records who ran what
  # against which pod (the command is in the request URI), not the session's I/O stream.
  - level: RequestResponse
    resources:
    - group: ""
      resources: ["pods/exec", "pods/portforward"]

  # Log secret reads — evidence for "secrets access is audited"
  - level: Metadata
    resources:
    - group: ""
      resources: ["secrets", "configmaps"]
    verbs: ["get", "list", "watch"]

  # Log all cluster-scoped resource mutations
  - level: Request
    verbs: ["create", "update", "patch", "delete"]
    resources:
    - group: ""
      resources: ["nodes", "namespaces"]
    - group: "rbac.authorization.k8s.io"
      resources: ["clusterroles", "clusterrolebindings"]

  # Drop noisy read-only calls to reduce volume
  - level: None
    verbs: ["get", "list", "watch"]
    resources:
    - group: ""
      resources: ["endpoints", "events"]
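
To turn the raw log into evidence for "only authorized users can exec into pods," you need to extract the exec events. The sketch below assumes the log backend writes one JSON audit event per line and relies on the standard event fields auditID, user.username, objectRef, and requestReceivedTimestamp. The exec_events function and the sample events are hypothetical. Because a request is logged once per stage, the code de-duplicates on auditID.

```python
import json


def exec_events(lines):
    """Yield (timestamp, username, namespace, pod) once per pods/exec request."""
    seen = set()  # auditIDs already reported; one request spans several stages
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        ref = event.get("objectRef") or {}
        if ref.get("resource") != "pods" or ref.get("subresource") != "exec":
            continue
        audit_id = event.get("auditID")
        if audit_id in seen:
            continue
        seen.add(audit_id)
        yield (
            event.get("requestReceivedTimestamp"),
            (event.get("user") or {}).get("username"),
            ref.get("namespace"),
            ref.get("name"),
        )


# Made-up sample events for illustration; real events carry many more fields.
exec_ref = {"resource": "pods", "subresource": "exec",
            "namespace": "prod", "name": "api-0"}
sample = [
    {"auditID": "a1", "stage": "RequestReceived",
     "requestReceivedTimestamp": "2025-03-02T10:15:00.000000Z",
     "user": {"username": "alice@example.com"}, "objectRef": exec_ref},
    {"auditID": "a1", "stage": "ResponseComplete",
     "requestReceivedTimestamp": "2025-03-02T10:15:00.000000Z",
     "user": {"username": "alice@example.com"}, "objectRef": exec_ref},
    {"auditID": "b2", "stage": "ResponseComplete",
     "requestReceivedTimestamp": "2025-03-02T10:16:00.000000Z",
     "user": {"username": "system:serviceaccount:prod:api"},
     "objectRef": {"resource": "secrets", "namespace": "prod", "name": "db"}},
]
lines = [json.dumps(e) for e in sample]

for row in exec_events(lines):
    print(row)
```

Comparing the resulting user names against an approved list gives you the exception report for the control.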

Audit Log Retention is a Control in Itself: Most frameworks require audit logs to be retained for a minimum period — PCI DSS requires 12 months (3 months immediately available), SOC 2 auditors typically want the full audit window available. If you set --audit-log-maxage=30 or allow your CloudTrail S3 bucket to expire logs after 30 days, you will fail the evidence retention control even if every other control is perfect. Set retention to at minimum your audit period plus 30 days buffer, ship logs to an object-locked bucket, and add a Config rule or a Lambda that alerts if the bucket lifecycle policy is changed.
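
The retention rule above reduces to simple arithmetic that a scheduled check can enforce. The sketch below assumes you have already gathered the configured retention, in days, for each log store; retention_findings and the store names are hypothetical, and None stands for "never expires".

```python
def retention_findings(audit_period_days, configured, buffer_days=30):
    """Return the log stores whose retention is shorter than required.

    configured maps a store name to its retention in days (None = no expiry).
    """
    required = audit_period_days + buffer_days
    return {
        name: days
        for name, days in configured.items()
        if days is not None and days < required
    }


# Illustrative values only: a 12-month audit period needs 365 + 30 = 395 days.
configured = {
    "cloudtrail-bucket-lifecycle": 30,
    "k8s-audit-archive": 400,
    "okta-export-bucket": None,
}
print(retention_findings(365, configured))
# {'cloudtrail-bucket-lifecycle': 30}
```

A non-empty result should raise an alert, since retention that is too short cannot be fixed retroactively.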

Building a Control Registry

A control registry is the authoritative mapping between compliance requirements and the engineering controls that satisfy them. At scale, it lives in a Git repository alongside your Infrastructure as Code — this makes control changes go through pull request review, gives you a history of when controls changed, and lets your CI pipeline validate that referenced evidence artifacts actually exist.

A minimal controls.yaml entry looks like this — one entry per control objective:

# controls.yaml — checked into the compliance Git repository
controls:
  - id: CC6.6
    framework: SOC2-2022
    title: "Logical access is removed within 24 hours of termination"
    type: detective
    owner: identity-team
    implementation: |
      SCIM sync between Workday and Okta. Okta deactivates account
      on offboarding webhook. Federated to AWS & GCP via SAML.
    evidence:
      - description: "Okta system log export — deactivation events"
        bucket: s3://evidence/soc2/cc6.6/
        schedule: monthly
        last_generated: "2025-05-01"
        query_file: queries/cc6_6_okta_deactivations.sql
      - description: "Monthly inactive user report"
        bucket: s3://evidence/soc2/cc6.6/
        schedule: monthly
        last_generated: "2025-05-01"
    last_reviewed: "2025-05-15"
    review_owner: platform-security-eng
    status: passing

  - id: CC6.1-S3
    framework: SOC2-2022
    title: "No S3 bucket exposes objects publicly"
    type: preventive
    owner: platform-team
    implementation: |
      AWS Config managed rule S3_BUCKET_LEVEL_PUBLIC_ACCESS_PROHIBITED enabled.
      SCP denies s3:PutBucketPublicAccessBlock outside the platform admin role.
    evidence:
      - description: "Athena query — no public buckets in quarter"
        bucket: s3://evidence/soc2/cc6.1/
        schedule: quarterly
        last_generated: "2025-04-01"
        query_file: queries/cc6_1_no_public_s3.sql
    last_reviewed: "2025-04-15"
    review_owner: platform-security-eng
    status: passing

This registry, the automated evidence generation scripts in queries/, and a CI job that validates that last_generated dates fall within the required window together form the operational foundation of a continuously compliant system. When the auditor arrives, you run a script, not a fire drill.
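
The CI freshness check can be sketched as follows. It assumes the registry has already been parsed into a dictionary (for example with a YAML parser) and that the allowed age per schedule is as given in MAX_AGE_DAYS; both the thresholds and the stale_evidence function are hypothetical choices, not part of any tool.

```python
from datetime import date

# Assumed tolerance per schedule, in days. Tune these to your audit needs.
MAX_AGE_DAYS = {"monthly": 31, "quarterly": 92}


def stale_evidence(registry, today):
    """Return (control_id, description, age_days) for every overdue artifact."""
    findings = []
    for control in registry["controls"]:
        for item in control.get("evidence", []):
            limit = MAX_AGE_DAYS[item["schedule"]]
            generated = date.fromisoformat(item["last_generated"])
            age = (today - generated).days
            if age > limit:
                findings.append((control["id"], item["description"], age))
    return findings


# A trimmed copy of the registry entries shown above.
registry = {
    "controls": [
        {"id": "CC6.6", "evidence": [
            {"description": "Okta system log export", "schedule": "monthly",
             "last_generated": "2025-05-01"},
            {"description": "Monthly inactive user report", "schedule": "monthly",
             "last_generated": "2025-05-01"},
        ]},
        {"id": "CC6.1-S3", "evidence": [
            {"description": "Athena query", "schedule": "quarterly",
             "last_generated": "2025-04-01"},
        ]},
    ]
}

for control_id, description, age in stale_evidence(registry, date(2025, 6, 10)):
    print(f"{control_id}: '{description}' is {age} days old")
```

In CI, a non-empty findings list would fail the build, which turns a missed evidence job into a visible pull request failure instead of an audit-week surprise.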

GRC Tooling vs. Git-Native Approach: Enterprise GRC (Governance, Risk, and Compliance) platforms such as Drata, Vanta, and Secureframe automate much of this evidence collection for common SaaS integrations (AWS, GitHub, Okta, Google Workspace). They are worth evaluating for SOC 2. However, for custom infrastructure controls — especially Kubernetes, custom Terraform, or non-standard tooling — you will always need the skills taught here: defining control objectives precisely, identifying evidence sources, and writing queries that produce irrefutable artifacts. GRC tools complement but do not replace engineering judgment.