DevSecOps & Supply Chain Security

Secrets Scanning & Leak Response

A hardcoded AWS access key, a Stripe secret, a database password committed to a public repository: leaked credentials like these are among the most common, most preventable, and most damaging causes of security incidents in modern software delivery. The 2016 Uber breach was traced back to credentials stored in a private GitHub repository, and bots scrape public GitHub repositories for credentials within seconds of a push. The question is not whether to scan for secrets; it is how many layers of defense you run.

This lesson covers the full lifecycle: blocking secrets before they ever enter version control, catching anything that slips through in CI, alerting on leaks in existing history, and executing the incident response runbook when a key does escape into the wild.

Layer 1: Pre-Commit Hooks with detect-secrets and Gitleaks

The cheapest time to catch a secret is before the commit is created. Gitleaks and detect-secrets are the two dominant tools, and most mature teams run both — Gitleaks for its speed and broad regex coverage, detect-secrets for its entropy-based detection and baseline audit workflow.
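
To make "entropy-based detection" concrete, here is a minimal sketch of the idea in Python. It is not detect-secrets' actual implementation: the threshold, the token pattern, and the function names are assumptions chosen for illustration. The sample key is the placeholder secret used in AWS documentation.

```python
import math
import re
from collections import Counter

# Hypothetical threshold for illustration; real tools tune this per character set.
ENTROPY_THRESHOLD = 4.0
# Candidate tokens: runs of 20+ characters from a base64-like alphabet.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_tokens(line: str, threshold: float = ENTROPY_THRESHOLD) -> list[str]:
    """Return the tokens in line whose entropy exceeds the threshold."""
    return [t for t in TOKEN_RE.findall(line) if shannon_entropy(t) > threshold]


# A repeated character carries no information; a random-looking key scores high.
print(high_entropy_tokens('password = "aaaaaaaaaaaaaaaaaaaaaaaa"'))
print(high_entropy_tokens('aws_secret = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"'))
```

Entropy catches secrets that follow no known format, which is why it complements regex rules, but it also flags hashes and UUID-like identifiers. That noise is what the baseline audit workflow is there to absorb.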

The canonical setup uses pre-commit (the Python framework) to orchestrate hooks from a single .pre-commit-config.yaml in the repository root:

# .pre-commit-config.yaml  (commit this to every repository)
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4
    hooks:
      - id: gitleaks
        # Fails the commit if any secret pattern matches
        # Config sourced from .gitleaks.toml in repo root

  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.5.0
    hooks:
      - id: detect-secrets
        args: ['--baseline', '.secrets.baseline']
        # First run: detect-secrets scan > .secrets.baseline
        # Commit the baseline; only NET-NEW secrets fail future commits

Install and activate with pip install pre-commit && pre-commit install. On macOS developer machines you can also distribute via Homebrew. For repositories where you cannot guarantee every contributor runs pre-commit install, the hook is a speed bump, not a guarantee — which is why CI scanning is mandatory.

Gitleaks reads a .gitleaks.toml config for allowlists and custom rules. The most important tuning is a project-specific allowlist so that test fixtures with fake credentials (e.g., EXAMPLE_KEY=AKIAIOSFODNN7EXAMPLE) do not generate noise:

# .gitleaks.toml
[extend]
  # Keep the built-in rule set; without this, a custom config replaces it
  useDefault = true

[allowlist]
  description = "Intentional test fixtures and documentation examples"
  regexes = [
    '''EXAMPLE''',
    '''AKIAIOSFODNN7EXAMPLE''',
    '''my-placeholder-secret''',
  ]
  paths = [
    '''tests/fixtures/.*''',
    '''docs/.*''',
  ]

[[rules]]
  id = "internal-api-token"
  description = "Internal deployment token pattern"
  regex = '''deploy_token_[a-z0-9]{32}'''
  # severity is not a Gitleaks rule field, so criticality is carried as a tag
  tags = ["secret", "internal", "critical"]
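
Before committing a custom rule, it helps to check the pattern against strings that should and should not match. The sketch below does this with Python's re module. It only approximates Gitleaks, which uses Go's RE2 engine and its own matching pipeline; for simple patterns like these the two engines agree. The function names are hypothetical.

```python
import re

# Patterns copied from the .gitleaks.toml above.
RULE = re.compile(r"deploy_token_[a-z0-9]{32}")
ALLOWLIST = [
    re.compile(p)
    for p in ("EXAMPLE", "AKIAIOSFODNN7EXAMPLE", "my-placeholder-secret")
]


def rule_matches(line: str) -> list[str]:
    """Return every substring of line that the custom rule would flag."""
    return RULE.findall(line)


def is_allowlisted(secret: str) -> bool:
    """Return True if any allowlist pattern matches the candidate secret."""
    return any(pattern.search(secret) for pattern in ALLOWLIST)


token = "deploy_token_" + "a1" * 16          # 32 lowercase alphanumerics
print(rule_matches(f"TOKEN={token}"))          # flagged
print(rule_matches("TOKEN=deploy_token_abc"))  # too short, not flagged
print(is_allowlisted("AKIAIOSFODNN7EXAMPLE"))  # suppressed as a fixture
```

Note how broad the first allowlist entry is: any detected secret containing the substring EXAMPLE is suppressed, so the second entry is redundant. Narrow allowlist patterns are safer than broad ones.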

Baseline vs blocking: On a greenfield repository, run Gitleaks in blocking mode from day one. On a repository with existing history, generate a detect-secrets baseline first (detect-secrets scan > .secrets.baseline) so that pre-existing issues in the baseline do not block all commits — only net-new secrets trigger failures. Schedule a separate audit job to work down the baseline backlog.
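
The "only net-new secrets fail" behaviour comes down to a set difference between the committed baseline and a fresh scan. The following sketch assumes the baseline layout that detect-secrets writes, a top-level "results" object mapping filenames to lists of entries that each carry a "hashed_secret". The function name is hypothetical and this is not the tool's own code.

```python
def net_new(baseline: dict, current: dict) -> list[tuple[str, str]]:
    """Return (filename, hashed_secret) pairs in current that baseline lacks."""

    def pairs(scan: dict) -> set[tuple[str, str]]:
        return {
            (filename, entry["hashed_secret"])
            for filename, entries in scan.get("results", {}).items()
            for entry in entries
        }

    return sorted(pairs(current) - pairs(baseline))


baseline = {
    "results": {
        "config.py": [{"type": "Secret Keyword", "hashed_secret": "aaa", "line_number": 3}]
    }
}
current = {
    "results": {
        "config.py": [{"type": "Secret Keyword", "hashed_secret": "aaa", "line_number": 3}],
        "app.py": [{"type": "AWS Access Key", "hashed_secret": "bbb", "line_number": 10}],
    }
}
print(net_new(baseline, current))  # only the app.py finding is new
```

Because the comparison uses the hash and not the line number, moving an already-audited secret within a file does not re-trigger the hook, while the same secret copied into a new file does.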

Layer 2: CI Pipeline Scanning

CI is the mandatory enforcement gate. Even if every developer dutifully runs pre-commit, CI scanning catches commits made with --no-verify, web editor commits, and bot commits that bypass local hooks. Run Gitleaks on every pull request and on every push to protected branches. On the Gitleaks CLI, --source only sets the path to scan; restricting the scan to the commits introduced by a PR is done with --log-opts, which is passed through to git log. Scanning only that range keeps the job fast even on repositories with long histories.

# .github/workflows/secret-scan.yml
name: Secret Scan

on:
  push:
    branches: ["main", "master", "release/**"]
  pull_request:

jobs:
  gitleaks:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write      # lets the action comment on the PR
      security-events: write    # required by upload-sarif
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0          # full history so the commit range can be resolved

      - name: Run Gitleaks (commits in this push or PR)
        uses: gitleaks/gitleaks-action@v2
        # v2 is configured through env vars rather than an args input, and
        # derives the commit range to scan from the triggering event
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITLEAKS_LICENSE: ${{ secrets.GITLEAKS_LICENSE }}  # for org repos

      - name: Upload SARIF to GitHub Security
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          # Assumes the action wrote its report to results.sarif in the workspace
          sarif_file: results.sarif
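
If you also want the findings in the job log, or need to gate a pipeline outside GitHub, the SARIF report can be summarized with a few lines of Python. This sketch relies only on the standard SARIF 2.1.0 layout (runs, results, ruleId, physicalLocation). The function names are hypothetical, and the report path is whatever your scan step wrote.

```python
import json


def summarize_sarif(sarif: dict) -> list[str]:
    """Return one 'path:line [rule]' string per finding in a SARIF document."""
    lines = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule = result.get("ruleId", "unknown-rule")
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri", "?")
                start = physical.get("region", {}).get("startLine", 0)
                lines.append(f"{uri}:{start} [{rule}]")
    return lines


def check_report(path: str) -> int:
    """Print findings from the SARIF file at path; return 1 if any exist."""
    with open(path, encoding="utf-8") as handle:
        findings = summarize_sarif(json.load(handle))
    for line in findings:
        print(line)
    return 1 if findings else 0
```

A CI step could call check_report on the report file and pass the return value to sys.exit, so that the job fails whenever the report contains at least one finding.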

GitHub's own secret scanning feature (free on public repositories, and available on private repositories through GitHub Advanced Security) runs continuously in the background. Its push protection option blocks a push that contains a recognized secret before it reaches the repository, on any branch. Enable it under Settings → Code security → Secret scanning → Push protection. It covers 200+ token patterns (AWS, GCP, Stripe, Twilio, Slack, etc.), and false positives are rare because the patterns are high-confidence formats defined together with the issuing providers. For some token types GitHub can additionally check with the provider whether the token is still active.

Layer 3: Scanning Existing History

When you onboard an existing repository, you must scan its entire commit history — not just the current HEAD. Secrets deleted in a later commit are still in the git object store, visible to anyone who clones the repository. Use gitleaks detect --source . --log-opts="--all" or truffleHog for deep history scans. This is a one-time audit job; the results feed a remediation backlog tracked in your security system (Jira, GitHub Issues, or a dedicated secret management dashboard).
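
A full-history scan usually reports the same secret once per commit that touched it, so the raw report is a poor backlog. The sketch below collapses a Gitleaks JSON report (a list of findings with RuleID, File, Secret, and Commit fields) into one item per distinct secret. The function name and output shape are assumptions for illustration.

```python
from collections import defaultdict


def build_backlog(findings: list[dict]) -> list[dict]:
    """Collapse per-commit findings into one item per (rule, file, secret)."""
    grouped = defaultdict(list)
    for finding in findings:
        key = (finding["RuleID"], finding["File"], finding["Secret"])
        grouped[key].append(finding["Commit"])
    # The secret itself is deliberately left out of the backlog item so that
    # the tracker does not become another place where the credential lives.
    return [
        {"rule": rule, "file": path, "commits": sorted(set(commits))}
        for (rule, path, _secret), commits in sorted(grouped.items())
    ]


report = [
    {"RuleID": "aws-access-token", "File": "config/settings.py",
     "Secret": "AKIAIOSFODNN7EXAMPLE", "Commit": "c2"},
    {"RuleID": "aws-access-token", "File": "config/settings.py",
     "Secret": "AKIAIOSFODNN7EXAMPLE", "Commit": "c1"},
    {"RuleID": "generic-api-key", "File": "deploy.sh",
     "Secret": "x", "Commit": "c3"},
]
for item in build_backlog(report):
    print(item)
```

If the scan ran with redaction enabled, the Secret field no longer distinguishes different secrets in the same file, so grouping would need another key in that case.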

Defense-in-depth: three scanning layers plus an incident response runbook ensure a leaked secret is caught and contained at the earliest possible stage.

Incident Response: The Five-Step Runbook

When a secret leaks — detected by a scanner, reported by a bug bounty researcher, or discovered in a breach notification — speed is everything. Bots scan GitHub in real time and have been measured consuming newly pushed AWS keys within four seconds of exposure. Every minute of delay expands the blast radius.

1. Revoke immediately. Do not wait for confirmation of exploitation. Go to the credential provider's console (AWS IAM, GCP IAM, Stripe Dashboard, GitHub Settings) and deactivate or delete the key. If the key controls infrastructure, accept the brief outage; it is less damaging than a breach. For AWS keys: aws iam update-access-key --access-key-id AKIA... --status Inactive deactivates the key, and aws iam delete-access-key --access-key-id AKIA... removes it. For GitHub tokens: revoke in Settings → Developer settings → Personal access tokens.
2. Rotate and issue a replacement. Create a new credential immediately and inject it into your secrets manager (HashiCorp Vault, AWS Secrets Manager, or GCP Secret Manager). Update all systems consuming the old key. Never reuse the compromised key material.
3. Rewrite git history (or accept the exposure). Removing a secret from the current commit does NOT remove it from git history: anyone who cloned or cached the repo already has it. Use BFG Repo-Cleaner or git filter-repo to rewrite history, then force-push and notify all collaborators to re-clone. This is disruptive on active repositories; weigh it against the risk profile of the leaked credential.
4. Audit access logs. Pull CloudTrail logs (AWS), Cloud Audit Logs (GCP, formerly Stackdriver), or equivalent. Look for API calls made with the compromised key from unexpected IPs, regions, or at unusual times. This determines whether exploitation occurred and what resources were accessed.
5. Root cause and systemic fix. Write a blameless post-mortem: how did the secret get committed? Was a pre-commit hook missing? Was a developer bypassing hooks with --no-verify? Was a CI secret accidentally echoed to logs? Address the systemic gap, not just this instance.
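
The log audit in the runbook can be partly automated. As a rough sketch, the function below filters exported CloudTrail records for calls made with the compromised key from outside networks you trust. The record fields (userIdentity.accessKeyId, sourceIPAddress, eventName, eventTime, awsRegion) follow the CloudTrail record format. The function name, the trusted CIDR list, and the sample data are assumptions for illustration.

```python
import ipaddress


def suspicious_events(records, access_key_id, trusted_networks):
    """Return calls made with access_key_id from outside trusted_networks."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_networks]
    flagged = []
    for record in records:
        if record.get("userIdentity", {}).get("accessKeyId") != access_key_id:
            continue
        source = record.get("sourceIPAddress", "")
        try:
            address = ipaddress.ip_address(source)
            trusted = any(address in network for network in networks)
        except ValueError:
            # Calls made on your behalf by AWS services carry a hostname here
            # rather than an IP; surface them for manual review.
            trusted = False
        if not trusted:
            flagged.append(
                (record.get("eventTime"), record.get("eventName"),
                 source, record.get("awsRegion"))
            )
    return flagged


records = [
    {"eventTime": "2024-01-01T03:10:00Z", "eventName": "GetObject",
     "sourceIPAddress": "10.0.0.5", "awsRegion": "us-east-1",
     "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"}},
    {"eventTime": "2024-01-01T03:12:00Z", "eventName": "ListBuckets",
     "sourceIPAddress": "203.0.113.7", "awsRegion": "ap-southeast-1",
     "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"}},
    {"eventTime": "2024-01-01T03:13:00Z", "eventName": "ListBuckets",
     "sourceIPAddress": "198.51.100.9", "awsRegion": "us-east-1",
     "userIdentity": {"accessKeyId": "AKIAI44QH8DHBEXAMPLE"}},
]
print(suspicious_events(records, "AKIAIOSFODNN7EXAMPLE", ["10.0.0.0/8"]))
```

An empty result does not prove the key was never abused. It only means no call in the exported window came from an untrusted address, so check the log retention period against the time the secret was first pushed.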

Never use git commit --amend or soft-reset to "remove" a secret. The original commit object remains in the git object store and is reachable by anyone with a clone or a GitHub cache until a complete history rewrite. Assume the secret is fully public the moment it was pushed.

Preventing Secrets at the Source: Vault Integration

Scanning catches secrets that slip through. The long-term fix is making it impossible to hardcode secrets because they never exist as plaintext in the developer's environment. This means every application reads credentials from a secrets manager at runtime — not from .env files, not from environment variables baked into container images, and certainly not from source code.

In a Kubernetes environment, the canonical pattern is External Secrets Operator (ESO), which syncs secrets from Vault or AWS Secrets Manager into Kubernetes Secret objects that the pod consumes as environment variables or mounted files. The secret material never passes through the CI pipeline or the container image build.

# ExternalSecret manifest — syncs from AWS Secrets Manager into K8s Secret
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: myapp-db-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secretsmanager
    kind: ClusterSecretStore
  target:
    name: myapp-db-credentials   # K8s Secret name created/updated by ESO
    creationPolicy: Owner
  data:
    - secretKey: DB_PASSWORD      # key in the K8s Secret
      remoteRef:
        key: production/myapp/db  # path in AWS Secrets Manager
        property: password        # JSON key within the secret
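
On the application side, consuming the synced secret might look like the sketch below. It assumes the Kubernetes Secret is mounted as a volume at /etc/secrets (a hypothetical path set in your pod spec) and falls back to an environment variable of the same name. The function name is illustrative.

```python
import os
from pathlib import Path


def read_secret(name: str, mount_dir: str = "/etc/secrets") -> str:
    """Return a secret from a mounted file, falling back to an env var."""
    path = Path(mount_dir) / name
    if path.is_file():
        # Strip only a trailing newline; other whitespace may be significant.
        return path.read_text(encoding="utf-8").rstrip("\n")
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"secret {name!r} not found in {mount_dir} or environment")
    return value
```

Calling read_secret("DB_PASSWORD") at connection time, instead of once at import time, lets the application pick up a rotated value from the mounted file after ESO refreshes the Secret. Environment variables, by contrast, are fixed for the lifetime of the process.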

Shift-left secret detection in IDEs: Install a Gitleaks-based VS Code extension or the GitGuardian IDE plugin. Developers see secret detections inline as they type, before any commit is attempted. Combined with pre-commit hooks and CI scanning, this creates a three-layer pre-production defense: editor, commit, and pipeline.

Secrets scanning is not a one-time audit — it is a continuously running control. The organizations that handle secrets well treat every credential as ephemeral: short-lived, automatically rotated, and never human-readable in production. The scanning layer exists not because the vault pattern failed, but because humans inevitably take shortcuts during incidents, debugging sessions, and local development. Your job is to make those shortcuts visible and correctable before they become breaches.