Secrets Management & PKI

HashiCorp Vault Architecture

18 min Lesson 3 of 28

HashiCorp Vault is one of the most widely adopted secrets management platforms. Before you run a single vault kv get, you need to understand what Vault is doing internally. The architectural model (seal/unseal, auth methods, secret engines, and policies) directly determines your threat model, your blast radius if something goes wrong, and your operational runbook. This lesson opens up those internals.

The Storage Backend and the Encryption Barrier

Vault itself stores nothing in plaintext. Every piece of data — secrets, tokens, leases, auth configuration — is encrypted before it hits the storage backend. The storage backend (Raft integrated storage in modern deployments, or Consul/DynamoDB in older ones) is intentionally kept dumb: it only stores opaque encrypted blobs. This separation is the foundation of Vault's security model. Even if your Raft snapshot leaks, an attacker gets encrypted ciphertext, not secrets.

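The key that encrypts all stored data is called the encryption key. That encryption key is itself encrypted by a master key. The master key is split via Shamir's Secret Sharing into N key shares, or shards (default: 5), of which M are required to reconstruct it (default: 3). Current Vault documentation calls the master key the root key and describes a separate unseal key as the value that is actually split, but the layered model is the same. Initialisation generates the shards, which are then distributed to separate operators; unsealing is the later step in which enough of those shards are supplied back to Vault.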
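To make the M-of-N idea concrete, here is a toy 3-of-5 split and recombination in plain shell arithmetic. It is only a sketch of the mathematics: the field GF(257), the fixed polynomial coefficients, and all function names are assumptions chosen for readability. Real implementations, Vault's included, use cryptographically random coefficients and a different field.

```shell
# Toy Shamir split/combine over the prime field GF(257). Illustration only.
P=257

# modpow BASE EXP: prints BASE^EXP mod P (square-and-multiply)
modpow() (
  b=$(( $1 % P )); e=$2; r=1
  while [ "$e" -gt 0 ]; do
    if [ $(( e % 2 )) -eq 1 ]; then r=$(( r * b % P )); fi
    b=$(( b * b % P ))
    e=$(( e / 2 ))
  done
  echo "$r"
)

# share_at X SECRET A1 A2: evaluates f(X) = SECRET + A1*X + A2*X^2 mod P
share_at() (
  echo $(( ($2 + $3 * $1 + $4 * $1 * $1) % P ))
)

# lagrange_term XI YI XA XB: contribution of share (XI, YI) to f(0),
# given the x-coordinates XA and XB of the other two shares
lagrange_term() (
  num=$(( $3 * $4 % P ))
  den=$(( ($3 - $1 + P) % P * (($4 - $1 + P) % P) % P ))
  inv=$(modpow "$den" $(( P - 2 )))   # modular inverse via Fermat's little theorem
  echo $(( $2 * num % P * inv % P ))
)

# combine X1 Y1 X2 Y2 X3 Y3: recovers f(0), the secret, from any three shares
combine() (
  t1=$(lagrange_term "$1" "$2" "$3" "$5")
  t2=$(lagrange_term "$3" "$4" "$1" "$5")
  t3=$(lagrange_term "$5" "$6" "$1" "$3")
  echo $(( (t1 + t2 + t3) % P ))
)
```

For example, with the secret 123 and coefficients 45 and 67, any three of the five values share_at 1 through share_at 5 passed to combine yield 123 again. Two shares alone do not determine the degree-2 polynomial, which is why the threshold holds.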

Seal and Unseal

When Vault starts (or after a restart, crash, or snapshot restore), it is in a sealed state. In this state, Vault can talk to its storage backend, but it cannot decrypt any data because it does not know the master key. It rejects almost every API request; only a handful of endpoints, such as seal status and unseal, remain available. A sealed Vault is cryptographically locked.

Unsealing is the process of feeding enough key shards to Vault so it can reconstruct the master key, decrypt the encryption key, and begin serving requests. With the default Shamir setup, three of five operators each provide their shard — the master key is never assembled in full outside Vault's memory. Once unsealed, Vault holds the encryption key in memory only; it is never persisted to disk in plaintext.

Key idea — Auto-unseal: Manual Shamir unseal is unacceptable at production scale (imagine a node restart at 3 AM requiring three operators). The production answer is Auto Unseal: Vault delegates master key protection to an external KMS — AWS KMS, GCP Cloud KMS, or Azure Key Vault. On startup, Vault calls the KMS to decrypt the locally stored wrapped master key and unseals automatically. The KMS key policy becomes your true security boundary.

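# Initialise a new Vault cluster (run once; the output contains the unseal keys and the initial root token)
# The file written below is the most sensitive artefact of the cluster: move its contents to secure storage and delete it promptly
vault operator init \
  -key-shares=5 \
  -key-threshold=3 \
  -format=json | tee /tmp/vault-init.json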

# Manual unseal (repeat with 3 different key shards)
vault operator unseal <UNSEAL_KEY_1>
vault operator unseal <UNSEAL_KEY_2>
vault operator unseal <UNSEAL_KEY_3>

# Verify seal status
vault status

# Configure Auto Unseal with AWS KMS (goes in Vault server config HCL, not CLI)
# /etc/vault.d/vault.hcl
seal "awskms" {
  region     = "us-east-1"
  kms_key_id = "alias/vault-unseal-key"
  # IAM role on the EC2 instance or ECS task provides credentials — no static keys
}
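After a restart it is often useful for automation to wait until a node is unsealed before sending traffic to it. The sketch below relies on the exit codes of vault status (0 when unsealed, 2 when sealed, 1 on error). The function name and its retry parameters are hypothetical, and it assumes VAULT_ADDR is already set in the environment.

```shell
# wait_unsealed MAX_TRIES SLEEP_SECONDS
# Prints "unsealed", "sealed" (gave up waiting) or "error".
wait_unsealed() (
  tries=$1; pause=$2; i=0
  while [ "$i" -lt "$tries" ]; do
    rc=0
    vault status >/dev/null 2>&1 || rc=$?
    case "$rc" in
      0) echo "unsealed"; exit 0 ;;
      2) ;;                          # still sealed: keep polling
      *) echo "error"; exit 1 ;;     # unreachable or misconfigured
    esac
    i=$(( i + 1 ))
    sleep "$pause"
  done
  echo "sealed"
  exit 2
)
```

A deployment script might call wait_unsealed 30 2 and only register the node with the load balancer when the result is "unsealed".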

Production pitfall: the root token. vault operator init prints a root token. Root tokens bypass all policies; they are the equivalent of the UNIX root account. Store the root token in a hardware-backed secret store immediately after init, revoke it as soon as you have configured operational auth methods (vault token revoke <root-token>), and regenerate it only for emergency break-glass scenarios using vault operator generate-root, which requires a quorum of key holders. Running day-to-day operations with a root token is a serious audit finding.

Auth Methods

Auth methods answer the question: How does a caller prove its identity to Vault? Vault is entirely agnostic about where callers come from — it supports pluggable auth methods that map external identities to Vault policies. The output of any successful auth method is a short-lived Vault token with attached policies. Once a caller has that token, the auth method is no longer in the picture.

The most important auth methods in production engineering:

AppRole: designed for machines and CI pipelines. A Role ID (non-secret, embedded in config) plus a Secret ID (secret, injected at runtime by a trusted orchestrator) together prove identity. It is a common pattern for bootstrapping service-to-service secrets.
Kubernetes: the service account JWT from a Pod is presented to Vault. Vault calls the Kubernetes API to verify the JWT, checks that it matches the expected namespace and service account, and issues a token. No static credentials are required in the Pod.
AWS IAM: a signed GetCallerIdentity request proves the caller is a specific IAM role or user. Works for EC2, Lambda, ECS, and anywhere IAM credentials exist. No secrets in your AMI or container image.
OIDC / JWT: used to authenticate GitHub Actions, GitLab CI, and any OIDC-capable identity provider. The CI job's OIDC token is validated against the provider's JWKS endpoint.
Token: the built-in method. Every Vault token can create child tokens. Used for bootstrapping and by humans via the CLI. Not for production workloads.
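The AppRole flow described in the list above can be sketched as a few thin wrappers around the Vault CLI. The function names, the role name used in the usage note, and the TTL values are illustrative assumptions; the sketch also assumes the AppRole method is mounted at its default path, auth/approle.

```shell
# approle_setup ROLE POLICY: operator side, run once with an admin token
# (vault auth enable fails if the method is already enabled)
approle_setup() (
  vault auth enable approle || exit 1
  vault write "auth/approle/role/$1" \
    token_policies="$2" token_ttl=20m \
    secret_id_ttl=10m secret_id_num_uses=1
)

# approle_role_id ROLE: the non-secret half, safe to place in config
approle_role_id() (
  vault read -field=role_id "auth/approle/role/$1/role-id"
)

# approle_new_secret_id ROLE: the secret half, minted by the trusted orchestrator
approle_new_secret_id() (
  vault write -f -field=secret_id "auth/approle/role/$1/secret-id"
)

# approle_login ROLE_ID SECRET_ID: workload side, prints a Vault token
approle_login() (
  vault write -field=token auth/approle/login role_id="$1" secret_id="$2"
)
```

An orchestrator could run approle_new_secret_id payments-ci just before starting a job and hand the result to the job, which then calls approle_login. Passing the Secret ID as a command-line argument exposes it to the local process list, so a real pipeline may prefer response wrapping or Vault Agent.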

# Enable Kubernetes auth method
vault auth enable kubernetes

# Configure it to validate against the cluster's API server
vault write auth/kubernetes/config \
  kubernetes_host="https://kubernetes.default.svc:443" \
  token_reviewer_jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
  kubernetes_ca_cert="$(cat /var/run/secrets/kubernetes.io/serviceaccount/ca.crt)"

# Create a role: map a Kubernetes service account to a Vault policy
# (token_policies and token_ttl are the current names for the older policies and ttl parameters)
vault write auth/kubernetes/role/payments-api \
  bound_service_account_names=payments-api \
  bound_service_account_namespaces=production \
  token_policies=payments-read \
  token_ttl=1h

# From inside the Pod, authenticate and retrieve a token
vault write auth/kubernetes/login \
  role=payments-api \
  jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"

Secret Engines

Secret engines are Vault's pluggable backends for generating and managing secrets. This is the most powerful aspect of Vault: rather than being a static key-value store, Vault can dynamically generate secrets that expire automatically. The secret engine mounted at a given path handles all operations at that path.

Core secret engines you will use in production:

KV v2 (kv-v2): versioned key-value store. Every write creates a new version. Supports check-and-set to prevent write conflicts. Used for static secrets like API keys, OAuth client secrets, and third-party credentials that cannot be dynamically generated.
Database: connects to your database and dynamically generates short-lived credentials (username + password) for each caller. The credentials are automatically revoked when the lease expires. Works with PostgreSQL, MySQL, MongoDB, Redis, Elasticsearch, and more. Removes the need for shared, long-lived database passwords.
AWS: generates short-lived AWS IAM access keys, or assumes roles and returns STS credentials. For example, a Lambda function can receive unique credentials that expire after 15 minutes, rather than sharing a long-lived key embedded in environment variables.
PKI: a full certificate authority built into Vault. Issues X.509 certificates on demand. Each service gets its own short-lived TLS cert (minutes to hours), which sharply limits the value of a stolen certificate because it expires soon after issuance. This is the basis of Lesson 8.
Transit: encryption-as-a-service. Vault holds the key and never returns it; callers send plaintext and receive ciphertext (or vice versa). Used to encrypt database columns, message payloads, or PII without distributing the encryption key to every service.
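As an illustration of the Transit flow from the list above, the wrappers below send base64-encoded plaintext to Vault and get ciphertext back, and the reverse. The function names and the key name in the usage note are hypothetical. The sketch assumes the engine is mounted at transit/ (vault secrets enable transit) and that the key already exists (vault write -f transit/keys/<name>).

```shell
# transit_encrypt KEY_NAME PLAINTEXT: prints the ciphertext returned by Vault
transit_encrypt() (
  # The Transit API expects base64-encoded plaintext
  encoded=$(printf '%s' "$2" | base64 | tr -d '\n')
  vault write -field=ciphertext "transit/encrypt/$1" plaintext="$encoded"
)

# transit_decrypt KEY_NAME CIPHERTEXT: prints the original plaintext
transit_decrypt() (
  # Vault returns the plaintext base64-encoded, so decode it locally
  vault write -field=plaintext "transit/decrypt/$1" ciphertext="$2" | base64 -d
)
```

A service would store only the output of transit_encrypt payments "4111 1111 1111 1111" in its database column; the key itself never leaves Vault.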

# Mount and configure the Database secret engine for PostgreSQL
vault secrets enable -path=database database

vault write database/config/payments-db \
  plugin_name=postgresql-database-plugin \
  allowed_roles="payments-readonly,payments-readwrite" \
  connection_url="postgresql://{{username}}:{{password}}@postgres.internal:5432/payments" \
  username="vault-root" \
  password="<initial-root-pw>"

# Define a role: what SQL runs when Vault creates a credential
vault write database/roles/payments-readonly \
  db_name=payments-db \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; \
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  default_ttl="1h" \
  max_ttl="24h"

# A service fetches a dynamic credential
vault read database/creds/payments-readonly
# Returns: username=v-payments-1a2b3c4d  password=A1b2C3d4...  lease_duration=1h

Policies

Vault policies are the authorisation layer: they define what an authenticated identity can do. Policies are written in HCL (or JSON) and express path-based access control. Every capability (create, read, update, patch, delete, list, sudo) must be explicitly granted; Vault denies by default, and the special deny capability blocks a path outright.

Policies are attached to tokens at auth time. A token can have multiple policies; the effective permission set is the union of the capabilities they grant, except that an explicit deny on a path overrides any grant for that same path. The root policy is attached only to root tokens and bypasses all checks.

# payments-read.hcl — least-privilege policy for the payments-api service
path "secret/data/payments/*" {
  capabilities = ["read"]
}

path "database/creds/payments-readonly" {
  capabilities = ["read"]
}

# Explicit deny: this more specific path takes precedence over the wildcard grant above, so the admin subtree stays off-limits
path "secret/data/payments/admin/*" {
  capabilities = ["deny"]
}

# Write the policy to Vault
vault policy write payments-read payments-read.hcl

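# Inspect the policies attached to a token
vault token lookup <TOKEN>

# Check which capabilities a token has on a specific path (stripe-key is an example secret name)
vault token capabilities <TOKEN> secret/data/payments/stripe-key

# Normalise the file's formatting in place (checks syntax and layout, not what the policy permits)
vault policy fmt payments-read.hcl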
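To see why the deny rule in payments-read.hcl wins for the admin subtree, here is a deliberately simplified model of path matching. It handles only exact paths and a trailing *, and picks the longest matching pattern. It is not Vault's implementation, and the variable and function names are assumptions made for this sketch.

```shell
# Rules mirror payments-read.hcl, one per line: "<path pattern> <capability>"
RULES='secret/data/payments/* read
database/creds/payments-readonly read
secret/data/payments/admin/* deny'

# resolve_caps REQUEST_PATH: prints the capability of the most specific match
resolve_caps() (
  req=$1; best_len=-1; best_caps="default-deny"
  while read -r pat caps; do
    case "$pat" in
      *\*)
        prefix=${pat%\*}
        case "$req" in
          "$prefix"*) ;;          # prefix matches: consider this rule
          *) continue ;;
        esac
        ;;
      *)
        [ "$req" = "$pat" ] || continue
        echo "$caps"; exit 0      # an exact match always wins
        ;;
    esac
    if [ "${#pat}" -gt "$best_len" ]; then
      best_len=${#pat}; best_caps=$caps
    fi
  done <<EOF
$RULES
EOF
  echo "$best_caps"
)
```

In this model, resolve_caps secret/data/payments/stripe prints read, a path under secret/data/payments/admin/ prints deny, and any path with no matching rule prints default-deny. Against a live server, vault token capabilities gives the authoritative answer.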

Pro practice: policy-as-code. Store all policy HCL files in Git and apply them via CI, exactly as you would Terraform. This gives you change history, peer review, and the ability to diff policy changes before they go live. A common setup gates every Vault policy change through a pull request plus an automated vault policy fmt check. Avoid editing policies directly in the Vault UI in production.
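Because vault policy fmt rewrites the file it is given, a CI check has to work on a copy and compare the result. The function below is one possible sketch of such a gate; its name and output strings are hypothetical.

```shell
# policy_fmt_check FILE
# Prints "ok", "needs-fmt" or "invalid"; the exit status is non-zero unless "ok".
policy_fmt_check() (
  src=$1
  tmp=$(mktemp) || exit 1
  cp "$src" "$tmp"
  # Format the copy so the committed file is never modified by the check
  if ! vault policy fmt "$tmp" >/dev/null 2>&1; then
    rm -f "$tmp"
    echo "invalid"
    exit 1
  fi
  if cmp -s "$src" "$tmp"; then
    result="ok"; rc=0
  else
    result="needs-fmt"; rc=1
  fi
  rm -f "$tmp"
  echo "$result"
  exit "$rc"
)
```

A pipeline step could loop over every .hcl file in the policy directory and fail the build on the first non-ok result. This catches syntax and formatting problems only; it does not evaluate what a policy grants.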

Vault architecture: a caller authenticates via an auth method, receives a token with attached policies, the policy engine authorises the request, and the secret engine returns the secret with a lease. All data at rest is encrypted; the KMS enables automatic unseal.

Lease, Renewal, and Revocation

Every secret returned by a dynamic secret engine comes with a lease: a TTL after which Vault automatically revokes the credential at the source (e.g., drops the database user). This is the core operational advantage over static secrets — even if a credential leaks, it expires in minutes or hours rather than living forever until someone manually rotates it.

Applications that need secrets beyond the initial TTL must renew the lease before it expires. Vault Agent (a sidecar process) handles renewal transparently. When a service is decommissioned, an operator can immediately revoke all leases issued under a specific auth token or role, instantly invalidating every database credential and API key that service held — impossible with static secrets.
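Renewal and bulk revocation both map onto the vault lease subcommands. The wrappers below are a minimal sketch: the function names and the default increment are assumptions, and the prefix layout assumes credentials are issued under <mount>/creds/<role>, as in the database example earlier in this lesson.

```shell
# renew_lease LEASE_ID [INCREMENT_SECONDS]: asks Vault to extend a single lease
renew_lease() (
  vault lease renew -increment="${2:-3600}" "$1"
)

# revoke_role_leases MOUNT ROLE: revokes every credential issued for a role
revoke_role_leases() (
  vault lease revoke -prefix "$1/creds/$2"
)
```

When the payments service is decommissioned, revoke_role_leases database payments-readonly would invalidate every outstanding credential issued for that role in one call. The requested increment is only a request: Vault caps renewals at the role's max_ttl.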

Pro practice — Vault Agent Injector in Kubernetes: Never write Vault SDK calls in your application code to fetch secrets at startup. Instead, use the Vault Agent Injector (a Kubernetes mutating webhook). It sidecars a Vault Agent into your Pod that authenticates via Kubernetes auth, fetches secrets, writes them to a shared tmpfs volume as files, and keeps them renewed. Your application reads a file — no Vault dependency in application code, no secrets in environment variables, no secrets in image layers.
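The injector is driven by Pod annotations. As a sketch, the helper below prints a JSON patch that adds the three basic annotations to a Deployment's Pod template. The function name is hypothetical, and the role and secret path in the usage note reuse the payments examples from this lesson.

```shell
# injector_patch ROLE FILE_NAME SECRET_PATH
# Prints a JSON patch for a Deployment's Pod template annotations.
injector_patch() (
  printf '{"spec":{"template":{"metadata":{"annotations":{'
  printf '"vault.hashicorp.com/agent-inject":"true",'
  printf '"vault.hashicorp.com/role":"%s",' "$1"
  printf '"vault.hashicorp.com/agent-inject-secret-%s":"%s"' "$2" "$3"
  printf '}}}}}\n'
)
```

It could be applied with kubectl patch deployment payments-api -n production --patch "$(injector_patch payments-api db-creds database/creds/payments-readonly)". With these annotations the rendered secret appears inside the Pod as the file /vault/secrets/db-creds.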

High Availability and Raft

Production Vault runs as a cluster. The recommended storage backend since Vault 1.4 is integrated Raft storage: a built-in Raft consensus protocol that replicates data across cluster nodes. A typical production deployment is 3 or 5 nodes behind an internal load balancer. Only the active node handles writes; standby nodes forward requests to the active node or, with performance standbys (a Vault Enterprise feature), serve reads locally.
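The preference for 3 or 5 nodes follows from Raft's majority rule: a cluster of n voting nodes needs floor(n/2) + 1 of them to agree, so it survives floor((n-1)/2) failures. The two helper functions below, with names made up for this sketch, just do that arithmetic.

```shell
# raft_quorum NODES: votes needed to commit a write or elect a leader
raft_quorum() (
  echo $(( $1 / 2 + 1 ))
)

# raft_fault_tolerance NODES: nodes that can fail while the cluster stays available
raft_fault_tolerance() (
  echo $(( ($1 - 1) / 2 ))
)
```

Three nodes tolerate one failure and five tolerate two. Four nodes still tolerate only one, which is why even cluster sizes add cost without adding resilience.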

Raft snapshots are the disaster recovery mechanism. Automate a daily vault operator raft snapshot save to encrypted S3 or GCS. When a cluster is lost, bootstrap a new cluster, restore the snapshot with vault operator raft snapshot restore, and unseal. With Auto Unseal via KMS, recovery can be a matter of minutes, provided the KMS key is still accessible.
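A daily backup job can be as small as the function below. It is a sketch under assumptions: the function name and file naming scheme are made up, and the caller is expected to have VAULT_ADDR and a token allowed to take snapshots.

```shell
# snapshot_backup DIR: saves a timestamped Raft snapshot into DIR and prints its path
snapshot_backup() (
  dir=$1
  file="$dir/vault-raft-$(date -u +%Y%m%dT%H%M%SZ).snap"
  umask 077    # the snapshot must not be world-readable
  vault operator raft snapshot save "$file" >/dev/null || exit 1
  # Refuse to report success for an empty or missing file
  [ -s "$file" ] || { echo "snapshot is empty" >&2; exit 1; }
  echo "$file"
)
```

A cron job could then upload the result, for instance with aws s3 cp "$(snapshot_backup /var/backups/vault)" s3://<bucket>/ --sse aws:kms, where the bucket and local directory are placeholders. Restoring a snapshot periodically into a scratch cluster is the only reliable way to know the backups work.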

Vault Enterprise vs. OSS: The freely downloadable Vault (historically open source, and distributed under the Business Source License since 2023) covers nearly everything in this lesson. Vault Enterprise adds features that large enterprises pay for: namespaces (multi-tenancy), replication (multi-region active-active), HSM seal integration, Sentinel policies (fine-grained OPA-style rules), and audit device forwarding. Many companies at startup-to-mid-stage scale run the free edition on three Raft nodes with Auto Unseal, which covers most production needs.