Secrets Management & PKI
HashiCorp Vault — Architecture
Vault is a secrets management and data protection platform. It provides a unified interface for secret storage, dynamic credentials, encryption-as-a-service, and PKI — all with fine-grained access control and full audit logging.
Core Components
Storage Backends
Vault persists encrypted data to a backend. Only Vault can decrypt it — the backend sees ciphertext only.
- Integrated Raft (recommended) — built-in consensus storage, no external dependency, supports HA natively
- Consul — HashiCorp Consul for storage, requires running Consul cluster
- DynamoDB — AWS DynamoDB for cloud-native storage (supports HA coordination natively)
- GCS / S3 — object storage backends; GCS supports HA coordination, S3 does not (single active node only)
Auth Methods
How clients prove their identity to Vault. Vault validates credentials via the auth method and issues a token with associated policies. Methods: Kubernetes, AWS IAM, GCP IAM, AppRole, LDAP, OIDC, GitHub, TLS certificates, Userpass.
Secret Engines
Plugins that store, generate, or encrypt data. Mounted at paths within Vault. Types: KV (v1/v2), database (dynamic credentials), PKI, AWS, GCP, Azure, SSH, Transit (encryption-as-a-service), TOTP.
Audit Devices
Every Vault operation — including failed auth attempts — is written to all enabled audit devices before the operation completes. If no audit device can be written to, Vault blocks the operation. Types: file, syslog, socket.
High Availability with Integrated Raft
Raft integrated storage provides HA without an external dependency. One Vault node is the active leader; the remaining nodes are standbys. Open-source standbys forward all client requests to the active node; with Vault Enterprise, they can instead run as performance standbys that serve read requests locally, reducing leader load.
- Leader election: Raft consensus with majority write quorum (⌊N/2⌋ + 1 nodes); see the sizing example after this list
- Disaster Recovery replication (Vault Enterprise): replicates all data, including tokens and leases, to a warm-standby secondary cluster in a separate region; the secondary serves no requests until promoted
- Performance replication (Vault Enterprise): replicates data to secondary clusters that serve read traffic locally, reducing latency for geographically distributed clients
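For example, a 3-node cluster has a quorum of 2 and tolerates one failed node; a 5-node cluster has a quorum of 3 and tolerates two. An even node count adds cost without adding fault tolerance, so 3 or 5 voting nodes are the standard deployments.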
Vault on Kubernetes — Helm Install
Production HA Values
# vault-values.yaml — production HA with Raft and AWS KMS auto-unseal
global:
  enabled: true
  tlsDisable: false
injector:
  enabled: true
  replicas: 2
  resources:
    requests:
      memory: 256Mi
      cpu: 250m
    limits:
      memory: 256Mi
      cpu: 250m
server:
  enabled: true
  image:
    repository: hashicorp/vault
    tag: "1.17.0"
  resources:
    requests:
      memory: 256Mi
      cpu: 250m
    limits:
      memory: 512Mi
      cpu: 500m
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: vault
              app.kubernetes.io/instance: vault
              component: server
          topologyKey: kubernetes.io/hostname
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-ha-tls/vault.ca
    VAULT_TLSCERT: /vault/userconfig/vault-ha-tls/vault.crt
    VAULT_TLSKEY: /vault/userconfig/vault-ha-tls/vault.key
  volumes:
    - name: userconfig-vault-ha-tls
      secret:
        secretName: vault-ha-tls
  volumeMounts:
    - mountPath: /vault/userconfig/vault-ha-tls
      name: userconfig-vault-ha-tls
      readOnly: true
  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      setNodeId: true
      config: |
        ui = true
        cluster_name = "vault-production"
        listener "tcp" {
          tls_disable = 0
          address = "[::]:8200"
          cluster_address = "[::]:8201"
          tls_cert_file = "/vault/userconfig/vault-ha-tls/vault.crt"
          tls_key_file = "/vault/userconfig/vault-ha-tls/vault.key"
          tls_client_ca_file = "/vault/userconfig/vault-ha-tls/vault.ca"
        }
        storage "raft" {
          path = "/vault/data"
          retry_join {
            leader_api_addr = "https://vault-0.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-ha-tls/vault.ca"
          }
          retry_join {
            leader_api_addr = "https://vault-1.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-ha-tls/vault.ca"
          }
          retry_join {
            leader_api_addr = "https://vault-2.vault-internal:8200"
            leader_ca_cert_file = "/vault/userconfig/vault-ha-tls/vault.ca"
          }
        }
        # AWS KMS auto-unseal
        seal "awskms" {
          region = "us-east-1"
          kms_key_id = "alias/vault-unseal-key"
        }
        service_registration "kubernetes" {}
ui:
  enabled: true
  serviceType: ClusterIP
# Install Vault via Helm
helm repo add hashicorp https://helm.releases.hashicorp.com
helm repo update
helm install vault hashicorp/vault \
--namespace vault \
--create-namespace \
-f vault-values.yaml
# Initialize Vault (first time only; outputs the root token and recovery keys — store vault-init.json securely)
# With the awskms seal configured above, init takes recovery shares, not unseal key shares
kubectl exec -n vault vault-0 -- vault operator init \
-recovery-shares=5 \
-recovery-threshold=3 \
-format=json > vault-init.json
# With auto-unseal via AWS KMS, manual unseal is never needed after pod restarts;
# recovery keys are used only for privileged operations (e.g., generating a new root token)
# Join Raft peers (retry_join in the config above does this automatically once pods start; shown for reference)
kubectl exec -n vault vault-1 -- vault operator raft join https://vault-0.vault-internal:8200
kubectl exec -n vault vault-2 -- vault operator raft join https://vault-0.vault-internal:8200
# Verify cluster status
kubectl exec -n vault vault-0 -- vault operator raft list-peers
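A quick health check after the peers have joined; vault status reports the seal state and HA mode:
# Verify seal state and active/standby role (Sealed should be false with KMS auto-unseal)
kubectl exec -n vault vault-0 -- vault status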
Auto-Unseal — GCP Cloud KMS
# GCP Cloud KMS auto-unseal config in Vault HCL
seal "gcpckms" {
credentials = "/vault/userconfig/gcp-creds/credentials.json"
project = "my-project"
region = "global"
key_ring = "vault-keyring"
crypto_key = "vault-unseal-key"
}
Auth Methods — Deep Dive
Kubernetes Auth
The most common auth method for workloads running in Kubernetes. Vault validates the pod's service account JWT against the Kubernetes API.
# Enable Kubernetes auth
vault auth enable kubernetes
# Configure Kubernetes auth
vault write auth/kubernetes/config \
kubernetes_host="https://kubernetes.default.svc" \
kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
token_reviewer_jwt=@/var/run/secrets/kubernetes.io/serviceaccount/token \
issuer="https://kubernetes.default.svc.cluster.local"
# Create a role binding a service account to a Vault policy
vault write auth/kubernetes/role/payment-service \
bound_service_account_names="payment-service" \
bound_service_account_namespaces="production" \
policies="payment-service-policy" \
ttl="1h"
# The application authenticates:
JWT=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
vault write auth/kubernetes/login \
role="payment-service" \
jwt="${JWT}"
AppRole Auth
AppRole is designed for machine-to-machine auth where Kubernetes auth is not available (CI systems, VMs, external services). It uses two credentials: a role_id (non-secret, like a username) and a secret_id (secret, like a password):
# Enable AppRole
vault auth enable approle
# Create role
vault write auth/approle/role/ci-pipeline \
secret_id_ttl="10m" \
token_num_uses="1" \
token_ttl="20m" \
token_max_ttl="30m" \
secret_id_num_uses="1" \
policies="ci-pipeline-policy"
# Fetch role_id (can be stored in CI config — it's not secret)
vault read auth/approle/role/ci-pipeline/role-id
# Generate a response-wrapped secret_id (the wrapper token expires in 120s)
# Even the delivery mechanism never sees the actual secret_id
vault write -wrap-ttl=120s -f auth/approle/role/ci-pipeline/secret-id
# CI system unwraps to get the actual secret_id
vault unwrap <wrapping-token>
# Login
vault write auth/approle/login \
role_id="<role-id>" \
secret_id="<secret-id>"
AWS IAM Auth
# Enable AWS auth
vault auth enable aws
# Configure AWS auth (Vault calls AWS STS to verify caller identity)
vault write auth/aws/config/client \
access_key="<aws-access-key>" \
secret_key="<aws-secret-key>" \
region="us-east-1"
# Bind an IAM role ARN to a Vault policy
vault write auth/aws/role/production-ec2 \
auth_type="iam" \
bound_iam_principal_arn="arn:aws:iam::123456789:role/production-app-role" \
policies="production-app-policy" \
ttl="1h"
Secret Engines
KV v2 — Versioned Key-Value
# Enable KV v2 at path "secret/"
vault secrets enable -path=secret kv-v2
# Write a secret
vault kv put secret/production/payment-service \
db_password="super-secret-password" \
api_key="stripe-live-key-abc123"
# Read a secret
vault kv get secret/production/payment-service
# Read only specific fields
vault kv get -field=db_password secret/production/payment-service
# Read a specific version
vault kv get -version=3 secret/production/payment-service
# List all secrets at a path
vault kv list secret/production/
# Soft-delete version (data recoverable)
vault kv delete secret/production/payment-service
# Destroy versions permanently
vault kv destroy -versions=1,2 secret/production/payment-service
# Check-and-set (CAS) — only write if current version matches
vault kv put -cas=4 secret/production/payment-service db_password="new-password"
# Read metadata (versions, deletion times, custom metadata)
vault kv metadata get secret/production/payment-service
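Note that the CLI hides the /data/ segment KV v2 inserts between the mount and the key in the HTTP API; the same segment appears in the policy paths later in this section. A sketch over the raw API, assuming VAULT_ADDR and VAULT_TOKEN are set:
# KV v2 over the HTTP API — note the /data/ path segment the CLI adds for you
curl -s -H "X-Vault-Token: ${VAULT_TOKEN}" \
"${VAULT_ADDR}/v1/secret/data/production/payment-service" | jq '.data.data'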
Dynamic Secrets — PostgreSQL
Dynamic credentials are generated on-demand and expire automatically, eliminating long-lived database passwords:
# Enable database secret engine
vault secrets enable database
# Configure PostgreSQL connection
vault write database/config/production-postgres \
plugin_name="postgresql-database-plugin" \
allowed_roles="payment-service,order-service" \
connection_url="postgresql://{{username}}:{{password}}@postgres.production:5432/appdb?sslmode=require" \
username="vault-admin" \
password="vault-admin-password" \
password_authentication="scram-sha-256"
# Create a role (Vault will execute this SQL when generating credentials)
vault write database/roles/payment-service \
db_name="production-postgres" \
creation_statements="
CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA payments TO \"{{name}}\";
GRANT USAGE ON ALL SEQUENCES IN SCHEMA payments TO \"{{name}}\";
" \
default_ttl="1h" \
max_ttl="24h"
# Request credentials
vault read database/creds/payment-service
# Returns: username=v-kubernetes-payment-abc123, password=A1b2C3d4..., lease_duration=1h
# Revoke credentials early (e.g., on incident)
vault lease revoke database/creds/payment-service/<lease-id>
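Once the connection is configured, the admin credential Vault itself uses can be rotated so that no human knows it anymore; this is the database engine's standard rotate-root endpoint:
# Rotate the vault-admin password in place; afterwards only Vault knows it
vault write -force database/rotate-root/production-postgres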
Dynamic Secrets — AWS
# Enable AWS secret engine
vault secrets enable aws
# Configure with credentials that can create/delete IAM users or assume roles
vault write aws/config/root \
access_key="AKIAIOSFODNN7EXAMPLE" \
secret_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" \
region="us-east-1"
# Role using IAM policy inline (creates a temp IAM user)
vault write aws/roles/s3-backup-writer \
credential_type="iam_user" \
policy_document='{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": ["s3:PutObject","s3:GetObject"],
"Resource": "arn:aws:s3:::my-backup-bucket/*"
}]
}'
# Role using assumed_role (preferred — no IAM user created)
vault write aws/roles/ec2-deployer \
credential_type="assumed_role" \
role_arns="arn:aws:iam::123456789:role/DeployerRole" \
default_ttl="15m" \
max_ttl="1h"
# Request temporary AWS credentials
vault read aws/creds/s3-backup-writer
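Reading an assumed_role role returns short-lived STS credentials rather than a new IAM user:
# Request STS credentials (response includes access_key, secret_key, and a security_token)
vault read aws/creds/ec2-deployer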
Vault PKI Engine — Internal CA
Full CA Setup Flow
# Step 1: Enable PKI engine for root CA (long TTL — offline, rarely rotated)
vault secrets enable -path=pki pki
vault secrets tune -max-lease-ttl=87600h pki # 10 years
# Step 2: Generate root CA (keep the root key inside Vault — never export it)
vault write -field=certificate pki/root/generate/internal \
common_name="Acme Corp Root CA" \
organization="Acme Corp" \
country="US" \
ttl="87600h" \
key_type="rsa" \
key_bits=4096 \
> root-ca.crt
# Step 3: Configure CA URLs (used in issued certificates)
vault write pki/config/urls \
issuing_certificates="https://vault.internal:8200/v1/pki/ca" \
crl_distribution_points="https://vault.internal:8200/v1/pki/crl" \
ocsp_servers="https://vault.internal:8200/v1/pki/ocsp"
# Step 4: Enable intermediate CA engine
vault secrets enable -path=pki_int pki
vault secrets tune -max-lease-ttl=43800h pki_int # 5 years
# Step 5: Generate intermediate CSR
vault write -format=json pki_int/intermediate/generate/internal \
common_name="Acme Corp Intermediate CA" \
organization="Acme Corp" \
key_type="rsa" \
key_bits=4096 \
| jq -r '.data.csr' > intermediate.csr
# Step 6: Sign intermediate with root CA
vault write -format=json pki/root/sign-intermediate \
csr=@intermediate.csr \
common_name="Acme Corp Intermediate CA" \
ttl="43800h" \
| jq -r '.data.certificate' > intermediate.crt
# Step 7: Import signed intermediate into pki_int
vault write pki_int/intermediate/set-signed \
certificate=@intermediate.crt
# Step 8: Configure intermediate CA URLs
vault write pki_int/config/urls \
issuing_certificates="https://vault.internal:8200/v1/pki_int/ca" \
crl_distribution_points="https://vault.internal:8200/v1/pki_int/crl" \
ocsp_servers="https://vault.internal:8200/v1/pki_int/ocsp"
# Step 9: Create a role for issuing leaf certificates (max_ttl 720h = 30 days)
vault write pki_int/roles/internal-services \
allowed_domains="svc.cluster.local,internal,example.com" \
allow_subdomains=true \
allow_glob_domains=false \
max_ttl="720h" \
key_type="rsa" \
key_bits=2048 \
require_cn=false \
allow_any_name=false
# Step 10: Issue a certificate
vault write pki_int/issue/internal-services \
common_name="payment-service.production.svc.cluster.local" \
alt_names="payment-service.production,payment-service" \
ttl="24h"
Vault Policies
Policies are written in HCL and attached to tokens via auth method roles. They follow a deny-by-default model.
# payment-service-policy.hcl — production read-only for its own secrets
path "secret/data/production/payment-service" {
  capabilities = ["read"]
}
path "secret/data/production/payment-service/*" {
  capabilities = ["read"]
}
# Allow reading dynamic database credentials
path "database/creds/payment-service" {
  capabilities = ["read"]
}
# Allow issuing certificates for this service
path "pki_int/issue/internal-services" {
  capabilities = ["create", "update"]
  allowed_parameters = {
    "common_name" = ["payment-service.production.svc.cluster.local"]
    "ttl"         = []
  }
}
# Allow renewing own token
path "auth/token/renew-self" {
  capabilities = ["update"]
}
# Allow looking up own token
path "auth/token/lookup-self" {
  capabilities = ["read"]
}
# Write the policy to Vault (name must match the role's policies list above)
vault policy write payment-service-policy payment-service-policy.hcl
# platform-admin-policy.hcl — read/write for platform team
path "secret/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}
path "sys/mounts/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}
path "auth/*" {
  capabilities = ["create", "read", "update", "delete", "list", "sudo"]
}
path "pki*" {
  capabilities = ["create", "read", "update", "delete", "list", "sudo"]
}
path "sys/policies/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}
path "sys/audit*" {
  capabilities = ["create", "read", "update", "delete", "list", "sudo"]
}
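Policies can be smoke-tested before wiring up an application by minting a short-lived token and asking Vault what it permits:
# Mint a test token with the policy, then check its capabilities on a path
TEST_TOKEN=$(vault token create -policy=payment-service-policy -ttl=5m -field=token)
vault token capabilities "${TEST_TOKEN}" secret/data/production/payment-service # expect: read
vault token capabilities "${TEST_TOKEN}" secret/data/production/other-service # expect: deny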
Vault Agent & Kubernetes Injector
Vault Agent Injector — Pod Annotations
The Vault Agent Injector mutating webhook intercepts pod creation and injects a Vault Agent init container and sidecar. Applications read secrets as files — no SDK changes needed.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service
  namespace: production
spec:
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
      annotations:
        # Enable Vault Agent injection
        vault.hashicorp.com/agent-inject: "true"
        # Vault role for Kubernetes auth
        vault.hashicorp.com/role: "payment-service"
        # Inject KV secret at /vault/secrets/config.env
        vault.hashicorp.com/agent-inject-secret-config.env: "secret/data/production/payment-service"
        vault.hashicorp.com/agent-inject-template-config.env: |
          {{- with secret "secret/data/production/payment-service" -}}
          export DB_PASSWORD="{{ .Data.data.db_password }}"
          export API_KEY="{{ .Data.data.api_key }}"
          {{- end }}
        # Inject database credentials
        vault.hashicorp.com/agent-inject-secret-db-creds: "database/creds/payment-service"
        vault.hashicorp.com/agent-inject-template-db-creds: |
          {{- with secret "database/creds/payment-service" -}}
          export DB_USER="{{ .Data.username }}"
          export DB_PASS="{{ .Data.password }}"
          {{- end }}
        # Run the agent as the same user as the app container
        vault.hashicorp.com/agent-run-as-same-user: "true"
        # Keep the agent running as a sidecar (not init-only) so leased secrets stay refreshed
        vault.hashicorp.com/agent-pre-populate-only: "false"
        # Resource limits for the Vault Agent sidecar
        vault.hashicorp.com/agent-requests-cpu: "50m"
        vault.hashicorp.com/agent-requests-mem: "64Mi"
        vault.hashicorp.com/agent-limits-cpu: "100m"
        vault.hashicorp.com/agent-limits-mem: "128Mi"
    spec:
      serviceAccountName: payment-service
      containers:
        - name: payment-service
          image: acme/payment-service:v1.2.0
          command: ["/bin/sh", "-c"]
          args:
            - |
              . /vault/secrets/config.env
              . /vault/secrets/db-creds
              exec /app/payment-service
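After rollout, the webhook's effect is visible on the pod itself; two hedged checks (the app label matches the Deployment selector above):
# Expect vault-agent listed alongside the app container
kubectl get pod -n production -l app=payment-service \
-o jsonpath='{.items[0].spec.containers[*].name}'
# Confirm the rendered secret files exist inside the app container
kubectl exec -n production deploy/payment-service -c payment-service -- ls /vault/secrets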
Vault Audit Logging
# Enable file audit device (JSON lines; set file_path=stdout to ship entries via the container log stream)
vault audit enable file file_path=/vault/logs/audit.log
# Enable syslog audit
vault audit enable syslog tag="vault" facility="AUTH"
# Verify audit devices
vault audit list -detailed
# Sample audit log entry (every operation is logged, including auth failures; sensitive values are HMAC-SHA256 hashed)
# {
# "time": "2025-03-28T10:00:00Z",
# "type": "request",
# "auth": {"token_type": "service", "policies": ["payment-service-policy"]},
# "request": {"operation": "read", "path": "secret/data/production/payment-service"},
# "response": {"data": {"metadata": {"version": 4}}}
# }
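Since each entry is a JSON line, failed or denied requests can be pulled out with jq (a sketch against the file device above):
# Surface requests that errored (permission-denied responses show up here)
jq -c 'select(.error != null and .error != "")' /vault/logs/audit.log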
cert-manager — Kubernetes Certificate Management
cert-manager automates the issuance and renewal of TLS certificates in Kubernetes. It supports Let's Encrypt, Vault PKI, self-signed, and any ACME-compatible CA.
Installation
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.14.0 \
--set installCRDs=true \
--set global.leaderElection.namespace=cert-manager \
--set prometheus.enabled=true \
--set prometheus.servicemonitor.enabled=true
# Verify
kubectl get pods -n cert-manager
kubectl get crds | grep cert-manager
ClusterIssuer — Let's Encrypt HTTP01
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            class: nginx  # or "haproxy", "traefik", etc.
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging  # use for testing — avoids rate limits
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - http01:
          ingress:
            class: nginx
ClusterIssuer — Let's Encrypt DNS01 (Route 53)
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns01
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-dns01-account-key
    solvers:
      - dns01:
          route53:
            region: us-east-1
            hostedZoneID: Z1234567890ABC
            # If using IAM role for service account (IRSA):
            role: arn:aws:iam::123456789:role/cert-manager-route53
Certificate Resource
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-example-com
  namespace: production
spec:
  secretName: api-example-com-tls  # K8s Secret created with tls.crt/tls.key/ca.crt
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  commonName: api.example.com
  dnsNames:
    - api.example.com
    - api-v2.example.com
  duration: 2160h    # 90 days (Let's Encrypt maximum)
  renewBefore: 720h  # renew 30 days before expiry
  privateKey:
    algorithm: RSA
    size: 2048
    rotationPolicy: Always  # generate new key on each renewal
Ingress Annotation for Automatic TLS
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: production
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  tls:
    - hosts:
        - api.example.com
      secretName: api-example-com-tls  # cert-manager creates this secret
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service
                port:
                  number: 80
Vault Issuer — Internal PKI via cert-manager
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: vault-internal-ca
spec:
  vault:
    server: https://vault.vault.svc.cluster.local:8200
    path: pki_int/sign/internal-services
    caBundle: <base64-encoded-vault-ca-cert>
    auth:
      kubernetes:
        role: cert-manager
        mountPath: /v1/auth/kubernetes
        serviceAccountRef:
          name: cert-manager
---
# Certificate issued by Vault internal CA
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: payment-service-mtls
  namespace: production
spec:
  secretName: payment-service-mtls-tls
  issuerRef:
    name: vault-internal-ca
    kind: ClusterIssuer
  commonName: payment-service.production.svc.cluster.local
  dnsNames:
    - payment-service
    - payment-service.production
    - payment-service.production.svc
    - payment-service.production.svc.cluster.local
  duration: 24h
  renewBefore: 8h
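For this issuer to work, Vault itself needs a Kubernetes auth role and a policy permitting the sign endpoint; a minimal sketch, with the policy name cert-manager-pki assumed:
# cert-manager-pki.hcl (assumed name) — permit signing via the role used above:
# path "pki_int/sign/internal-services" { capabilities = ["create", "update"] }
vault policy write cert-manager-pki cert-manager-pki.hcl
vault write auth/kubernetes/role/cert-manager \
bound_service_account_names="cert-manager" \
bound_service_account_namespaces="cert-manager" \
policies="cert-manager-pki" \
ttl="20m"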
openssl — Common Operations
# Generate a 4096-bit RSA private key
openssl genrsa -out server.key 4096
# Generate a Certificate Signing Request (CSR)
openssl req -new -key server.key -out server.csr \
-subj "/CN=api.example.com/O=Acme Corp/C=US" \
-addext "subjectAltName=DNS:api.example.com,DNS:api-v2.example.com"
# Self-sign a certificate (for testing only)
openssl x509 -req -in server.csr -signkey server.key \
-out server.crt -days 365 -sha256
# Sign a CSR with a CA
openssl x509 -req -in server.csr \
-CA ca.crt -CAkey ca.key -CAcreateserial \
-out server.crt -days 365 -sha256 \
-extfile san.ext # san.ext: subjectAltName=DNS:api.example.com
# Inspect a certificate (human-readable)
openssl x509 -in server.crt -noout -text
# Check certificate expiry
openssl x509 -in server.crt -noout -dates
openssl x509 -enddate -noout -in server.crt
# Check expiry of a live server certificate
echo | openssl s_client -servername api.example.com \
-connect api.example.com:443 2>/dev/null \
| openssl x509 -noout -dates
# Verify certificate chain
openssl verify -CAfile ca-chain.crt server.crt
# Convert PEM to PKCS12 (for Java keystores, browser import)
openssl pkcs12 -export -out server.p12 \
-inkey server.key -in server.crt -certfile ca-chain.crt
# Extract certificate from PKCS12
openssl pkcs12 -in server.p12 -nokeys -out server.crt
# Check if a key and certificate match (fingerprints must match)
openssl x509 -noout -modulus -in server.crt | openssl md5
openssl rsa -noout -modulus -in server.key | openssl md5
# Generate ECDSA key (P-256 — faster and smaller than RSA-2048)
openssl ecparam -name prime256v1 -genkey -noout -out ec.key
openssl req -new -key ec.key -out ec.csr -subj "/CN=api.example.com"
TLS Best Practices
Protocol Versions
Require TLS 1.2 minimum; prefer TLS 1.3. Disable SSL 3.0, TLS 1.0, TLS 1.1. NGINX example:
ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers off; # honor the client's cipher preference; safe when every listed cipher is strong (TLS 1.3 suites are unaffected by ssl_ciphers)
Cipher Suites
Prefer AEAD ciphers with forward secrecy. Avoid RC4, 3DES, NULL, EXPORT, ANON. Recommended NGINX config:
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305;
HSTS
HTTP Strict Transport Security prevents protocol downgrade attacks. Submit to the HSTS preload list for maximum protection:
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload" always;
OCSP Stapling
Staple the OCSP response to the TLS handshake so clients don't need to query the CA's OCSP server (privacy + performance):
ssl_stapling on;
ssl_stapling_verify on;
ssl_trusted_certificate /etc/nginx/ca-chain.crt;
resolver 8.8.8.8 8.8.4.4 valid=300s;
resolver_timeout 5s;
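Stapling can be verified from any client with openssl s_client's -status flag:
# Look for "OCSP Response Status: successful" in the handshake output
echo | openssl s_client -connect api.example.com:443 \
-servername api.example.com -status 2>/dev/null | grep -i -A 3 "OCSP response"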
Certificate Monitoring
Expired certificates cause outages. Monitor expiry proactively with at least 30 days warning.
Prometheus — ssl_exporter
# Deploy ssl_exporter to scrape certificate expiry from live endpoints
# prometheus-ssl-exporter/values.yaml
config:
  modules:
    https:
      prober: https
      timeout: 10s
      tls_config:
        insecure_skip_verify: false
# Add targets to scrape config
scrape_configs:
  - job_name: 'ssl-expiry'
    metrics_path: /probe
    params:
      module: [https]
    static_configs:
      - targets:
          - api.example.com:443
          - admin.example.com:443
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: ssl-exporter:9219
Prometheus Alerting Rules
groups:
  - name: certificate-expiry
    rules:
      # Alert 30 days before expiry
      - alert: CertificateExpiringSoon
        expr: ssl_cert_not_after - time() < 86400 * 30
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Certificate expiring in less than 30 days"
          description: "Certificate for {{ $labels.instance }} expires in {{ $value | humanizeDuration }}"
      # Alert 7 days before expiry — urgent
      - alert: CertificateExpiringCritical
        expr: ssl_cert_not_after - time() < 86400 * 7
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "Certificate expiring in less than 7 days — action required"
          description: "Certificate for {{ $labels.instance }} expires in {{ $value | humanizeDuration }}"
      # Alert if cert-manager Certificate resource is not Ready
      - alert: CertManagerCertificateNotReady
        expr: certmanager_certificate_ready_status{condition="False"} == 1
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "cert-manager Certificate not ready"
          description: "Certificate {{ $labels.name }} in namespace {{ $labels.namespace }} is not ready"
      # Alert on cert-manager renewal failure
      - alert: CertManagerCertificateRenewalFailure
        expr: |
          increase(certmanager_certificate_renewal_timestamp_seconds[1h]) == 0
          and certmanager_certificate_expiration_timestamp_seconds - time() < 86400 * 14
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "cert-manager certificate has not been renewed"
          description: "Certificate {{ $labels.name }} is due for renewal but has not been renewed in the last hour"
Useful Certificate Monitoring Commands
# List all cert-manager Certificates and their ready status
kubectl get certificates -A
# Describe a specific certificate (shows renewal events)
kubectl describe certificate api-example-com -n production
# List cert-manager CertificateRequests (one per issuance attempt)
kubectl get certificaterequests -A
# Watch cert-manager logs for ACME challenge activity
kubectl logs -n cert-manager -l app.kubernetes.io/name=cert-manager -f
# Check Vault PKI CRL and OCSP
vault read pki_int/crl/rotate # force CRL rotation
vault read pki_int/config/crl