Build an mTLS Service Mesh
Stand up internal mTLS between services using a ScaiVault-issued CA hierarchy. By the end you'll have a root CA, an intermediate, a PKI role that constrains issuance, and certificates being issued to running services with auto-renewal.
This tutorial assumes you control the services and can ship them new certificates without downtime — typical for Kubernetes, Nomad, or any container orchestrator.
What you need
- ScaiVault token with pki:admin and pki:issue.
- One or more services to give certificates to. We'll use billing and reporting running on Kubernetes as the examples.
- 30 minutes.
What we're building
graph TB
Root[acme-root CA<br/>10-year, offline after setup]
Int[acme-mtls-intermediate CA<br/>5-year, online issuer]
Role[Role: svc-mtls<br/>allowed_domains: *.svc.cluster.local, *.acme.internal<br/>max_ttl: 720h]
Billing[billing.svc.cluster.local<br/>7-day cert, auto-renewed]
Reporting[reporting.svc.cluster.local<br/>7-day cert, auto-renewed]
Client[test-client.svc.cluster.local<br/>client cert for verification]
Root --> Int
Int --> Role
Role --> Billing
Role --> Reporting
Role --> Client
1. Create the CA hierarchy
The root CA is long-lived (10 years), high-trust, and should not sign leaf certs directly. The intermediate is the working issuer.
# Root: generate, export, then in practice take it offline
curl -X POST https://scaivault.scailabs.ai/v1/pki/ca \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "acme-root",
    "common_name": "Acme Root CA",
    "ca_type": "root",
    "key_type": "ec",
    "key_size": 256,
    "validity_days": 3650
  }'
# -> {"id": "ca_root_abc", "valid_until": "2036-...", ...}

# Save the root cert PEM and distribute it as your trust anchor
curl -H "Authorization: Bearer $TOKEN" \
  https://scaivault.scailabs.ai/v1/pki/ca/ca_root_abc/certificate \
  | jq -r '.certificate_pem' > acme-root.pem

# Intermediate: this one signs leaf certs
curl -X POST https://scaivault.scailabs.ai/v1/pki/ca \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "acme-mtls-intermediate",
    "common_name": "Acme mTLS Intermediate",
    "ca_type": "intermediate",
    "parent_ca_id": "ca_root_abc",
    "key_type": "ec",
    "key_size": 256,
    "validity_days": 1825
  }'
# -> {"id": "ca_intermediate_mtls", ...}
For production: revoke automation's access to the root after this step. Sign replacement intermediates by hand when the existing one is due for rotation (every 3-4 years). The intermediate is the only CA your automation should be able to touch.
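One way to see why the intermediate is replaced well before it expires is to check that a leaf issued today still fits inside its issuer's lifetime. The sketch below uses the lifetimes from this tutorial (3650 days, 1825 days, 720h); `check_nesting` is a hypothetical helper for illustration, not a ScaiVault API, and it assumes the root and intermediate were created at the same moment.

```python
from datetime import datetime, timedelta, timezone

# Lifetimes taken from the requests in this tutorial
ROOT_VALIDITY = timedelta(days=3650)
INTERMEDIATE_VALIDITY = timedelta(days=1825)
MAX_LEAF_TTL = timedelta(hours=720)


def check_nesting(created_at, intermediate_age):
    """Return (leaf_expiry, fits) for a max-TTL leaf issued when the
    intermediate is `intermediate_age` old.

    Assumption: root and intermediate were both created at `created_at`.
    """
    now = created_at + intermediate_age
    root_expiry = created_at + ROOT_VALIDITY
    intermediate_expiry = created_at + INTERMEDIATE_VALIDITY
    leaf_expiry = now + MAX_LEAF_TTL
    fits = leaf_expiry <= intermediate_expiry <= root_expiry
    return leaf_expiry, fits


created = datetime(2026, 1, 1, tzinfo=timezone.utc)
# Three years in, a 720h leaf still fits inside the intermediate
print(check_nesting(created, timedelta(days=3 * 365))[1])
# In the intermediate's final month, a 720h leaf would outlive it
print(check_nesting(created, timedelta(days=1800))[1])
```

Rotating the intermediate at 3-4 years keeps a wide margin between the longest leaf the role allows and the issuer's own expiry.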
2. Define a PKI role
The role constrains what the intermediate will sign:
curl -X POST https://scaivault.scailabs.ai/v1/pki/roles \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "svc-mtls",
    "ca_id": "ca_intermediate_mtls",
    "allowed_domains": ["*.svc.cluster.local", "*.acme.internal"],
    "allow_subdomains": true,
    "allow_ip_sans": true,
    "max_ttl": "720h",
    "key_type": "ec",
    "key_bits": 256,
    "require_cn": true,
    "client_flag": true,
    "server_flag": true
  }'
client_flag + server_flag means certs from this role can be used for both client and server TLS — necessary for mTLS where both sides authenticate.
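To build intuition for what allowed_domains and allow_subdomains rule in or out, here is a rough sketch of a matcher. It is an illustration only: how ScaiVault evaluates these fields is not specified in this tutorial, and `name_allowed` is a hypothetical function that assumes a wildcard covers one label unless allow_subdomains is true.

```python
def name_allowed(name, allowed_domains, allow_subdomains=True):
    """Return True if `name` falls under one of the wildcard patterns.

    Assumed semantics (not confirmed by ScaiVault docs): "*.example" matches
    one label in front of "example"; deeper names need allow_subdomains.
    """
    name = name.lower().rstrip(".")
    for pattern in allowed_domains:
        base = pattern.lower().removeprefix("*.")
        if not name.endswith("." + base):
            continue
        prefix = name[: -len(base) - 1]  # labels in front of the base domain
        if not prefix:
            continue
        if allow_subdomains or "." not in prefix:
            return True
    return False


ROLE_DOMAINS = ["*.svc.cluster.local", "*.acme.internal"]

print(name_allowed("billing.svc.cluster.local", ROLE_DOMAINS))
print(name_allowed("billing.example.com", ROLE_DOMAINS))
```

A check like this is handy in CI to catch a service name that the role would refuse before the service ever tries to request a cert.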
3. Grant your services issue access
You don't want every service to have an admin token. Create a service account that can only issue certs from this one role, and bind a narrow policy to it.
# Create the service account in ScaiKey (or use one that exists)
SA_ID="sa:cert-issuer-billing"

# Policy
curl -X POST https://scaivault.scailabs.ai/v1/policies \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"name\": \"issue-from-svc-mtls\",
    \"rules\": [{
      \"path_pattern\": \"pki/issue/svc-mtls\",
      \"permissions\": [\"read\"]
    }]
  }"
# -> {"id": "pol_issue_svc"}

curl -X POST https://scaivault.scailabs.ai/v1/policies/pol_issue_svc/bindings \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"identity_type\": \"service_account\", \"identity_id\": \"$SA_ID\"}"
A token for this service account can issue any cert under *.svc.cluster.local or *.acme.internal, with TTL up to 720h, key type EC P-256. Nothing else.
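The TTL ceiling can be checked client-side before a request is sent. This is a minimal sketch that assumes TTL strings are an integer followed by s, m, or h (the tutorial only shows the h form); `parse_ttl` and `ttl_within_role` are hypothetical helpers, and whether ScaiVault rejects or clamps an oversized TTL is not stated here.

```python
import re
from datetime import timedelta

# Assumed unit suffixes; only "h" appears in this tutorial
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def parse_ttl(value):
    """Parse a duration such as "168h" into a timedelta."""
    match = re.fullmatch(r"(\d+)([smh])", value)
    if match is None:
        raise ValueError(f"unsupported TTL: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def ttl_within_role(requested, max_ttl="720h"):
    """Return True if the requested TTL does not exceed the role's max_ttl."""
    return parse_ttl(requested) <= parse_ttl(max_ttl)


print(ttl_within_role("168h"))   # the 7-day TTL used in this tutorial
print(ttl_within_role("2160h"))  # 90 days, above the role's max_ttl
```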
4. Issue a cert at service startup
Your service, on boot, calls ScaiVault and gets a cert. The simplest pattern is a small init script or container that runs before the main process.
#!/bin/sh
# init-cert.sh: fetches a cert at startup
set -eu  # stop on the first failure so a failed request never writes empty cert files

SVC_NAME="${SVC_NAME:?required}"  # e.g. billing
CLUSTER="${CLUSTER:-svc.cluster.local}"

resp=$(curl -fsS -X POST \
  -H "Authorization: Bearer $SCAIVAULT_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"common_name\": \"${SVC_NAME}.${CLUSTER}\",
    \"alt_names\": [\"${SVC_NAME}-api.${CLUSTER}\"],
    \"ttl\": \"168h\"
  }" \
  "https://scaivault.scailabs.ai/v1/pki/issue/svc-mtls")

mkdir -p /run/certs
echo "$resp" | jq -r '.certificate' > /run/certs/tls.crt
echo "$resp" | jq -r '.private_key' > /run/certs/tls.key
echo "$resp" | jq -r '.ca_chain[]' > /run/certs/ca.crt
chmod 600 /run/certs/tls.key

echo "Cert valid until: $(echo "$resp" | jq -r '.not_after')"
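One thing the shell version does not guard against: jq -r prints the literal string null for a missing field, so a malformed response could put "null" into tls.crt. If your init step is written in Python instead, a small validation pass avoids that. The sketch below assumes the response fields used by the script above (certificate, private_key, ca_chain, not_after); `parse_issue_response` is a hypothetical helper.

```python
import json

# Field names as used by the init script in this tutorial
REQUIRED_FIELDS = ("certificate", "private_key", "ca_chain", "not_after")


def parse_issue_response(body):
    """Validate an issue response and map it to the files to write.

    Raises ValueError if any expected field is absent or empty, so nothing
    is written from a partial response.
    """
    data = json.loads(body)
    missing = [field for field in REQUIRED_FIELDS if not data.get(field)]
    if missing:
        raise ValueError(f"issue response missing fields: {', '.join(missing)}")
    return {
        "tls.crt": data["certificate"],
        "tls.key": data["private_key"],
        "ca.crt": "\n".join(data["ca_chain"]),
    }
```

The returned mapping uses the same file names as /run/certs in the script, so the caller only has to write each value to its key.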
On Kubernetes, run this in an init container that shares an emptyDir volume with the main container:
spec:
  initContainers:
    - name: fetch-cert
      image: alpine:3
      # alpine:3 ships without curl and jq, so install them first
      # (or bake them into your own image)
      command: ["/bin/sh", "-c", "apk add --no-cache curl jq && exec /bin/sh /scripts/init-cert.sh"]
      env:
        - name: SVC_NAME
          value: billing
        - name: SCAIVAULT_TOKEN
          valueFrom:
            secretKeyRef:
              name: scaivault-issuer-token
              key: token
      volumeMounts:
        - {name: certs, mountPath: /run/certs}
        - {name: scripts, mountPath: /scripts}
  containers:
    - name: app
      image: acme/billing:1.0
      volumeMounts:
        - {name: certs, mountPath: /run/certs, readOnly: true}
  volumes:
    - {name: certs, emptyDir: {medium: Memory}}
    - {name: scripts, configMap: {name: cert-init-script}}
emptyDir.medium: Memory backs the volume with tmpfs, which keeps the private key off disk.
5. Set up renewal
7-day cert TTLs are short enough that renewal needs to be automatic. Three options, increasing in sophistication:
Option A — restart pods periodically. Add a CronJob that bounces the deployment every 5 days. The init container fetches a fresh cert on each pod start. Brutal but simple.
Option B — sidecar refresher. A sidecar container watches the cert's not_after and re-issues when within 24h. Hot-reload the main process (signal or socket dance) when the file changes.
Option C — ScaiVault-driven via webhook. Subscribe to certificate.expiring events for this CA and trigger a controlled rolling restart.
Option B is the most common production pattern. A minimal sidecar:
import datetime as dt
import os
import subprocess
import time

import httpx

CERT_DIR = "/run/certs"
RENEW_WITHIN = dt.timedelta(hours=24)  # re-issue once less than 24h remain


def time_until_expiry(cert_path):
    out = subprocess.check_output(
        ["openssl", "x509", "-in", cert_path, "-enddate", "-noout"]
    )
    # openssl prints e.g. "notAfter=Jan  1 00:00:00 2026 GMT"
    end = out.decode().strip().removeprefix("notAfter=")
    expiry = dt.datetime.strptime(end, "%b %d %H:%M:%S %Y %Z")
    expiry = expiry.replace(tzinfo=dt.timezone.utc)
    return expiry - dt.datetime.now(dt.timezone.utc)


def write_file(name, content, mode=0o644):
    path = os.path.join(CERT_DIR, name)
    # Create with the final permissions so a new key is never world-readable
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
    with os.fdopen(fd, "w") as f:
        f.write(content)


def fetch_cert():
    r = httpx.post(
        "https://scaivault.scailabs.ai/v1/pki/issue/svc-mtls",
        headers={"Authorization": f"Bearer {os.environ['SCAIVAULT_TOKEN']}"},
        json={
            "common_name": f"{os.environ['SVC_NAME']}.svc.cluster.local",
            "ttl": "168h",
        },
    )
    r.raise_for_status()
    data = r.json()
    write_file("tls.crt", data["certificate"])
    write_file("tls.key", data["private_key"], mode=0o600)
    write_file("ca.crt", "\n".join(data["ca_chain"]))
    # Signal the main process to reload. From a sidecar this only reaches the
    # app container if the pod sets shareProcessNamespace: true.
    subprocess.run(["pkill", "-HUP", "-f", "billing"], check=False)


while True:
    if time_until_expiry(os.path.join(CERT_DIR, "tls.crt")) < RENEW_WITHIN:
        print("Renewing cert")
        fetch_cert()
    time.sleep(3600)
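The sidecar above writes each file in place, so a process that reloads at the wrong moment could read a half-written key. A common remedy is to write to a temporary file in the same directory and rename it over the target. The following is a minimal sketch of that pattern using only the standard library; `atomic_write` is a hypothetical helper, not part of the tutorial's sidecar.

```python
import os
import tempfile


def atomic_write(path, content, mode=0o600):
    """Write `content` to `path` so readers see the old or new file, never a mix."""
    directory = os.path.dirname(path) or "."
    # The temp file must live on the same filesystem for the rename to be atomic
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())
        os.chmod(tmp_path, mode)
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        # Clean up the temp file if anything failed before the rename
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The cert and key are still two separate files, so send the reload signal only after both have been replaced.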
6. Validate mTLS works
From inside the cluster, hit the service with a client cert from the same CA:
# Issue a client cert for testing
curl -X POST https://scaivault.scailabs.ai/v1/pki/issue/svc-mtls \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"common_name": "test-client.svc.cluster.local", "ttl": "1h"}' \
  | tee /tmp/client.json

jq -r '.certificate' /tmp/client.json > /tmp/client.crt
jq -r '.private_key' /tmp/client.json > /tmp/client.key
jq -r '.ca_chain[]' /tmp/client.json > /tmp/ca.crt

curl --cert /tmp/client.crt --key /tmp/client.key --cacert /tmp/ca.crt \
  https://billing.svc.cluster.local/health
If you get a TLS handshake error, check that the service has the CA chain (/run/certs/ca.crt) configured as its client-CA trust file. Most TLS stacks configure "which server cert do I present" separately from "which client CAs do I trust": the first takes tls.crt and tls.key, the second takes the CA chain, and mTLS needs both set.
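As an illustration of those two separate settings, here is how they look in Python's standard ssl module. This is a sketch for a Python service; the function name `build_mtls_server_context` is made up for this example, and other TLS stacks expose the same split under different option names.

```python
import ssl


def build_mtls_server_context(certfile, keyfile, client_ca_file):
    """Build a server-side TLS context that requires client certificates."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Setting 1: the identity this server presents
    ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    # Setting 2: the CAs this server trusts when verifying client certs
    ctx.load_verify_locations(cafile=client_ca_file)
    # Without this, the server never asks the client for a certificate
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx
```

With the files from this tutorial, the call would be build_mtls_server_context("/run/certs/tls.crt", "/run/certs/tls.key", "/run/certs/ca.crt"). Forgetting the verify_mode line gives plain one-way TLS that still appears to work, which is the most common way mTLS ends up silently disabled.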
7. CRLs and revocation
For high-stakes services, configure the CRL distribution point:
# Get the CRL URL for the intermediate
curl -H "Authorization: Bearer $TOKEN" \
  https://scaivault.scailabs.ai/v1/pki/ca/ca_intermediate_mtls \
  | jq -r '.crl_url'
# -> https://scaivault.scailabs.ai/v1/pki/ca/ca_intermediate_mtls/crl
Configure your TLS stack to fetch and respect this CRL. With cert TTLs at 7 days, revocation is often more useful as an audit signal than an enforcement mechanism — but it's there if you need it.
To revoke:
curl -X POST https://scaivault.scailabs.ai/v1/pki/revoke \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"serial_number": "1A:2B:...", "reason": "key_compromise"}'

# Force CRL regeneration so the change propagates faster
curl -X POST https://scaivault.scailabs.ai/v1/pki/ca/ca_intermediate_mtls/crl \
  -H "Authorization: Bearer $TOKEN"
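The revoke request above shows the serial number as colon-separated hex pairs, while openssl x509 -serial -noout prints it as serial=1A2B3C. If you are pulling the serial from a cert file, a small conversion helps. This sketch assumes the colon-separated form is what the endpoint expects, based only on the example above; `colon_serial` is a hypothetical helper.

```python
def colon_serial(openssl_output):
    """Convert "serial=1A2B3C" (openssl output) to "1A:2B:3C"."""
    hex_digits = openssl_output.strip().removeprefix("serial=").upper()
    int(hex_digits, 16)  # raises ValueError if this is not valid hex
    if len(hex_digits) % 2:
        hex_digits = "0" + hex_digits  # pad to whole bytes
    pairs = (hex_digits[i:i + 2] for i in range(0, len(hex_digits), 2))
    return ":".join(pairs)


print(colon_serial("serial=1A2B3C\n"))
```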
What you have now
- A root CA, offline-able after this setup.
- An intermediate CA driving day-to-day issuance.
- A PKI role that prevents the intermediate from signing anything you didn't intend.
- Services that fetch fresh certs on start and renew before expiry.
- A revocation path with a CRL distribution endpoint.
- An audit log of every issued cert (GET /v1/audit/logs?action=pki_issue).
What's next