API reference
All endpoints are mounted at /v1/modules/scaibunker/ and authenticate with the standard ScaiGrid bearer token. Responses use ScaiGrid's standard envelope ({ "data": ... } for success, { "error": ... } for failures). List endpoints return { "data": [...], "has_more": bool, "next_cursor": "..." }.
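A minimal sketch of how a client might consume the envelope and cursor pagination described above. The `fetch_page` callable is a hypothetical stand-in for whatever HTTP client is in use, and `ApiError`, `unwrap`, and `iter_items` are illustrative names, not part of ScaiGrid.

```python
from typing import Any, Callable, Dict, Iterator, Optional


class ApiError(Exception):
    """Raised when a response carries an "error" envelope."""


def unwrap(envelope: Dict[str, Any]) -> Any:
    """Return the "data" payload, or raise if the envelope reports an error."""
    if "error" in envelope:
        raise ApiError(envelope["error"])
    return envelope["data"]


def iter_items(fetch_page: Callable[[Optional[str]], Dict[str, Any]]) -> Iterator[Any]:
    """Yield every item from a list endpoint by following next_cursor.

    fetch_page(cursor) is a hypothetical callable that performs the GET
    (cursor=None for the first page) and returns the decoded JSON body.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from unwrap(page)
        if not page.get("has_more"):
            return
        cursor = page["next_cursor"]
```

Keeping the transport behind a callable leaves authentication and retries to the caller.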
Bunkers
POST /bunkers
Create a bunker. Requires scaibunker:create.

| Field | Required | Notes |
|---|---|---|
| image | yes | Image name; e.g. python-3.12. |
| lifecycle_mode | no | ephemeral (default), session, persistent. |
| name | no | Human-readable label. |
| network_profile | no | isolated (default), registry, allowlisted, unrestricted, transit. |
| network_allowlist | when allowlisted | Hostnames or first-level wildcards. |
| cpu_millicores | no | Default 1000. |
| memory_mb | no | Default 512. |
| disk_mb | no | Default 1024. |
| gpu_count, gpu_type | no | Default 0 / null. |
| max_lifetime_s, idle_timeout_s | no | Lifecycle-dependent defaults. |
| session_type, session_ref | no | standalone (default), scaicore, scaiwave, api. |
| label_set | no | Arbitrary string→string map. |
| env_vars | no | Map injected into the bunker's environment. |
| interfaces | when transit | List of {if_name, bridge_name, spoof_guard, mac}. |
| bandwidth_mbit | no | Per-bunker egress cap (1–10000). |

Returns the bunker record. Status starts at pending, advances through provisioning to running.
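The conditional fields in the table above can be checked on the client before sending the request. The helper below is a sketch under that assumption. `build_create_body` is a hypothetical name, and the server remains the authority on validation (it reports, for example, INTERFACES_NOT_ALLOWED and TRANSIT_MISSING_INTERFACES).

```python
from typing import Any, Dict, List, Optional


def build_create_body(
    image: str,
    network_profile: str = "isolated",
    network_allowlist: Optional[List[str]] = None,
    interfaces: Optional[List[Dict[str, Any]]] = None,
    bandwidth_mbit: Optional[int] = None,
    **extra: Any,
) -> Dict[str, Any]:
    """Assemble a POST /bunkers body and check the conditional fields locally."""
    if network_profile == "allowlisted" and not network_allowlist:
        raise ValueError("network_allowlist is required when network_profile is allowlisted")
    if network_profile == "transit" and not interfaces:
        raise ValueError("interfaces is required when network_profile is transit")
    if network_profile != "transit" and interfaces:
        raise ValueError("interfaces is only accepted for the transit profile")
    if bandwidth_mbit is not None and not 1 <= bandwidth_mbit <= 10000:
        raise ValueError("bandwidth_mbit must be between 1 and 10000")

    body: Dict[str, Any] = {"image": image, "network_profile": network_profile}
    if network_allowlist:
        body["network_allowlist"] = network_allowlist
    if interfaces:
        body["interfaces"] = interfaces
    if bandwidth_mbit is not None:
        body["bandwidth_mbit"] = bandwidth_mbit
    body.update(extra)  # any other documented field, e.g. lifecycle_mode or memory_mb
    return body
```

For example, `build_create_body("python-3.12", network_profile="allowlisted", network_allowlist=["*.example.com"])` produces a body ready to serialize as JSON.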
GET /bunkers
List bunkers in the caller's tenant. Admins (scaibunker:admin) see bunkers across all tenants. Query parameters: limit, cursor, status, lifecycle.
GET /bunkers/{bunker_id}
Fetch one bunker.
DELETE /bunkers/{bunker_id}
Terminate. Query parameter snapshot=true takes a final snapshot before the bunker is destroyed. Decrements quota counters across every bucket the bunker contributed to.
POST /bunkers/{bunker_id}/pause
Suspend the microVM in place. State preserved; quota still held.
POST /bunkers/{bunker_id}/resume
Resume a paused bunker.
POST /bunkers/{bunker_id}/snapshot
Take a snapshot of the running bunker's rootfs. Returns the updated bunker record with snapshot_id populated.
Execution
POST /bunkers/{bunker_id}/exec
Run a command. Requires scaibunker:execute. Body:

| Field | Notes |
|---|---|
| command | The shell command to run. |
| timeout_s | Default 60. |
| working_dir | Default /workspace. |
| env | Optional environment overrides. |
| stdin | Optional string piped to the command's stdin. |
| stream | If true, response is Server-Sent Events. |

Inline mode returns { exec_id, exit_code, stdout, stderr, duration_ms, truncated, full_output_ref }. When truncated: true, the full output is in S3 at full_output_ref; fetch via the storage proxy.
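One way a client might interpret the inline result is sketched below. `ExecOutcome` and `parse_exec_result` are hypothetical names, and treating exit code 0 as success is the usual shell convention rather than something this reference states.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ExecOutcome:
    succeeded: bool
    stdout: str
    stderr: str
    # Set only when the inline output was truncated; the full output must
    # then be fetched separately via the storage proxy.
    full_output_ref: Optional[str]


def parse_exec_result(result: Dict[str, Any]) -> ExecOutcome:
    """Turn an inline exec result into a small typed record."""
    truncated = bool(result.get("truncated"))
    return ExecOutcome(
        succeeded=result["exit_code"] == 0,  # assumption: 0 means success
        stdout=result["stdout"],
        stderr=result["stderr"],
        full_output_ref=result.get("full_output_ref") if truncated else None,
    )
```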
Streaming events:

| Event | Payload |
|---|---|
| stdout | { "text": "..." } |
| stderr | { "text": "..." } |
| exit | { "exit_code": 0, "duration_ms": 1234 } |
| error | { "code": "EXEC_ERROR", "message": "..." } |

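The streaming events above could be consumed roughly as follows. This sketch assumes standard Server-Sent Events framing (`event:` and `data:` lines, with a blank line ending each event) and one JSON object per event. The function names are illustrative.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (event_name, payload) pairs from decoded SSE lines."""
    event = "message"  # SSE default when no "event:" field is sent
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


def collect_exec_stream(lines: Iterable[str]) -> Tuple[str, str, Optional[int]]:
    """Accumulate stdout/stderr text and the exit code from an exec stream."""
    stdout: List[str] = []
    stderr: List[str] = []
    exit_code: Optional[int] = None
    for name, payload in iter_sse_events(lines):
        if name == "stdout":
            stdout.append(payload["text"])
        elif name == "stderr":
            stderr.append(payload["text"])
        elif name == "exit":
            exit_code = payload["exit_code"]
        elif name == "error":
            raise RuntimeError(f"{payload['code']}: {payload['message']}")
    return "".join(stdout), "".join(stderr), exit_code
```

An event still incomplete when the stream ends is dropped, matching how SSE clients normally behave.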
WebSocket /bunkers/{bunker_id}/shell
Interactive PTY shell. Binary frames carry terminal I/O, JSON frames carry control messages (resize, heartbeat, close). Auth via token query parameter or Authorization: Bearer header. Requires scaibunker:shell.
Files
All file endpoints require scaibunker:files and operate against running bunkers.
PUT /bunkers/{bunker_id}/files/{path:path}
Write a file. Body is the raw bytes; Content-Type header is preserved.
GET /bunkers/{bunker_id}/files/{path:path}
Read a file. Returns the raw bytes with a guessed media type.
Query parameter dir=true lists a directory instead, returning { entries: [{name, size, type, mtime, ...}] }.
DELETE /bunkers/{bunker_id}/files/{path:path}
Remove a file.
POST /bunkers/{bunker_id}/files/upload
Initiate a staged upload (for files larger than ~10 MB). Body { "filename": "..." }. Returns a pre-signed S3 URL and key; PUT the file directly to S3, then commit.
POST /bunkers/{bunker_id}/files/commit
Commit a staged upload. Body { "key": "...", "dest_path": "/workspace/..." }. The worker pulls the staged object into the bunker.
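The choice between a direct write and the staged upload flow can be sketched as a small planning function. The threshold constant reads "~10 MB" as 10 MiB, and dropping the leading slash when building the direct PUT path is an assumption about how `{path:path}` is interpreted. `plan_upload` is a hypothetical helper.

```python
from typing import Dict, List

# Assumption: "~10 MB" is read as 10 MiB; the exact cut-off is not documented.
STAGED_THRESHOLD_BYTES = 10 * 1024 * 1024


def plan_upload(bunker_id: str, dest_path: str, size_bytes: int) -> List[Dict[str, str]]:
    """Return the ordered requests needed to place a file in a bunker."""
    base = f"/v1/modules/scaibunker/bunkers/{bunker_id}/files"
    if size_bytes <= STAGED_THRESHOLD_BYTES:
        # Small file: one direct write. Assumption: the leading slash of the
        # destination is dropped when it becomes the {path:path} segment.
        return [{"method": "PUT", "target": f"{base}/{dest_path.lstrip('/')}"}]
    return [
        {"method": "POST", "target": f"{base}/upload"},   # returns URL and key
        {"method": "PUT", "target": "pre-signed S3 URL from the previous step"},
        {"method": "POST", "target": f"{base}/commit"},   # body: key, dest_path
    ]
```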
Snapshots
All snapshot endpoints require scaibunker:create.
GET /bunkers/{bunker_id}/snapshots
List snapshots for one bunker, paginated.
GET /snapshots/{snapshot_id}
Fetch one snapshot's metadata: bunker_id, storage_path, size_mb, checksum, trigger (manual, idle, terminate, checkpoint, failure), created_at, expires_at.
GET /snapshots/{snapshot_id}/archive
Returns { "storage_path": "scaibunker/snapshots/...", "bucket": "..." }. Fetch the bytes from the storage proxy at GET /storage/snapshots/{bunker_id}/{name}.
GET /snapshots/{snapshot_id}/files/{path:path}
Download one file from inside a snapshot without restoring the whole thing.
DELETE /snapshots/{snapshot_id}
Delete a snapshot. Removes both the row and the S3 object.
Images
GET /images
List images visible to the caller (tenant-scoped + platform defaults). Requires scaibunker:create.
POST /images
Register a new image. Requires scaibunker:images:manage. Body fields:

| Field | Notes |
|---|---|
| name | Unique within scope. |
| build_source | `{kind: "oci"` … |
| size_mib | 32–65536. |
| display_name, description | Optional metadata. |
| default_cpu_millicores / default_memory_mb / default_disk_mb | Applied when not overridden on bunker create. |
| gpu_required | bool. |
| preinstalled | List of package names for documentation. |
| entrypoint | Default /bin/bash. |
| scope | tenant (default), partner, platform. |
| lazy_pull | If true, scheduler may place on cold workers. |

The response includes a warm summary if fan-out was triggered.
GET /images/{image_id}
Fetch one image.
DELETE /images/{image_id}
Deactivate. Soft-delete; existing bunkers keep running. Requires scaibunker:images:manage.
POST /images/{image_id}/scan
Trigger a Trivy security re-scan. Requires scaibunker:images:manage.
POST /images/{image_id}/warm
Force a fan-out warm to every targeting worker. Idempotent. Requires scaibunker:images:manage.
GET /images/{image_id}/cache
Per-worker cache state for one image — used by the admin UI to show fan-out progress. Each row carries worker_id, status (pending, building, ready, failed, evicted), sha256, size_mib_actual, error, baked_at.
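As an illustration of how the per-worker rows might be reduced to a fan-out progress summary, consider the sketch below. `summarize_fanout` and the keys of its return value are hypothetical. Only the `status` values come from the description above.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def summarize_fanout(rows: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Count cache rows by status and report whether the fan-out has settled."""
    counts = Counter(row["status"] for row in rows)
    total = sum(counts.values())
    in_progress = counts["pending"] + counts["building"]
    return {
        "total": total,
        "ready": counts["ready"],
        "failed": counts["failed"],
        "evicted": counts["evicted"],
        "in_progress": in_progress,
        # Settled means no worker is still pending or building.
        "settled": total > 0 and in_progress == 0,
    }
```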
Workers
Admin-only (scaibunker:admin or scaibunker:admin:platform), except POST /heartbeat which uses the worker shared bearer.
GET /workers
List worker nodes.
GET /workers/{worker_id}
Fetch one worker's full record.
GET /workers/{worker_id}/cache
Heartbeat-derived cache ledger for one worker: what the controller thinks the worker has cached. Companion to GET /workers/{worker_id}/images.
GET /workers/{worker_id}/images
Live cache listing from the worker itself. Returns { items: [{name, sha256, size_mib, baked_at}], stale: bool, last_error: ... }. stale: true means the worker was unreachable; fall back to the ledger.
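The fallback rule for `stale: true` can be expressed in a few lines. This is a sketch only. `effective_cache` is a hypothetical helper, and the ledger rows are whatever GET /workers/{worker_id}/cache returned.

```python
from typing import Any, Dict, List, Tuple


def effective_cache(
    live: Dict[str, Any], ledger: List[Dict[str, Any]]
) -> Tuple[str, List[Dict[str, Any]]]:
    """Prefer the live listing; fall back to the ledger when it is stale."""
    if live.get("stale"):
        # The worker was unreachable, so the controller's ledger is the best view.
        return "ledger", ledger
    return "live", live["items"]
```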
POST /workers/{worker_id}/drain
Stop scheduling new bunkers to this worker. Active bunkers continue.
POST /workers/{worker_id}/resume
Reverse a drain.
POST /heartbeat
Worker-only. Authenticated by SCAIBUNKER_WORKER_TOKEN. Body matches the WorkerHeartbeat schema (capacity, status, active bunkers, optionally a cached_images reconciliation list).
Bridges
Admin-only for create/delete (scaibunker:admin or scaibunker:admin:platform).
POST /bridges
Create a tenant-scoped Linux bridge on a worker. Body { worker_id, bridge_name, mtu? }. Two-phase: row reserved, worker called, row rolled back if the worker rejects.
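The two-phase behaviour described above (reserve the row, call the worker, roll back on rejection) is sketched here with an in-memory dict standing in for the database. Every name is illustrative, including `create_bridge`, the `call_worker` callable, and the `reserved`/`active` status values. This is not the controller's actual implementation.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Rows = Dict[Tuple[str, str], Dict[str, Any]]


class BridgeAlreadyExists(Exception):
    """Mirrors BRIDGE_ALREADY_EXISTS: the name is taken on that worker."""


def create_bridge(
    rows: Rows,
    call_worker: Callable[[str, str, Optional[int]], None],
    worker_id: str,
    bridge_name: str,
    mtu: Optional[int] = None,
) -> Dict[str, Any]:
    key = (worker_id, bridge_name)
    if key in rows:
        raise BridgeAlreadyExists(bridge_name)
    # Phase 1: reserve the row so concurrent creates see the name as taken.
    rows[key] = {"worker_id": worker_id, "bridge_name": bridge_name,
                 "mtu": mtu, "status": "reserved"}
    try:
        # Phase 2: ask the worker to create the bridge.
        call_worker(worker_id, bridge_name, mtu)
    except Exception:
        del rows[key]  # worker rejected: roll the reservation back
        raise
    rows[key]["status"] = "active"
    return rows[key]
```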
GET /bridges
List bridges in the caller's tenant.
GET /bridges/{bridge_id}
Fetch one bridge.
DELETE /bridges/{bridge_id}
Delete. Worker is called first; row is removed locally on success.
Availability groups
GET /availability-groups
List groups visible to the caller (tenant-scoped + platform defaults). Requires scaibunker:create.
POST /availability-groups
Create a group. Tenant admins create within their own tenant; super-admins create platform defaults. Body { name, description }.
GET /availability-groups/{group_id}
Fetch group with worker_ids[] and image_ids[].
DELETE /availability-groups/{group_id}
Delete. Requires scaibunker:admin:tenant for tenant groups, scaibunker:admin:platform for platform defaults.
POST /availability-groups/{group_id}/workers
Add a worker. Body { ref: worker_id }. Triggers a fan-out warm of every image in the group onto the joining worker.
DELETE /availability-groups/{group_id}/workers/{worker_id}
Remove a worker from the group.
POST /availability-groups/{group_id}/images
Add an image. Body { ref: image_id }. Triggers a fan-out warm of the image to every worker in the group.
DELETE /availability-groups/{group_id}/images/{image_id}
Remove an image.
Quota profiles
GET /quota-profiles
List visible profiles (own tenant + platform defaults).
POST /quota-profiles
Create. Tenant admins create tenant-scoped; super-admins create platform defaults. Body fields: name, description, max_concurrent_bunkers, max_persistent_bunkers, max_cpu_millicores, max_memory_mb, max_disk_mb, max_gpu_count, and optional per_bunker_max_* ceilings.
GET /quota-profiles/{profile_id}
Fetch one profile.
PATCH /quota-profiles/{profile_id}
Partial update. Only the fields you send are touched.
DELETE /quota-profiles/{profile_id}
Delete. The delete cascades to the profile's existing assignments.
GET /quota-profiles/usage
Per-profile bucket usage for the calling user: current usage, caps, and remaining headroom for every profile that applies to them.
GET /quota-profiles/{profile_id}/assignments
List assignments for one profile.
POST /quota-profiles/{profile_id}/assignments
Assign. Body { target_kind: "user"|"group", target_id, mode: "individual"|"shared"|"per_user" }. individual is user-only, shared/per_user are group-only.
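The pairing rule between target_kind and mode can be checked before the request is sent. A minimal sketch, with `build_assignment` as a hypothetical helper:

```python
from typing import Dict

# individual is user-only; shared and per_user are group-only.
ALLOWED_MODES = {
    "user": {"individual"},
    "group": {"shared", "per_user"},
}


def build_assignment(target_kind: str, target_id: str, mode: str) -> Dict[str, str]:
    """Build an assignment body, rejecting kind/mode pairs the API disallows."""
    if target_kind not in ALLOWED_MODES:
        raise ValueError(f"unknown target_kind: {target_kind!r}")
    if mode not in ALLOWED_MODES[target_kind]:
        raise ValueError(f"mode {mode!r} is not valid for target_kind {target_kind!r}")
    return {"target_kind": target_kind, "target_id": target_id, "mode": mode}
```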
DELETE /quota-profiles/{profile_id}/assignments/{assignment_id}
Unassign.
Audit
Available for bunkers running under the unrestricted network profile with worker-side audit enabled.
GET /bunkers/{bunker_id}/audit-batches
List per-flow audit batches in S3. Query parameters: since_us (microsecond cutoff), limit (1–1000), continuation_token. Returns { items: [{key, size, last_modified, ts_us}], is_truncated, next_continuation_token }.
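Walking every audit batch means following next_continuation_token until is_truncated is false. The sketch below assumes a hypothetical `fetch(params)` callable that performs the GET and returns the already-unwrapped payload shown above.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_audit_batches(
    fetch: Callable[[Dict[str, Any]], Dict[str, Any]],
    since_us: Optional[int] = None,
    limit: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield every batch descriptor, following continuation tokens.

    This is a generator, so the limit check runs when iteration starts.
    """
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    token: Optional[str] = None
    while True:
        params: Dict[str, Any] = {"limit": limit}
        if since_us is not None:
            params["since_us"] = since_us
        if token is not None:
            params["continuation_token"] = token
        page = fetch(params)
        yield from page["items"]
        if not page.get("is_truncated"):
            return
        token = page["next_continuation_token"]
```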
GET /bunkers/{bunker_id}/audit-batches/{batch_name}
Fetch one batch's NDJSON contents.
Peers (worker-to-worker)
GET /peers/{sha256}
Worker-only (shared bearer + X-Worker-ID header). Returns the set of peer workers in the caller's availability groups that have this image sha cached, plus the controller storage URL as a fallback. Used by workers to fetch ext4 rootfs from a sibling rather than the controller's storage proxy.
Storage proxy
The /storage/* prefix relays bytes between callers (workers or capability-token-bearing admins) and S3. The same routes are served by the dedicated SCAIGRID_MODE=bunker_proxy process, for deployments where proxy throughput would otherwise dominate controller CPU.
Authentication is one of: Authorization: Bearer <SCAIBUNKER_WORKER_TOKEN> (workers) or Authorization: Capability <jwt> (short-lived caps minted via POST /storage/capabilities).
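The two authentication schemes differ only in the Authorization header. A small sketch, with `storage_auth_header` as a hypothetical helper that insists on exactly one credential:

```python
from typing import Dict, Optional


def storage_auth_header(
    worker_token: Optional[str] = None,
    capability_jwt: Optional[str] = None,
) -> Dict[str, str]:
    """Build the Authorization header for a storage proxy request."""
    if (worker_token is None) == (capability_jwt is None):
        raise ValueError("pass exactly one of worker_token or capability_jwt")
    if worker_token is not None:
        return {"Authorization": f"Bearer {worker_token}"}
    return {"Authorization": f"Capability {capability_jwt}"}
```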

| Method + path | Purpose |
|---|---|
| PUT/GET /storage/images/{sha256}.ext4 | Content-addressed ext4 image bytes. |
| PUT/GET /storage/snapshots/{bunker_id}/{name} | Per-bunker snapshot archives. |
| PUT/GET /storage/audit/{bunker_id}/{name} | Per-bunker NDJSON audit batches. |
| PUT/GET /storage/output/{bunker_id}/{name} | Large exec outputs. |
| PUT/GET /storage/staging/{tenant_id}/{name} | Admin-uploaded tar sources. |
| POST /storage/capabilities | Mint a short-lived capability token. Requires scaibunker:images:manage. |

GET requests support Range headers for resumable downloads.
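A resumable download can ask for the bytes it is still missing with an open-ended HTTP Range header. The sketch below derives the header from the size of a partial file on disk. Both function names are hypothetical.

```python
import os
from typing import Dict


def range_header(bytes_already_written: int) -> Dict[str, str]:
    """Return the header that requests everything after the bytes already held."""
    if bytes_already_written < 0:
        raise ValueError("byte count cannot be negative")
    if bytes_already_written == 0:
        return {}  # nothing on disk yet: fetch the whole object
    return {"Range": f"bytes={bytes_already_written}-"}


def resume_headers(partial_path: str) -> Dict[str, str]:
    """Build resume headers from a partially downloaded file, if one exists."""
    try:
        offset = os.path.getsize(partial_path)
    except FileNotFoundError:
        offset = 0
    return range_header(offset)
```

A client using this should append to the partial file only when the server answers 206 Partial Content. A 200 response carries the whole object again.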
Errors
ScaiBunker-specific error codes:

| Code | Meaning |
|---|---|
| BUNKER_NOT_FOUND | Bunker id doesn't exist or isn't in your tenant. |
| BUNKER_NOT_RUNNING | Operation requires running state. |
| BUNKER_INVALID_TRANSITION | State machine refused (e.g. resume on terminated). |
| BUNKER_QUOTA_EXCEEDED | Caller's profile would be over a cap; the message names which one. |
| BUNKER_PERMISSION_DENIED | Caller lacks the required scaibunker permission. |
| WORKER_NOT_FOUND | Worker id doesn't exist. |
| WORKER_UNAVAILABLE | No worker can satisfy the resource request. |
| IMAGE_NOT_FOUND | Image id or name unknown. |
| IMAGE_INACTIVE | Image was deactivated. |
| SNAPSHOT_NOT_FOUND | Snapshot id unknown. |
| NETWORK_PROFILE_DENIED | Caller lacks the network-profile permission. |
| LIFECYCLE_MODE_DENIED | Caller lacks the lifecycle-mode permission. |
| BRIDGE_NOT_FOUND | Bridge id doesn't exist. |
| BRIDGE_ALREADY_EXISTS | Bridge name taken on that worker. |
| BRIDGE_TENANT_MISMATCH | Bridge belongs to a different tenant. |
| INTERFACES_NOT_ALLOWED | interfaces[] supplied for non-transit profile. |
| TRANSIT_MISSING_INTERFACES | network_profile=transit needs at least one interface. |
| NO_SUITABLE_WORKER | No worker has all the required image-cache state or bridges. |
| QUOTA_PROFILE_NOT_FOUND | Profile id unknown or not visible. |
| QUOTA_ASSIGNMENT_CONFLICT | Target already has a profile assignment. |
| AUDIT_BATCH_NOT_FOUND | Audit batch missing in S3. |
| AUDIT_PATH_INVALID | Bad audit batch name. |
| STORAGE_UNAUTHORIZED | Storage proxy auth failed. |
| STORAGE_PATH_INVALID | Storage path didn't match the allowed shape. |
| STORAGE_NOT_FOUND | Storage object missing. |
| STORAGE_CAP_DISABLED | Capability tokens not configured. |
| WORKER_INVALID_REQUEST / WORKER_INVALID_STATE / WORKER_ALREADY_EXISTS / WORKER_OVERLOADED / WORKER_INTERNAL / WORKER_NOT_IMPLEMENTED / WORKER_TENANT_MISMATCH / WORKER_RESOURCE_NOT_FOUND / WORKER_UNAUTHORIZED | Typed errors surfaced from the worker. |
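Client code might map these codes onto a typed exception along the following lines. This sketch assumes the error envelope carries `code` and `message` keys, inferred from the streaming error event rather than stated for the envelope itself. `ScaiBunkerError` and `raise_for_error` are hypothetical names.

```python
from typing import Any, Dict

# Codes the table lists as typed errors surfaced from the worker.
WORKER_SURFACED = frozenset({
    "WORKER_INVALID_REQUEST", "WORKER_INVALID_STATE", "WORKER_ALREADY_EXISTS",
    "WORKER_OVERLOADED", "WORKER_INTERNAL", "WORKER_NOT_IMPLEMENTED",
    "WORKER_TENANT_MISMATCH", "WORKER_RESOURCE_NOT_FOUND", "WORKER_UNAUTHORIZED",
})


class ScaiBunkerError(Exception):
    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message

    @property
    def surfaced_from_worker(self) -> bool:
        """True when the code originated on the worker, not the controller."""
        return self.code in WORKER_SURFACED


def raise_for_error(envelope: Dict[str, Any]) -> Any:
    """Return the data payload, or raise ScaiBunkerError for an error envelope."""
    error = envelope.get("error")
    if error is not None:
        # Assumption: the error object exposes "code" and "message".
        raise ScaiBunkerError(error.get("code", "UNKNOWN"), error.get("message", ""))
    return envelope.get("data")
```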