---
title: Deployment
path: operations/deployment
status: published
---

# Deployment

How to deploy ScaiGrid — managed, self-hosted, or hybrid. For the architecture overview, see [Architecture](../01-introduction/03-architecture.md).

## Options

- **Managed.** ScaiLabs hosts ScaiGrid for you at `scaigrid.scailabs.ai`. You sign up, get a tenant, start making API calls. No operational burden.
- **Self-hosted.** Run your own ScaiGrid instance on your infrastructure. Full control, more responsibility.
- **Hybrid.** Use the managed gateway for most inference but deploy your own ScaiInfer / ScaiBunker / ScaiMind workers to keep sensitive workloads in your infrastructure.

## Components to deploy

A complete ScaiGrid deployment needs:

- **ScaiGrid process** — five runtime modes (`migrate`, `http`, `grpc`, `worker`, and the optional `bunker_proxy`)
- **MariaDB Galera** — cluster of 3+ nodes recommended for HA
- **Redis** — a single node is fine for development; use Sentinel or Redis Cluster for HA in production
- **S3-compatible storage** — Garage (self-hosted), MinIO, or AWS S3
- **ScaiKey** — identity provider for authentication
- **Nginx** — TLS termination and reverse proxy

Optional depending on modules:

- **Weaviate** — for ScaiMatrix vector search
- **Neo4j** — for ScaiMatrix graph knowledge
- **ScaiInfer cluster** — for self-hosted LLM inference
- **ScaiBunker workers** — for sandboxed code execution
- **ScaiMind cluster** — for training workloads

## Runtime modes

The ScaiGrid container image supports five modes via the `SCAIGRID_MODE` env var:

| Mode | Purpose |
|------|---------|
| `migrate` | Runs Alembic migrations (core + all enabled modules), exits |
| `http` | FastAPI HTTP server on port 8000 — main API |
| `grpc` | gRPC server on port 50051 — internal integrations |
| `worker` | ARQ background worker — cron tasks from core and modules |
| `bunker_proxy` | Stripped-down ScaiBunker storage proxy on port 8001 — DB-free, horizontally scalable behind a load balancer |

Production deployments run `http`, `grpc`, and `worker` as separate container replicas. `migrate` is a one-shot init container or manual step before rolling out a new version.

`bunker_proxy` is a specialized mode for deployments that want to put ScaiBunker workers behind a centralized storage proxy (rather than letting workers talk directly to S3). It loads only the storage routes — no DB, no module registry, no auth on the workers. Deploy multiple replicas behind an LB if you need horizontal scaling for image-bytes throughput. See [ScaiBunker → Image workflow](/docs/scaigrid/scaibunker#images) for when this matters; today's worker is a direct S3 client by default, so `bunker_proxy` is opt-in infrastructure.
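A deploy script can check `SCAIGRID_MODE` before a container starts. The following is a minimal sketch, not ScaiGrid's own startup code: the mode names and ports come from the table above, while `MODES` and `resolve_mode` are hypothetical names, and rejecting an unset variable is an assumption of this sketch.

```python
import os

# Modes and listening ports as listed in the table above.
# None means the table gives no port for that mode.
MODES = {
    "migrate": None,
    "http": 8000,
    "grpc": 50051,
    "worker": None,
    "bunker_proxy": 8001,
}


def resolve_mode(env=None):
    """Return (mode, port) for SCAIGRID_MODE, or raise ValueError."""
    env = os.environ if env is None else env
    mode = env.get("SCAIGRID_MODE", "")
    if mode not in MODES:
        # Assumption: an unset or unknown mode is treated as a deploy error.
        raise ValueError(
            f"SCAIGRID_MODE must be one of {sorted(MODES)}, got {mode!r}"
        )
    return mode, MODES[mode]
```

Passing a plain dict instead of `os.environ` makes the check easy to run in CI against a rendered env file.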

## Minimal docker-compose

```yaml
services:
  scaigrid-migrate:
    image: registry.scailabs.ai/scaigrid:latest
    environment:
      SCAIGRID_MODE: migrate
    env_file: .env
    depends_on:
      mariadb: {condition: service_healthy}

  scaigrid-http:
    image: registry.scailabs.ai/scaigrid:latest
    environment:
      SCAIGRID_MODE: http
    env_file: .env
    ports: ["8000:8000"]
    depends_on:
      scaigrid-migrate: {condition: service_completed_successfully}

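  scaigrid-grpc:
    image: registry.scailabs.ai/scaigrid:latest
    environment:
      SCAIGRID_MODE: grpc
    env_file: .env
    ports: ["50051:50051"]
    # Wait for migrations, same as scaigrid-http
    depends_on:
      scaigrid-migrate: {condition: service_completed_successfully}

  scaigrid-worker:
    image: registry.scailabs.ai/scaigrid:latest
    environment:
      SCAIGRID_MODE: worker
    env_file: .env
    # Wait for migrations, same as scaigrid-http
    depends_on:
      scaigrid-migrate: {condition: service_completed_successfully}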
```

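This file covers only the ScaiGrid containers; you also need MariaDB, Redis, S3-compatible storage, and ScaiKey. The `mariadb` service referenced under `depends_on` must define a healthcheck, otherwise the `service_healthy` condition cannot be satisfied. For a complete stack, see `docker-compose.yml` in the ScaiGrid source tree.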

## Essential env vars

```
# Database
DATABASE_URL=mysql+asyncmy://scaigrid:CHANGE_ME@mariadb:3306/scaigrid

# Redis
REDIS_URL=redis://redis:6379/0

# S3 / object storage
S3_ENDPOINT_URL=http://garage:3900
S3_ACCESS_KEY_ID=...
S3_SECRET_ACCESS_KEY=...
S3_BUCKET=scaigrid

# ScaiKey identity provider
SCAIKEY_BASE_URL=https://scaikey.scailabs.ai
SCAIKEY_JWKS_URL=https://scaikey.scailabs.ai/api/v1/platform/.well-known/jwks.json
SCAIKEY_AUDIENCE=user-portal
SCAIKEY_CLIENT_ID=...
SCAIKEY_CLIENT_SECRET=...
SCAIKEY_WEBHOOK_SECRET=...

# JWT (for admin UI sessions)
JWT_SIGNING_SECRET=<random 64 chars>

# Defaults
APP_ENV=production
APP_LOG_LEVEL=info
```

For local development, `SCAIGRID_AUTH_DISABLED=true` bypasses auth entirely — never set this in production.

The full env var reference is in `app/config.py` in the ScaiGrid source.
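A preflight check can catch missing or unsafe settings before rollout. This is a sketch under stated assumptions: the variable names are copied from the listing above, but treating every one of them as required is a guess (`app/config.py` is the authority), and `check_env` and `new_signing_secret` are hypothetical helper names.

```python
import secrets

# Names copied from the listing above. Assumption: all are required.
REQUIRED = [
    "DATABASE_URL", "REDIS_URL",
    "S3_ENDPOINT_URL", "S3_ACCESS_KEY_ID", "S3_SECRET_ACCESS_KEY", "S3_BUCKET",
    "SCAIKEY_BASE_URL", "SCAIKEY_JWKS_URL", "SCAIKEY_AUDIENCE",
    "SCAIKEY_CLIENT_ID", "SCAIKEY_CLIENT_SECRET", "SCAIKEY_WEBHOOK_SECRET",
    "JWT_SIGNING_SECRET",
]


def check_env(env):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    if "CHANGE_ME" in env.get("DATABASE_URL", ""):
        problems.append("DATABASE_URL still contains the CHANGE_ME placeholder")
    auth_off = env.get("SCAIGRID_AUTH_DISABLED", "").lower() == "true"
    if auth_off and env.get("APP_ENV") == "production":
        problems.append("SCAIGRID_AUTH_DISABLED=true with APP_ENV=production")
    return problems


def new_signing_secret():
    """Return 64 random hex characters (32 bytes of entropy)."""
    return secrets.token_hex(32)
```

Running `check_env(dict(os.environ))` inside the container, or against a parsed `.env` file in CI, gives a list that can fail the pipeline when it is non-empty.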

## Nginx in front

Terminate TLS and route. Core locations:

```nginx
upstream scaigrid_backend {
    server scaigrid-http:8000;  # compose service name from the example above; adjust to your deployment
    keepalive 32;
}

server {
    listen 443 ssl http2;
    server_name scaigrid.scailabs.ai;

    ssl_certificate /etc/letsencrypt/live/scaigrid.scailabs.ai/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/scaigrid.scailabs.ai/privkey.pem;

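    location /v1/ {
        proxy_pass http://scaigrid_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";  # required for upstream keepalive to take effect
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_buffering off;
        proxy_read_timeout 660s;  # must exceed DISPATCH_STREAM_TIMEOUT_S (default 600)
    }

    location /oai/ {
        proxy_pass http://scaigrid_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";  # required for upstream keepalive to take effect
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_buffering off;
        proxy_read_timeout 660s;  # must exceed DISPATCH_STREAM_TIMEOUT_S (default 600)
    }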

    location /health {
        proxy_pass http://scaigrid_backend;
    }
}
```

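`proxy_read_timeout 660s` is critical. The timeout applies to the gap between two successive reads from the upstream, not to the whole response, and nginx defaults it to 60 seconds. An SSE stream that stays quiet for longer than that is cut off, which shows up as unexplained client hangs. The value must be greater than ScaiGrid's `DISPATCH_STREAM_TIMEOUT_S` (default 600).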
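The timeout rule is easy to break during later config edits, so it can be worth checking mechanically. Below is a rough sketch, assuming the config is available as text and each directive uses a single value such as `660s` or `11m` (combined values like `1h 30m` are not handled). `read_timeouts` and `timeouts_ok` are hypothetical names.

```python
import re

# Multipliers for the nginx time suffixes this sketch understands.
_UNITS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}

_DIRECTIVE = re.compile(
    r"^\s*proxy_read_timeout\s+(\d+)(ms|s|m|h|d)?\s*;", re.MULTILINE
)


def read_timeouts(nginx_conf):
    """Return every proxy_read_timeout found in the config text, in seconds."""
    # A value without a suffix is interpreted as seconds, as nginx does.
    return [int(n) * _UNITS[unit or "s"] for n, unit in _DIRECTIVE.findall(nginx_conf)]


def timeouts_ok(nginx_conf, dispatch_stream_timeout_s=600):
    """True if at least one timeout is set and all exceed the stream timeout."""
    values = read_timeouts(nginx_conf)
    return bool(values) and all(v > dispatch_stream_timeout_s for v in values)
```

The check only sees explicit directives. A streaming location that omits `proxy_read_timeout` silently inherits the 60-second default, so it is worth confirming that every streaming location sets one.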

## HTTPS / TLS

All production traffic should be HTTPS. Use Let's Encrypt via certbot, or your corporate PKI. ScaiGrid itself speaks HTTP — TLS termination is nginx's job.

## Scaling

- **HTTP replicas.** Horizontally scale `scaigrid-http` behind a load balancer. Stateless; no sticky sessions needed.
- **Worker replicas.** ARQ workers coordinate via Redis; scale out for throughput.
- **gRPC replicas.** Scale in proportion to the HTTP replicas if your internal integrations are gRPC-heavy.
- **Database.** MariaDB Galera cluster for HA. Size it for query load; most workloads are Redis-bound, not DB-bound.
- **Redis.** A single well-provisioned Redis handles surprisingly high throughput. Move to Cluster mode only if you saturate a single instance (typically > 100k ops/sec).

## Upgrading

1. Build/pull the new image.
2. Run the `scaigrid-migrate` container with the new image to apply any new migrations.
3. Roll the `http`, `grpc`, and `worker` replicas over to the new image.
4. Watch logs for module initialization errors.

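ScaiGrid is designed so that old HTTP replicas can keep serving traffic while new replicas roll out. This works because migrations are designed to be backwards-compatible within a minor version, so running them first (step 2) does not break replicas still on the old image.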
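For a Docker Compose deployment, steps 1 to 3 can be scripted roughly as follows. This is a sketch, assuming the service names from the compose example on this page and an image tag that has already been updated in the compose file. `upgrade_commands` and `run_upgrade` are hypothetical names, and step 4 (watching logs) stays manual.

```python
import subprocess

# Service names follow the docker-compose example on this page.
LONG_RUNNING = ["scaigrid-http", "scaigrid-grpc", "scaigrid-worker"]


def upgrade_commands():
    """Return the commands for steps 1-3, in order."""
    cmds = [
        ["docker", "compose", "pull"],                             # step 1
        ["docker", "compose", "run", "--rm", "scaigrid-migrate"],  # step 2
    ]
    # Step 3: recreate one service at a time without touching dependencies.
    cmds += [
        ["docker", "compose", "up", "-d", "--no-deps", svc]
        for svc in LONG_RUNNING
    ]
    return cmds


def run_upgrade(runner=subprocess.run):
    """Run the upgrade; check=True aborts on the first failing command."""
    for cmd in upgrade_commands():
        runner(cmd, check=True)
```

Because a failed migration raises before any service is recreated, the old replicas keep running on the old image. On a single Compose host, recreating a service still causes a brief gap for that service; a true rolling replace needs multiple replicas behind a load balancer or an orchestrator.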

## Module migration

Each module has its own Alembic migration stream (table `alembic_version_mod_{module_id}`). The `migrate` mode runs core migrations first, then each enabled module's migrations. Failures in one module don't block other modules from migrating.
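The ordering and failure isolation described above can be pictured with the sketch below. It is an illustration only, not ScaiGrid's migration runner: `migrate_all`, `version_table`, and the two callbacks are hypothetical, and letting a core failure abort the whole run is an assumption the text above does not confirm.

```python
def version_table(module_id):
    """Alembic version table name for a module, per the naming above."""
    return f"alembic_version_mod_{module_id}"


def migrate_all(enabled_modules, migrate_core, migrate_module):
    """Run core first, then each module; return {module_id: error} for failures."""
    # Assumption: a core failure propagates and stops the run.
    migrate_core()
    failures = {}
    for module_id in enabled_modules:
        try:
            migrate_module(module_id, version_table(module_id))
        except Exception as exc:
            # One module failing does not stop the remaining modules.
            failures[module_id] = exc
    return failures
```

A non-empty return value is the signal to hold the rollout for the affected modules, even though the others migrated cleanly.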

## Related

- [Health and Monitoring](./02-health-and-monitoring.md)
- [Troubleshooting](./03-troubleshooting.md)
- [Architecture](../01-introduction/03-architecture.md)
