Deployment

This page covers running ScaiDrive in production. The deployment is container-based and relies on external dependencies for persistence.

What you need

Compute:

Three container roles off the same image: api, worker, migrate. Run migrate as a one-shot before startup; run api and worker long-lived.
2 CPU + 4 GB RAM per API replica as a starting point. Scale horizontally behind a load balancer. Workers are lighter — 1 CPU + 2 GB each.
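
As a rough worked example of the starting-point figures above, the sketch below totals CPU and memory for a given number of replicas. The names `PER_REPLICA` and `total_resources` are illustrative and not part of ScaiDrive; the totals cover only the long-lived api and worker containers, not the external dependencies.

```python
# Starting-point sizing quoted in the text: (CPU cores, GB RAM) per replica.
PER_REPLICA = {
    "api": (2, 4),
    "worker": (1, 2),
}


def total_resources(api_replicas: int, worker_replicas: int) -> tuple[int, int]:
    """Return (total CPU cores, total GB RAM) for the long-lived containers."""
    counts = {"api": api_replicas, "worker": worker_replicas}
    cpu = sum(PER_REPLICA[role][0] * n for role, n in counts.items())
    ram = sum(PER_REPLICA[role][1] * n for role, n in counts.items())
    return cpu, ram
```

For instance, 3 API replicas and 2 workers come to 3*2 + 2*1 = 8 cores and 3*4 + 2*2 = 16 GB, so `total_resources(3, 2)` returns `(8, 16)`.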

External dependencies:

Dep	Role	Notes
MariaDB 10.11+ / MySQL 8.0+	Primary datastore	Galera cluster for HA
Redis 7+	Cache, queue, WebSocket pub/sub	Single instance OK for small deployments; Sentinel or Cluster for HA
S3-compatible store	File chunks, blobs	Garage, MinIO, AWS S3, GCS via S3 compat
ScaiKey	Identity provider	Separate deployment
ScaiSend	Transactional email	For invitations and notifications
Weaviate	Vector store (optional)	Required for semantic search

Docker Compose (dev / small prod)

The repository ships a docker-compose.yml suitable for small deployments. Services:

yaml
services:
  scaidrive-api:
    image: scailabs/scaidrive:latest
    command: ["api"]
    # Anchor the environment so the worker can reuse it verbatim.
    # The remaining required variables (S3 keys, JWT issuer, secret keys)
    # are listed under Configuration.
    environment: &scaidrive-env
      SCAIDRIVE_DATABASE_URL: "mysql+asyncmy://scaidrive:pw@mariadb:3306/scaidrive"
      SCAIDRIVE_REDIS_URL: "redis://redis:6379/0"
      SCAIDRIVE_S3_ENDPOINT: "http://garage:3900"
      SCAIDRIVE_S3_BUCKET: "scaidrive"
      SCAIDRIVE_SCAIKEY_URL: "https://scaikey.example.com"
      SCAIDRIVE_SCAIKEY_CLIENT_ID: "scaidrive"
      SCAIDRIVE_WEAVIATE_URL: "http://weaviate:8080"
    ports: ["8000:8000"]
    depends_on: [mariadb, redis, garage, weaviate]

  scaidrive-worker:
    image: scailabs/scaidrive:latest
    command: ["worker"]
    # Same environment as the api service, via the YAML alias.
    environment: *scaidrive-env
    depends_on: [mariadb, redis, garage]

  # ... mariadb, redis, garage, weaviate

Run migrations once on deploy:

bash
docker compose run --rm scaidrive-api migrate

Kubernetes

The k8s/ directory in the repo has Helm charts and raw manifests. Key patterns:

API Deployment — replicas: 3 behind a Service + Ingress. Stateless; any replica can serve any request. Probes on /api/v1/health and /api/v1/ready.
Worker Deployment — replicas: 2. Stateless; workers pull from Redis-backed ARQ queues.
Migration Job — runs migrate once per deploy, blocks until success.
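PodDisruptionBudget — minAvailable 1 on API and worker, so node drains and other voluntary evictions never remove the last running pod. Availability during rolling updates is governed by the Deployment's own update strategy.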
Redis — bitnami/redis or Redis Cluster for HA.
MariaDB — bitnami/mariadb-galera or an external managed DB.
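
The probe endpoints can also be polled from a deploy script, for example to wait until a rollout is serving traffic. The following is a minimal sketch that assumes both endpoints answer HTTP 200 when healthy, which the text above does not state explicitly. `http_status` and `wait_until_ready` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

# Probe paths named in the text.
PROBE_PATHS = ("/api/v1/health", "/api/v1/ready")


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status for url, or 0 if the request fails outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, timeout, ...


def wait_until_ready(
    base_url: str,
    attempts: int = 30,
    delay: float = 1.0,
    fetch: Callable[[str], int] = http_status,
) -> bool:
    """Poll both probe paths until each returns 200 (assumed healthy status)."""
    for attempt in range(attempts):
        if all(fetch(base_url + path) == 200 for path in PROBE_PATHS):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A deploy script might call `wait_until_ready("http://scaidrive-api:8000")` after the migration Job completes and fail the rollout if it returns `False`. The `fetch` parameter exists only so the polling logic can be exercised without a live server.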

HorizontalPodAutoscaler

Scale the API on request rate or CPU. The manifest below scales on CPU utilization:

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: scaidrive-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1   # needed so the HPA can resolve the Deployment kind
    kind: Deployment
    name: scaidrive-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target: {type: Utilization, averageUtilization: 70}

Workers scale on queue depth. The worker image exposes a Prometheus metric for it; use a KEDA scaler to drive the replica count from that metric.
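
To reason about what queue-depth scaling will do before wiring up KEDA, it helps to reproduce the arithmetic by hand. For an average-value target, the autoscaler computes roughly the metric divided by the per-replica target, rounded up and clamped to the configured bounds. The sketch below assumes a target of N queued jobs per worker; the function name and the default maximum of 20 are illustrative, while the minimum of 2 mirrors the worker replica count mentioned earlier.

```python
import math


def desired_workers(
    queue_depth: int,
    jobs_per_worker: int,
    min_replicas: int = 2,
    max_replicas: int = 20,  # hypothetical upper bound
) -> int:
    """Approximate the replica count an average-value scaler would request."""
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")
    wanted = math.ceil(queue_depth / jobs_per_worker)
    # Clamp to the configured bounds, as the autoscaler does.
    return max(min_replicas, min(max_replicas, wanted))
```

With a target of 50 jobs per worker, a backlog of 450 jobs asks for 9 workers, an empty queue stays at the minimum of 2, and a backlog of 5000 is capped at 20.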

Configuration

All runtime configuration is environment-based. Required variables:

Variable	Required	Notes
`SCAIDRIVE_DATABASE_URL`	Yes	Full async URL (`mysql+asyncmy://...`)
`SCAIDRIVE_REDIS_URL`	Yes	`redis://...` or `rediss://...`
`SCAIDRIVE_S3_ENDPOINT`	Yes	S3 / Garage / MinIO URL
`SCAIDRIVE_S3_BUCKET`	Yes
`SCAIDRIVE_S3_ACCESS_KEY`	Yes
`SCAIDRIVE_S3_SECRET_KEY`	Yes
`SCAIDRIVE_SCAIKEY_URL`	Yes	Base URL of ScaiKey
`SCAIDRIVE_SCAIKEY_CLIENT_ID`	Yes	Registered OAuth client
`SCAIDRIVE_JWT_ISSUER`	Yes	Must match `iss` in ScaiKey tokens
`SCAIDRIVE_WEAVIATE_URL`	No	Required for semantic search
`SCAIDRIVE_SCAISEND_URL`	No	Required for email notifications
`SCAIDRIVE_SECRET_KEY`	Yes	32+ bytes; used for internal encryption
`SCAIDRIVE_ENCRYPTION_KEY`	Yes	32 bytes; encrypts stored connector credentials
`SCAIDRIVE_CORS_ORIGINS`	No	Comma-separated, default `*` in dev, unset in prod
`SCAIDRIVE_LOG_LEVEL`	No	`DEBUG`, `INFO`, `WARNING`, `ERROR`

Full list: server/scaidrive/config/settings.py.
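
A preflight check can catch missing variables before a container starts. The sketch below is not part of ScaiDrive: `check_config` is a hypothetical helper that tests only what the table states, namely which variables are required and which URL schemes are expected. It does not check key lengths, because the table does not say how the keys are encoded.

```python
from typing import Mapping

# Variables marked "Yes" in the table above.
REQUIRED = (
    "SCAIDRIVE_DATABASE_URL",
    "SCAIDRIVE_REDIS_URL",
    "SCAIDRIVE_S3_ENDPOINT",
    "SCAIDRIVE_S3_BUCKET",
    "SCAIDRIVE_S3_ACCESS_KEY",
    "SCAIDRIVE_S3_SECRET_KEY",
    "SCAIDRIVE_SCAIKEY_URL",
    "SCAIDRIVE_SCAIKEY_CLIENT_ID",
    "SCAIDRIVE_JWT_ISSUER",
    "SCAIDRIVE_SECRET_KEY",
    "SCAIDRIVE_ENCRYPTION_KEY",
)


def check_config(env: Mapping[str, str]) -> list[str]:
    """Return a list of human-readable problems; empty means the check passed."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]

    db_url = env.get("SCAIDRIVE_DATABASE_URL", "")
    if db_url and not db_url.startswith("mysql+asyncmy://"):
        problems.append("SCAIDRIVE_DATABASE_URL should use mysql+asyncmy://")

    redis_url = env.get("SCAIDRIVE_REDIS_URL", "")
    if redis_url and not redis_url.startswith(("redis://", "rediss://")):
        problems.append("SCAIDRIVE_REDIS_URL should use redis:// or rediss://")

    return problems
```

Calling `check_config(os.environ)` in an entrypoint wrapper and exiting non-zero when the list is non-empty gives a clearer failure than a stack trace at first use.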

TLS

ScaiDrive does not terminate TLS itself. Run behind a reverse proxy (nginx, Traefik, cloud load balancer) that terminates and forwards to http://scaidrive-api:8000.

WebSocket upgrades need the standard headers:

nginx
location /api/v1/realtime/ws {
  proxy_pass http://scaidrive-api:8000;
  proxy_http_version 1.1;                  # WebSocket upgrade requires HTTP/1.1
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection "upgrade";
  proxy_read_timeout 3600s;                # keep idle sockets open for 1 hour
}

location /api/v1/sync/ws/ {
  proxy_pass http://scaidrive-api:8000;
  proxy_http_version 1.1;
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection "upgrade";
  proxy_read_timeout 3600s;
}

Set proxy_read_timeout to at least 1 hour. With nginx's default of 60s, any WebSocket that receives no data from the API for a minute is closed.

For large uploads, raise client_max_body_size:

nginx
client_max_body_size 5g;

File system layout on S3

ScaiDrive stores:

chunks/{tenant_id}/{hash[0:2]}/{hash[2:4]}/{hash} — chunk blobs
uploads/{session_id}/chunks/{index} — pending resumable-upload chunks
avatars/{tenant_id}/{user_id} — user avatars

Content is never stored under user-identifying paths. A chunk used by a file owned by Alice is keyed only by hash — Alice's name does not appear in the path.
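
The key layout above can be sketched as a few small functions. This is an illustration only: the text does not name the hash algorithm, so SHA-256 in lowercase hex is an assumption here, and the function names are hypothetical.

```python
import hashlib


def chunk_key(tenant_id: str, data: bytes) -> str:
    """Build the object key for a chunk from its content hash."""
    # ASSUMPTION: SHA-256 hex digest; the real algorithm is not documented here.
    digest = hashlib.sha256(data).hexdigest()
    # Two levels of two-character fan-out, then the full hash.
    return f"chunks/{tenant_id}/{digest[0:2]}/{digest[2:4]}/{digest}"


def pending_chunk_key(session_id: str, index: int) -> str:
    """Key for a chunk of a resumable upload that is still in progress."""
    return f"uploads/{session_id}/chunks/{index}"


def avatar_key(tenant_id: str, user_id: str) -> str:
    """Key for a user's avatar image."""
    return f"avatars/{tenant_id}/{user_id}"
```

Because the chunk key depends only on the tenant and the content hash, two files in the same tenant that contain an identical chunk resolve to the same object, and nothing about the owner appears in the path.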

Lifecycle policies:

uploads/ prefix — set an S3 lifecycle rule to delete objects after 48 hours. ScaiDrive's own expiry cleanup is belt-and-braces, not strictly needed if the bucket has a lifecycle.
chunks/ prefix — never expire; ScaiDrive GCs these when reference counts reach zero.
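
A lifecycle rule matching the policy above could look like the following sketch. S3 lifecycle expiration is expressed in whole days, so 48 hours becomes `Days: 2`. The rule ID and function name are made up for illustration, and lifecycle support varies between S3-compatible stores, so check what your backend accepts.

```python
def uploads_lifecycle_config(days: int = 2) -> dict:
    """Build a lifecycle configuration that expires pending upload chunks."""
    if days < 1:
        raise ValueError("S3 lifecycle expiration needs at least 1 day")
    return {
        "Rules": [
            {
                "ID": "expire-pending-uploads",  # hypothetical rule name
                "Filter": {"Prefix": "uploads/"},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
            # Deliberately no rule for chunks/: those must never expire.
        ]
    }


# With boto3 installed, the configuration can be applied like this:
#   import boto3
#   s3 = boto3.client("s3", endpoint_url="http://garage:3900")
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="scaidrive",
#       LifecycleConfiguration=uploads_lifecycle_config(),
#   )
```

On AWS S3, expiration takes effect at the next midnight UTC after the object reaches the configured age, so an object can live somewhat longer than 48 hours.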

Scaling

API layer

Stateless — scale linearly. Rate-limit counters are in Redis, shared across replicas. Sticky sessions are not required.
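
To see why shared counters make sticky sessions unnecessary, consider a generic fixed-window rate limiter. The sketch below is not ScaiDrive's implementation: the algorithm, the key format, and the class name are assumptions, and the in-memory store merely stands in for Redis `INCR` plus `EXPIRE`.

```python
import time
from typing import Optional


class SharedCounterStore:
    """In-memory stand-in for Redis INCR + EXPIRE, shared by every replica."""

    def __init__(self) -> None:
        # key -> (count, expires_at)
        self._values: dict[str, tuple[int, float]] = {}

    def incr(self, key: str, ttl: float, now: float) -> int:
        count, expires_at = self._values.get(key, (0, now + ttl))
        if now >= expires_at:  # the key has expired: start a fresh counter
            count, expires_at = 0, now + ttl
        count += 1
        self._values[key] = (count, expires_at)
        return count


def allow(
    store: SharedCounterStore,
    client_id: str,
    limit: int,
    window: float,
    now: Optional[float] = None,
) -> bool:
    """Return True if this request fits within the client's current window."""
    now = time.time() if now is None else now
    bucket = int(now // window)  # all replicas derive the same bucket number
    key = f"rl:{client_id}:{bucket}"  # hypothetical key format
    return store.incr(key, ttl=window, now=now) <= limit
```

Whichever replica handles a request increments the same key, so the limit holds no matter how the load balancer spreads a client's requests.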

Worker layer

Workers consume ARQ queues from Redis. Scaling up workers drains queues faster. Separate queues for:

high — vectorization, DLP inline
medium — connector sync
low — quota recompute, retention sweeps

A small deployment can run one worker per queue; a larger deployment runs N workers with queue affinity.
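
One way to plan queue affinity for N workers is to give every queue one worker and split the rest by weight. This is a planning sketch only: the weights and the function name are hypothetical, and how a worker is actually bound to a queue depends on the worker's own configuration.

```python
# Queue names from the list above, highest priority first.
QUEUES = ("high", "medium", "low")


def assign_workers(total: int, weights: dict[str, int]) -> dict[str, int]:
    """Split `total` workers across queues, at least one worker per queue."""
    if total < len(QUEUES):
        raise ValueError("need at least one worker per queue")
    assignment = {queue: 1 for queue in QUEUES}
    spare = total - len(QUEUES)
    weight_sum = sum(weights[queue] for queue in QUEUES)
    # Proportional share of the spare workers, rounded down.
    for queue in QUEUES:
        assignment[queue] += spare * weights[queue] // weight_sum
    # Rounding leaves fewer workers than there are queues;
    # hand them to the highest-priority queues first.
    leftover = total - sum(assignment.values())
    for queue in QUEUES[:leftover]:
        assignment[queue] += 1
    return assignment
```

With weights of 3, 2, and 1, nine workers split into 4 for high, 3 for medium, and 2 for low, while three workers fall back to one per queue.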

Database

The MariaDB schema is normalized with per-tenant composite indexes. For large tenants (>10M files), consider read-replicas; the ORM has a use_read_replica hint for list-heavy endpoints.

Storage

S3-compatible stores scale independently. For on-prem, Garage (tiered, distributed) is the most commonly deployed choice alongside ScaiDrive.

Backups

MariaDB: Standard mariabackup or managed-DB point-in-time recovery. ScaiDrive expects a consistent snapshot — a hot backup tool is required for zero downtime.

S3: Versioning on the bucket plus replication to a second region. Chunks are immutable once written; no special consideration needed.

Redis: Ephemeral. Don't bother backing up. On Redis loss, rate-limit counters reset and queue state is lost — workers retry jobs.