Deploy ScaiWave
ScaiWave is built to deploy as a small set of services backed by standard infrastructure. The architecture is intentionally boring: nothing exotic, everything horizontally scalable.
What you run
| Process | What it does | Scale |
|---|---|---|
| scaiwave-api | FastAPI / uvicorn | Horizontal; behind a load balancer. |
| scaiwave-worker | ARQ worker for background jobs | Horizontal; pick worker count by CPU. |
| scaiwave-scheduler | ARQ cron scheduler | Exactly one instance. Singleton. |
The client (SolidJS + Tauri) ships separately — desktop installers, mobile app stores, or a static web bundle behind a CDN.
What you depend on
| Dependency | Purpose | Notes |
|---|---|---|
| MariaDB 11+ | Primary store | Or any MySQL-compatible engine; tested against MariaDB. |
| Redis 7+ | Cache, ARQ queue, presence | Persistent (AOF or RDB). |
| NATS JetStream 2.10+ | Event fan-out | Persistent stream for room events. |
| MinIO / S3 | Media storage | Bucket per tenant or one shared (use tenant_id prefix). |
| Weaviate 1.27+ | Search index | No server-side vectorizer needed (we vectorize via ScaiGrid). |
| ClamAV 1.4+ | Media virus-scan | Optional; recommended in prod. |
| LiveKit | Calls | Self-hosted or LiveKit Cloud. |
| ScaiKey | OIDC identity | Per-tenant config. |
| ScaiGrid | Inference | For models, embeddings, transcription. |
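
The shared-bucket option in the MinIO / S3 row relies on a `tenant_id` prefix to keep tenants apart. A minimal sketch of that idea follows. The key layout `<tenant_id>/<media_id>` and both function names are assumptions for illustration, not ScaiWave's actual scheme.

```python
import re

# Assumption: tenant and media identifiers are simple slugs.
# ScaiWave's real identifier format and key layout may differ.
_SAFE_SEGMENT = re.compile(r"[A-Za-z0-9_-]+")


def media_object_key(tenant_id: str, media_id: str) -> str:
    """Build an object key that keeps each tenant under its own prefix."""
    for name, value in (("tenant_id", tenant_id), ("media_id", media_id)):
        # Reject slashes and dots so one tenant cannot escape its prefix.
        if not _SAFE_SEGMENT.fullmatch(value):
            raise ValueError(f"unsafe {name}: {value!r}")
    return f"{tenant_id}/{media_id}"


def belongs_to_tenant(key: str, tenant_id: str) -> bool:
    """Check that a key sits under the given tenant's prefix."""
    # The trailing slash matters: "acme" must not match "acme2/...".
    return key.startswith(f"{tenant_id}/")
```

Including the trailing slash in the prefix check is what stops a tenant named `acme` from matching objects that belong to `acme2`.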
Environment
Every setting is configured through a `SCAIWAVE_<NAME>` environment variable.
Full list in Reference: Configuration.
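
As an illustration of the `SCAIWAVE_<NAME>` convention, the sketch below collects prefixed variables and fails fast when a required one is missing. The helper `load_settings` and the name `EXAMPLE_URL` are hypothetical; the real variable names are in Reference: Configuration.

```python
import os
from typing import Dict, Iterable, Mapping

PREFIX = "SCAIWAVE_"


def load_settings(
    environ: Mapping[str, str] = os.environ,
    required: Iterable[str] = (),
) -> Dict[str, str]:
    """Collect SCAIWAVE_* variables, keyed by the name without the prefix."""
    settings = {
        key[len(PREFIX):]: value
        for key, value in environ.items()
        if key.startswith(PREFIX)
    }
    # Treat empty strings as missing so a blank export is caught early.
    missing = sorted(name for name in required if not settings.get(name))
    if missing:
        names = ", ".join(PREFIX + name for name in missing)
        raise RuntimeError(f"missing required settings: {names}")
    return settings


if __name__ == "__main__":
    # EXAMPLE_URL is a placeholder, not a real ScaiWave setting.
    demo_env = {"SCAIWAVE_EXAMPLE_URL": "value", "PATH": "/usr/bin"}
    print(load_settings(demo_env, required=["EXAMPLE_URL"]))
```

Checking the required set at start-up surfaces a misconfigured pod before it takes traffic.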
Topology
The API pods serve both HTTP and WebSocket traffic. The load balancer should provide:
- TLS termination.
- Sticky sessions for WS (or accept that occasional disconnects drive reconnects via `/v1/sync`).
- Long-poll friendly timeouts (the sync endpoint blocks up to 30s by default).
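
One way to read "long-poll friendly" is that the load balancer's idle timeout should exceed the sync endpoint's blocking window, so the proxy does not cut a request the API is still holding open. A small sketch of that check, assuming a 10-second safety margin (the margin and the function names are illustrative, not part of ScaiWave):

```python
SYNC_BLOCK_SECONDS = 30.0  # default blocking window of the sync endpoint
MARGIN_SECONDS = 10.0      # assumption: headroom for network and processing delay


def min_idle_timeout(
    sync_block_s: float = SYNC_BLOCK_SECONDS,
    margin_s: float = MARGIN_SECONDS,
) -> float:
    """Smallest idle timeout that lets a full long-poll complete."""
    return sync_block_s + margin_s


def idle_timeout_ok(
    idle_timeout_s: float,
    sync_block_s: float = SYNC_BLOCK_SECONDS,
    margin_s: float = MARGIN_SECONDS,
) -> bool:
    """True when the load balancer will not cut a long-poll short."""
    return idle_timeout_s >= min_idle_timeout(sync_block_s, margin_s)
```

With the defaults, a 60-second idle timeout passes and a 30-second one does not.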
Migration on deploy
Migrations are forward-only and additive. Roll out the new API pods after migrations succeed; old pods continue to work against the new schema until they restart.
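
The ordering described here can be sketched as a small gate in a deploy script. Both callables are placeholders: what actually runs the migrations and what rolls out pods depends on your tooling, and neither is specified by this sketch.

```python
from typing import Callable


def deploy(
    run_migrations: Callable[[], bool],
    roll_out_api: Callable[[], None],
) -> bool:
    """Run migrations first; roll out new API pods only if they succeed.

    Both arguments are hypothetical hooks supplied by the caller.
    """
    if not run_migrations():
        # Old pods keep serving against the unchanged schema.
        return False
    # Migrations are additive, so old pods stay compatible during the rollout.
    roll_out_api()
    return True
```

Because migrations are forward-only, a failed run stops the deploy rather than attempting a schema rollback.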
Health and observability
- `GET /health`: readiness probe. Returns 200 when DB, Redis, and NATS are all reachable.
- `GET /metrics`: Prometheus scrape endpoint. Counters and histograms for request volume, latency, error rate, AI token usage, and ARQ queue depth.
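
The readiness rule for `/health` amounts to "every dependency check passes". A minimal sketch of that aggregation follows; it is not ScaiWave's handler. The check callables are placeholders, and returning 503 on failure is an assumption, since the text only specifies the 200 case.

```python
from typing import Callable, Dict, Tuple


def readiness(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, Dict[str, bool]]:
    """Run each dependency check and derive an HTTP status from the results."""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises counts as an unreachable dependency.
            results[name] = False
    # Assumption: 503 signals "not ready"; the source only documents 200.
    status = 200 if all(results.values()) else 503
    return status, results
```

Returning the per-dependency results alongside the status makes it easier to see which backend is failing the probe.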
Logs are structured and emitted via structlog. Ship them to your usual sink (ELK, Loki, Cloud Logging, …).
Capacity rules of thumb
| Component | Sizing | Guidance |
|---|---|---|
| API pod | 1 vCPU, 1 GB | ~200 simultaneous WS, 300 req/s. |
| Worker pod | 1 vCPU, 1 GB | ~10 concurrent index jobs. |
| MariaDB | small instance | Handles thousands of rooms easily; scale read replicas if needed. |
| Redis | 2 GB | Plenty for a tenant with low thousands of active users. |
| Weaviate | 4 vCPU, 8 GB, SSD | For a tenant with 100K messages indexed. |
| MinIO | spinning disk | Match expected media volume; bandwidth often matters more than IOPS. |
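
The API pod row translates into a quick sizing calculation: take whichever of the two limits is tighter and round up. The sketch below uses the table's figures; the `headroom` parameter is an added assumption for burst capacity, not a ScaiWave recommendation.

```python
import math

WS_PER_POD = 200   # ~200 simultaneous WebSocket connections per API pod
RPS_PER_POD = 300  # ~300 requests per second per API pod


def api_pods_needed(concurrent_ws: int, peak_rps: float, headroom: float = 0.0) -> int:
    """Estimate API pod count from whichever limit is reached first."""
    by_ws = concurrent_ws / WS_PER_POD
    by_rps = peak_rps / RPS_PER_POD
    # headroom=0.25 adds 25% spare capacity on top of the binding limit.
    return max(1, math.ceil(max(by_ws, by_rps) * (1 + headroom)))


if __name__ == "__main__":
    # 1500 sockets need 7.5 pods, 600 req/s need 2, so sockets are the binding limit.
    print(api_pods_needed(1500, 600))
```

For 1,500 concurrent sockets and 600 req/s this gives 8 pods, or 10 with 25% headroom.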
Hardening checklist
- HTTPS everywhere; HSTS on the load balancer.
- WSS, not WS.
- Bearer tokens in `Authorization`, not query strings (the WS endpoint is the only exception; document this for proxy logs).
- CSP on the served HTML (no inline scripts allowed by default).
- Trusted-proxy IP allowlist on the API.
- Per-tenant rate limits configured (`tenant.features.rate_limits`).
- Backups: DB nightly, Weaviate weekly, MinIO replicated.
- ClamAV reachable from the worker pods.
- LiveKit egress / recording bucket configured.
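
To illustrate why the trusted-proxy allowlist matters, the sketch below resolves a client address from `X-Forwarded-For` only when the direct peer is an allowlisted proxy. It shows the general technique using Python's standard `ipaddress` module. The function name and the CIDR ranges in the usage example are assumptions, not ScaiWave's implementation.

```python
import ipaddress
from typing import Iterable, Optional


def client_ip(peer_ip: str, forwarded_for: Optional[str], trusted_proxies: Iterable[str]) -> str:
    """Return the client address, honouring X-Forwarded-For only from trusted peers."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def is_trusted(ip: str) -> bool:
        addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
        return any(addr in net for net in networks)

    # An untrusted peer can forge the header, so ignore it entirely.
    if not forwarded_for or not is_trusted(peer_ip):
        return peer_ip

    hops = [hop.strip() for hop in forwarded_for.split(",") if hop.strip()]
    # Walk right to left: the first hop that is not one of our proxies is the client.
    for hop in reversed(hops):
        try:
            trusted = is_trusted(hop)
        except ValueError:
            return peer_ip  # malformed entry: fall back to the direct peer
        if not trusted:
            return hop
    # Every hop was a trusted proxy; use the leftmost one, if any.
    return hops[0] if hops else peer_ip
```

Walking the header from the right means a client-supplied leftmost entry cannot override the address your own proxy recorded. For example, `client_ip("10.0.0.5", "1.2.3.4, 203.0.113.7", ["10.0.0.0/8"])` resolves to `203.0.113.7`.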
Operational tooling
Operators manage a deployment through CLI commands.
See CLI reference for the full surface.
Where to go next
- Reference: Configuration — every env var.
- Reference: CLI.
- Enable federation with a peer.