Philosophy

ScaiGrid is shaped by a few opinions. Understanding them makes the API feel coherent instead of arbitrary.

The native API is the contract

ScaiGrid exposes two HTTP surfaces:

/v1/ — the native ScaiGrid API. Rich, opinionated, designed for the ScaiLabs ecosystem.
/oai/v1/ — OpenAI-compatible. Minimal. Exists for drop-in migration.

The native API is the contract we evolve. When a new capability lands — structured tool results, per-model max_output_tokens, reasoning content, richer error taxonomies — it lands on /v1/ first, in a shape that fits the rest of the API. The OpenAI-compat layer stays minimal because its job is to match a spec we don't control.

If you're writing new code, use /v1/. If you're migrating a working OpenAI integration, start on /oai/v1/ and move to /v1/ piece by piece, only when you need a feature the compat layer doesn't expose.
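A minimal sketch of such a piece-by-piece migration. Only the two prefixes, `/v1/` and `/oai/v1/`, come from the text above. The host `scaigrid.example.com`, the call-site names, and the resource paths are hypothetical and used for illustration only.

```python
# Prefixes of the two HTTP surfaces described above.
NATIVE_PREFIX = "/v1/"
OPENAI_COMPAT_PREFIX = "/oai/v1/"

# Hypothetical: names of call sites that have already been moved to the native API.
MIGRATED_CALL_SITES = {"embeddings"}


def url_for(host: str, call_site: str, path: str) -> str:
    """Use the native surface for migrated call sites, the compat surface otherwise."""
    prefix = NATIVE_PREFIX if call_site in MIGRATED_CALL_SITES else OPENAI_COMPAT_PREFIX
    return host.rstrip("/") + prefix + path.lstrip("/")
```

Keeping the choice in one place means a call site moves to the native API by adding its name to the set, without touching the code that issues the request.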

Multi-tenancy is not optional

Every ScaiGrid request runs inside a three-level tenancy: Partner → Tenant → User. You cannot opt out. Even a single-tenant deployment has exactly one partner and one tenant — the hierarchy is just collapsed.

This sounds heavy, but it's the reason you can:

Scope accounting and budgets at the right level without retrofitting.
Give different customers their own admin users without leaking data.
Let a partner reseller aggregate usage across the tenants they manage.
Enforce rate limits per tenant, per user, per API key — each independently.

If you're building something single-tenant, you get one implicit tenant and never think about it. If you're a platform, the tenancy is already there.
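To make the independent limits concrete, here is a small sketch of how a request scope could be checked against per-tenant, per-user, and per-API-key limits. The `Scope` type, the counter layout, and the limit values are assumptions made for illustration, not ScaiGrid's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    """Hypothetical request scope: Partner -> Tenant -> User, plus the API key used."""
    partner: str
    tenant: str
    user: str
    api_key: str


def limit_keys(scope: Scope) -> list:
    """One counter key per level, so each level is tracked on its own."""
    return [
        ("tenant", (scope.partner, scope.tenant)),
        ("user", (scope.partner, scope.tenant, scope.user)),
        ("api_key", (scope.api_key,)),
    ]


def is_allowed(scope: Scope, usage: dict, limits: dict) -> bool:
    """Reject the request if any single level has used up its limit."""
    return all(
        usage.get((level, ident), 0) < limits[level]
        for level, ident in limit_keys(scope)
    )
```

In this sketch a user who has exhausted their own limit is rejected even when the tenant has capacity left, and other users of the same tenant are unaffected.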

Models are addresses, not implementations

A frontend model is a stable name your application asks for — scailabs/poolnoodle-omni, openai/gpt-4, claude-opus. A backend model is the actual upstream — an OpenAI deployment, an Anthropic endpoint, a local Ollama node. A routing policy maps frontend to backend with weights, priorities, and failover rules.

Your code never names a provider. It names a frontend model. An operator can retarget that model to a different provider, a different region, or a failover chain, without your code knowing. This separation is the whole point.
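The mapping from frontend to backend can be sketched as follows. This is an illustration under stated assumptions, not ScaiGrid's routing engine: the `Backend` fields, the rule that a lower priority number wins, and the backend names in the usage note are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    name: str
    priority: int  # assumption: lower number means preferred tier
    weight: int    # relative share of traffic within a tier
    healthy: bool = True


def pick_backend(policy: dict, frontend_model: str, rng=random) -> Backend:
    """Pick a backend for a frontend model: best healthy tier first, weighted within it."""
    candidates = [b for b in policy[frontend_model] if b.healthy]
    if not candidates:
        raise LookupError(f"no healthy backend for {frontend_model}")
    best = min(b.priority for b in candidates)
    tier = [b for b in candidates if b.priority == best]
    # Weighted random choice among backends in the preferred tier.
    return rng.choices(tier, weights=[b.weight for b in tier], k=1)[0]
```

With a policy such as `{"scailabs/poolnoodle-omni": [Backend("primary", 0, 3, healthy=False), Backend("fallback", 1, 1)]}`, the call falls over to `fallback` while the application keeps asking for the same frontend name.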

Modules, not monoliths

ScaiGrid is a platform, not a product. The core gateway handles routing, auth, accounting, and webhooks. Everything domain-specific — chatbots, knowledge bases, agents, queues, sandboxes — ships as a module.

Modules get their own URL namespace under /v1/modules/{module_id}/, their own permissions, their own database tables (prefixed to avoid collision), their own admin UI pages, their own background tasks. They register themselves at startup and are discovered automatically.
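As a rough sketch of what per-module namespacing implies, the snippet below derives a URL prefix and prefixed table names from a module ID and keeps a startup registry. The URL pattern `/v1/modules/{module_id}/` is from the text above. The `Module` class, the `module_id_` table-prefix format, and the `register` function are assumptions for illustration, not the real module framework.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    module_id: str
    permissions: list = field(default_factory=list)

    @property
    def url_prefix(self) -> str:
        # Namespace pattern taken from the text above.
        return f"/v1/modules/{self.module_id}/"

    def table_name(self, table: str) -> str:
        # Assumed prefix format; the text only says tables are prefixed.
        return f"{self.module_id}_{table}"


REGISTRY: dict = {}


def register(module: Module) -> Module:
    """Record a module at startup; duplicate IDs would collide on URLs and tables."""
    if module.module_id in REGISTRY:
        raise ValueError(f"module already registered: {module.module_id}")
    REGISTRY[module.module_id] = module
    return module
```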

You enable the modules you want. You pay only for what you use. New capabilities can ship as new modules without touching the core.

See Modules for the full framework.

Errors are structured

Every ScaiGrid error response carries a machine-readable code, a human-readable message, and a request_id for traceability. Application code branches on code, not on string matching or HTTP status alone.

json
{
  "status": "error",
  "error": {"code": "BACKEND_RATE_LIMITED", "message": "...", "retry_after": 30},
  "meta": {"request_id": "..."}
}

The error code vocabulary is stable — we add new codes; we don't rename existing ones. See Errors for the full taxonomy.
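Branching on the code rather than on the message might look like the sketch below. It parses the response shape shown above. The `ScaiGridError` class and the helper names are hypothetical, and treating `retry_after` as a number of seconds with a one-second default is an assumption.

```python
import json
from typing import Optional


class ScaiGridError(Exception):
    """Hypothetical client-side wrapper around the structured error body."""

    def __init__(self, code, message, request_id, retry_after=None):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id
        self.retry_after = retry_after


def parse_error(body: str) -> Optional[ScaiGridError]:
    """Return a ScaiGridError for an error body, or None for any other response."""
    payload = json.loads(body)
    if payload.get("status") != "error":
        return None
    err = payload["error"]
    return ScaiGridError(
        code=err["code"],
        message=err.get("message", ""),
        request_id=payload.get("meta", {}).get("request_id"),
        retry_after=err.get("retry_after"),
    )


def retry_delay(err: ScaiGridError) -> Optional[float]:
    """Branch on the machine-readable code, never on the message text."""
    if err.code == "BACKEND_RATE_LIMITED":
        # Assumption: retry_after is in seconds; fall back to 1s if absent.
        return float(err.retry_after) if err.retry_after is not None else 1.0
    return None  # codes this sketch does not know are not retried
```

Because new codes may be added over time, the final branch treats unknown codes conservatively instead of failing.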

Observability is a first-class concern

Every request has a request_id. Every response carries it in a header. Every log line is tagged with it. You can trace a slow or failing request end to end, through the gateway, the upstream, the module, and the accounting pipeline, without glue code.

Prometheus metrics are exposed at /metrics. Health checks live at /health. Audit logs record every mutating request. All of this is on by default, not an afterthought.
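On the client side, the same request_id can be carried into your own logs so they line up with the gateway's. The sketch below uses Python's standard `logging` module. The header name `X-Request-Id` is an assumption, since the text does not name the header, so check the API reference for the real one.

```python
import logging
from typing import Optional


def extract_request_id(headers: dict, header_name: str = "X-Request-Id") -> Optional[str]:
    """Case-insensitive header lookup; the default header name is an assumption."""
    wanted = header_name.lower()
    for name, value in headers.items():
        if name.lower() == wanted:
            return value
    return None


def request_logger(base: logging.Logger, request_id: str) -> logging.LoggerAdapter:
    """Wrap a logger so every record carries a request_id attribute."""
    return logging.LoggerAdapter(base, {"request_id": request_id})
```

A formatter such as `logging.Formatter("%(request_id)s %(message)s")` then prints the ID on every line written through the adapter.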

We prefer explicit over implicit

Enable the modules you want. Nothing is auto-enabled per tenant.
Register the providers you want. No default cloud fallback that charges your credit card.
Set the budgets you want. No soft overage billing.
Define the routing you want. No magical load balancer that ignores your preferences.

The admin UI and API surface expose every knob. If the behavior isn't configurable, it's because there's a single correct answer — in which case we enforce it rather than let you break it.

Backwards compatibility is a promise

Once an endpoint is published under /v1/, the request and response shape is stable. We add optional fields. We add new endpoints. We introduce /v2/ when we need to break something. We do not silently change semantics of an endpoint you're relying on.
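Since optional fields may be added to responses over time, client code is safest when it ignores fields it does not recognize. A minimal sketch of such a tolerant reader, with a hypothetical `ErrorInfo` type standing in for whatever response model your client defines:

```python
from dataclasses import dataclass, fields


@dataclass
class ErrorInfo:
    """Hypothetical client-side model holding only the fields this client uses."""
    code: str
    message: str


def from_payload(cls, payload: dict):
    """Build a dataclass from a dict, silently dropping fields the class does not declare."""
    known = {f.name for f in fields(cls)}
    return cls(**{key: value for key, value in payload.items() if key in known})
```

A client written this way keeps working when a response gains a field it has never seen.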

See API Versioning for the full versioning policy.

What's next

Architecture — how the pieces fit together.
Models and Routing — the frontend / backend / provider hierarchy in detail.