Models and Routing

ScaiGrid separates three concerns:

  • Frontend models — the stable public names your application asks for.
  • Backend models — the actual upstream deployments that answer requests.
  • Routing policies — the rules that map one to the other.

Your code only ever names a frontend model. Operators reshape what's behind that name — swapping providers, adjusting weights, configuring failover — without touching your code.
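The indirection can be pictured with a toy lookup table. This is only an illustrative sketch: `routing_table` and `resolve` are hypothetical names, and in ScaiGrid the mapping lives server-side and is managed by operators, not by application code.

```python
# Hypothetical in-memory stand-in for the server-side mapping
# from frontend model slug to backend URI.
routing_table = {"scailabs/poolnoodle-omni": "openai:gpt-4o"}


def resolve(frontend_slug: str) -> str:
    """Return the backend URI currently behind a frontend model."""
    return routing_table[frontend_slug]


# The application names only the frontend model.
before = resolve("scailabs/poolnoodle-omni")

# An operator repoints the frontend model at a different backend.
routing_table["scailabs/poolnoodle-omni"] = "scaiinfer:node-eu-01/llama-3.3"

# The application call is unchanged, but a different backend answers.
after = resolve("scailabs/poolnoodle-omni")
```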

Frontend models

A frontend model is an abstract identity. It has a slug (scailabs/poolnoodle-omni), a display name, a modality (chat, embedding, image, audio), capabilities, and optional defaults (context window, max output tokens, system prompt template, pricing). It may carry metadata (persona assignment, compliance flags).

Think of it as the menu item your application orders from. The kitchen decides how to prepare it.

List frontend models:

bash
curl -H "Authorization: Bearer $TOKEN" https://scaigrid.scailabs.ai/v1/models

Create one:

bash
curl -X POST https://scaigrid.scailabs.ai/v1/models \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "slug": "tenant/acme/summarizer",
    "display_name": "Acme Summarizer",
    "modality": "chat",
    "context_window": 128000,
    "max_output_tokens": 4096
  }'

Slug conventions:

  • openai/gpt-4o — platform-level, OpenAI-provided
  • scailabs/poolnoodle-omni — platform-level, ScaiLabs-provided
  • partner/{partner_slug}/... — partner-scoped
  • tenant/{tenant_slug}/... — tenant-scoped
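The conventions above can be checked mechanically on the client side. A minimal sketch, assuming that every slug outside `partner/` and `tenant/` is platform-level and has the form `<provider>/<name>`; the function `slug_scope` is hypothetical and not part of the ScaiGrid API.

```python
def slug_scope(slug: str) -> tuple[str, str]:
    """Classify a frontend model slug by its prefix.

    Returns (scope, owner). For partner/ and tenant/ slugs, owner is the
    partner or tenant slug. For everything else (assumed platform-level),
    owner is the leading provider segment.
    """
    parts = slug.split("/")
    if parts[0] in ("partner", "tenant"):
        # Scoped slugs need at least <scope>/<owner>/<name>.
        if len(parts) < 3 or not all(parts):
            raise ValueError(f"expected {parts[0]}/<owner>/<name>, got {slug!r}")
        return parts[0], parts[1]
    # Assumption: platform slugs need at least <provider>/<name>.
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"expected <provider>/<name>, got {slug!r}")
    return "platform", parts[0]
```

For example, `slug_scope("tenant/acme/summarizer")` yields `("tenant", "acme")`, while `slug_scope("openai/gpt-4o")` yields `("platform", "openai")`.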

Backend models

A backend model is a concrete, routable endpoint. It has a URI (openai:gpt-4o or scaiinfer:node-eu-01/llama-3.3), a provider type, credentials (stored encrypted), capabilities, and health status.

A backend is specific: it points at one provider, one deployment, one region. When ScaiGrid calls it, it knows exactly where the request goes.
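Both example URIs have the shape `scheme:target`. Assuming that shape holds in general (the URI grammar is not defined here), a client-side helper could split a URI at the first colon. The name `split_backend_uri` is hypothetical.

```python
def split_backend_uri(uri: str) -> tuple[str, str]:
    """Split a backend URI into (scheme, target) at the first colon.

    Assumption: the scheme names the provider and the target is
    provider-specific, so the target is returned without further parsing.
    """
    scheme, sep, target = uri.partition(":")
    if not sep or not scheme or not target:
        raise ValueError(f"expected <scheme>:<target>, got {uri!r}")
    return scheme, target
```

With the examples above, `openai:gpt-4o` splits into `openai` and `gpt-4o`, and `scaiinfer:node-eu-01/llama-3.3` into `scaiinfer` and `node-eu-01/llama-3.3`.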

List backends:

bash
curl -H "Authorization: Bearer $ADMIN_TOKEN" https://scaigrid.scailabs.ai/v1/backends

Register a backend:

bash
curl -X POST https://scaigrid.scailabs.ai/v1/backends \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "display_name": "OpenAI GPT-4o (us-east)",
    "uri": "openai:gpt-4o",
    "provider_type": "openai",
    "connection_config": {"api_key": "sk-..."}
  }'

Supported provider types: openai, anthropic, azure, mistral, qwen, google, scaiinfer (our distributed inference cluster), custom (any OpenAI-protocol-compatible endpoint).

Providers

A provider is a group of backends sharing configuration, such as an OpenAI account, an Azure subscription, or a ScaiInfer cluster. Each provider has a discovery endpoint (POST /v1/providers/{provider_id}/discover) that lists the models available from it, so you can add backends without hand-typing identifiers.

Create a provider:

bash
curl -X POST https://scaigrid.scailabs.ai/v1/providers \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Our OpenAI Account",
    "provider_type": "openai",
    "connection_config": {"api_key": "sk-..."}
  }'

Routing policies

A routing policy decides, for each request to a frontend model, which backend to call. Policies are named and reusable — multiple frontend models can share the same policy.

Simplest case: a frontend model maps 1:1 to a backend. No policy needed — the mapping itself carries weight 100, priority 1.

Common multi-backend patterns:

Weighted round-robin. Two backends, 70/30 split.

json
{
  "mappings": [
    {"backend_id": "backend_a", "weight": 70, "priority": 1},
    {"backend_id": "backend_b", "weight": 30, "priority": 1}
  ]
}
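To see how a 70/30 split plays out request by request, here is a sketch of one weighted round-robin variant. Which variant ScaiGrid uses is not stated here; this one is the "smooth" weighted round-robin known from nginx, chosen because it is deterministic and interleaves backends evenly. The class name is hypothetical.

```python
class SmoothWeightedRoundRobin:
    """Deterministic weighted rotation over a single priority tier."""

    def __init__(self, mappings: list[dict]):
        self._weights = {m["backend_id"]: m["weight"] for m in mappings}
        self._current = dict.fromkeys(self._weights, 0)
        self._total = sum(self._weights.values())

    def next(self) -> str:
        # Every backend earns credit equal to its weight on each request.
        for backend_id, weight in self._weights.items():
            self._current[backend_id] += weight
        # The backend with the most credit serves the request
        # (ties go to the backend listed first) ...
        chosen = max(self._current, key=self._current.get)
        # ... and pays back the total weight.
        self._current[chosen] -= self._total
        return chosen


picker = SmoothWeightedRoundRobin([
    {"backend_id": "backend_a", "weight": 70, "priority": 1},
    {"backend_id": "backend_b", "weight": 30, "priority": 1},
])
picks = [picker.next() for _ in range(10)]
```

Over any window of 10 consecutive requests starting from a fresh picker, `backend_a` is chosen 7 times and `backend_b` 3 times.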

Primary + failover. Use backend A; if it's unhealthy, fall back to backend B.

json
{
  "mappings": [
    {"backend_id": "backend_a", "weight": 100, "priority": 1},
    {"backend_id": "backend_b", "weight": 100, "priority": 2}
  ]
}

Lower priority numbers are tried first; a higher number means a later fallback. Within a priority tier, traffic splits by weight.
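The tier rule can be sketched as a filter that runs before any weight-based choice. This is an illustration under assumptions, not ScaiGrid's implementation: the function name `active_tier` is hypothetical, and the caller is assumed to supply the set of backends currently being skipped.

```python
def active_tier(mappings: list[dict], skipped: set[str]) -> list[dict]:
    """Return the mappings of the lowest-numbered tier that has a usable backend.

    `skipped` holds the backend ids that must not receive traffic right now.
    An empty result means no backend is usable at all.
    """
    usable = [m for m in mappings if m["backend_id"] not in skipped]
    if not usable:
        return []
    first = min(m["priority"] for m in usable)
    return [m for m in usable if m["priority"] == first]


failover = [
    {"backend_id": "backend_a", "weight": 100, "priority": 1},
    {"backend_id": "backend_b", "weight": 100, "priority": 2},
]
normal = active_tier(failover, skipped=set())
degraded = active_tier(failover, skipped={"backend_a"})
```

Here `normal` contains only the `backend_a` mapping, and `degraded` contains only `backend_b`. Weights then decide the split among whatever mappings the tier returns.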

Map a frontend to backends:

bash
curl -X POST https://scaigrid.scailabs.ai/v1/routing/mappings \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "frontend_id": "fm_poolnoodle_omni",
    "backend_id": "be_openai_gpt4o",
    "weight": 100,
    "priority": 1
  }'

Model access control

Frontend models are listed per-tenant, but not every tenant should see every model. Use model access policies to scope visibility:

bash
curl -X POST https://scaigrid.scailabs.ai/v1/model-access \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "scope_type": "tenant",
    "scope_id": "tenant_acme",
    "model_slug": "openai/gpt-4o",
    "enabled": false
  }'

This disables openai/gpt-4o for tenant_acme. Without an explicit entry, models are implicitly allowed.
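The default-allow rule can be sketched as a lookup with a fallback. The sketch assumes at most one entry per (scope, model) pair and does not model precedence between scope types, which is not described here. The function name `is_model_enabled` is hypothetical.

```python
def is_model_enabled(entries: list[dict], scope_type: str,
                     scope_id: str, model_slug: str) -> bool:
    """Apply an explicit access entry if one exists, otherwise allow."""
    wanted = (scope_type, scope_id, model_slug)
    for entry in entries:
        key = (entry["scope_type"], entry["scope_id"], entry["model_slug"])
        if key == wanted:
            return entry["enabled"]
    # No explicit entry: models are implicitly allowed.
    return True


entries = [{
    "scope_type": "tenant",
    "scope_id": "tenant_acme",
    "model_slug": "openai/gpt-4o",
    "enabled": False,
}]
```

With these entries, `openai/gpt-4o` is disabled for `tenant_acme` only; every other tenant and every other model remains allowed.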

Model groups

To grant or deny access to several models at once, group them:

bash
curl -X POST https://scaigrid.scailabs.ai/v1/model-groups \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GPT family",
    "members": ["openai/gpt-4o", "openai/gpt-4o-mini", "openai/gpt-4.1"]
  }'

Then grant or deny the whole group at once via /v1/model-access with model_group_id.

Health and circuit breaking

Each backend has a health status: healthy, degraded, unhealthy, unavailable. ScaiGrid tracks failures per-backend and opens a circuit breaker after repeated errors. An unhealthy backend is skipped until its circuit closes (on successful probe requests).

Health checks: GET /v1/backends/{backend_id}/health.
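The circuit-breaker behaviour can be sketched as a small state machine. The details are assumptions made for illustration: the threshold of 3, counting consecutive rather than windowed failures, and the class and method names are all hypothetical, since ScaiGrid's actual thresholds and probe schedule are not given here.

```python
class CircuitBreaker:
    """Per-backend breaker: opens after repeated failures, closes on success."""

    def __init__(self, failure_threshold: int = 3):  # threshold is an assumption
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.is_open = False  # open circuit = backend is skipped

    def record_failure(self) -> None:
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.is_open = True

    def record_success(self) -> None:
        # A successful request or probe resets the count and closes the circuit.
        self.consecutive_failures = 0
        self.is_open = False


breaker = CircuitBreaker(failure_threshold=3)
breaker.record_failure()
breaker.record_failure()
open_after_two = breaker.is_open
breaker.record_failure()
open_after_three = breaker.is_open
breaker.record_success()  # e.g. a probe request succeeded
open_after_probe = breaker.is_open
```

The router would skip a backend while `is_open` is true and keep sending occasional probes until one succeeds.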

Provider discovery

Rather than typing out every model from a provider, ask ScaiGrid to discover them:

bash
curl -X POST https://scaigrid.scailabs.ai/v1/providers/{provider_id}/discover \
  -H "Authorization: Bearer $ADMIN_TOKEN"

The response lists the upstream models available from that provider. Configure the ones you want as backends:

bash
curl -X POST https://scaigrid.scailabs.ai/v1/providers/{provider_id}/models/gpt-4o/configure \
  -H "Authorization: Bearer $ADMIN_TOKEN"

Updated 2026-05-18 15:01:28 (rev 17)