Models and Routing Reference

Reference for the endpoints that manage frontend models, backend models, providers, routing policies, model access, and model groups. For concepts, see Models and Routing.

Frontend models#

GET /v1/models#

List frontend models visible to the caller's tenant.

Query params: limit, cursor, modality (chat | embedding | image | audio), status (active | deprecated).

The response includes a health rollup for each model:

json
{
  "data": {
    "items": [
      {
        "slug": "scailabs/poolnoodle-omni",
        "display_name": "Poolnoodle Omni",
        "modality": "chat",
        "context_window": 256000,
        "max_output_tokens": 32768,
        "status": "active",
        "health_status": "healthy",
        "active_backend_count": 2,
        "total_backend_count": 2
      }
    ]
  }
}
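
As a rough illustration of the query parameters and the rollup fields above, the following Python sketch builds the request path and reads the backend counts from one list item. The helper names models_list_path and backends_down are hypothetical, and treating the difference between total_backend_count and active_backend_count as the number of inactive backends is an assumption, not documented behaviour.

```python
from urllib.parse import urlencode

# Allowed values as listed for the modality and status query params.
MODALITIES = {"chat", "embedding", "image", "audio"}
STATUSES = {"active", "deprecated"}


def models_list_path(limit=None, cursor=None, modality=None, status=None):
    """Build the path and query string for GET /v1/models (hypothetical helper)."""
    if modality is not None and modality not in MODALITIES:
        raise ValueError(f"unknown modality: {modality!r}")
    if status is not None and status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    params = {"limit": limit, "cursor": cursor, "modality": modality, "status": status}
    # Drop unset parameters so the server applies its own defaults.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return "/v1/models" + ("?" + query if query else "")


def backends_down(item):
    """Assumed meaning: mapped backends that are not currently active."""
    return item["total_backend_count"] - item["active_backend_count"]
```

For example, models_list_path(limit=20, modality="chat") yields /v1/models?limit=20&modality=chat.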

GET /v1/models/{slug}#

Get a frontend model by slug. A slug that contains / must be percent-encoded as %2F in the URL path, for example /v1/models/tenant%2Facme%2Fsummarizer.
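
One way to produce the encoded path in Python is urllib.parse.quote with an empty safe set, so that / is escaped rather than kept as a path separator. The model_path helper name is hypothetical.

```python
from urllib.parse import quote


def model_path(slug: str) -> str:
    """Return the GET /v1/models/{slug} path with the slug percent-encoded."""
    # safe="" makes quote() encode "/" as %2F instead of leaving it intact.
    return "/v1/models/" + quote(slug, safe="")


print(model_path("tenant/acme/summarizer"))  # /v1/models/tenant%2Facme%2Fsummarizer
```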

POST /v1/models#

Create a frontend model. Requires models:manage.

json
{
  "slug": "tenant/acme/summarizer",
  "display_name": "Acme Summarizer",
  "description": "Specialized summarization model for Acme content",
  "modality": "chat",
  "context_window": 128000,
  "max_output_tokens": 4096,
  "input_price_per_mtok": "2.50",
  "output_price_per_mtok": "10.00",
  "default_params": {"temperature": 0.3},
  "status": "active"
}
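
The prices in the body above are strings, which suggests decimal rather than floating-point handling. A minimal sketch of a price estimate follows, assuming that mtok means one million tokens and that the price applies linearly to token counts; neither is confirmed here, and estimate_price is a hypothetical helper.

```python
from decimal import Decimal

# Assumption: "mtok" is one million tokens.
MTOK = Decimal(1_000_000)


def estimate_price(model: dict, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the price of one request from the per-mtok price strings."""
    input_price = Decimal(model["input_price_per_mtok"])
    output_price = Decimal(model["output_price_per_mtok"])
    # Decimal arithmetic avoids binary floating-point rounding on currency values.
    return (input_price * input_tokens + output_price * output_tokens) / MTOK
```

With the example prices of 2.50 and 10.00, a request with 1000 input tokens and 500 output tokens comes to 0.0075 under these assumptions.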

PUT /v1/models/{model_id}#

Update a frontend model.

DELETE /v1/models/{model_id}#

Delete a frontend model. Fails with 409 if the model still has active backend mappings; remove those first.
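
The remove-then-retry sequence could be sketched as follows, using the routing mapping endpoints from the Routing section. This is only a sketch under several assumptions: send is a caller-supplied transport function (not part of any library), the model_id is also accepted as the frontend_id of the mapping endpoints, and the mapping list is returned under data.items with a backend_id field.

```python
def delete_model_with_mappings(send, model_id):
    """Delete a frontend model, clearing its backend mappings on a 409.

    send(method, path) -> (status_code, parsed_body) is a hypothetical
    transport supplied by the caller.
    """
    status, _ = send("DELETE", f"/v1/models/{model_id}")
    if status != 409:
        return status
    # 409: backend mappings remain. Remove each one, then retry once.
    _, body = send("GET", f"/v1/routing/mappings/{model_id}")
    for mapping in body["data"]["items"]:  # assumed response shape
        send("DELETE", f"/v1/routing/mappings/{model_id}/{mapping['backend_id']}")
    status, _ = send("DELETE", f"/v1/models/{model_id}")
    return status
```

Injecting the transport keeps the sketch independent of any particular HTTP client and makes the sequence easy to exercise with a fake.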

POST /v1/models/{model_id}/avatar#

Upload an avatar image as a multipart request, using the file field. The avatar appears in the admin UI and in the OAI /models response.

GET /v1/models/{model_id}/avatar#

Fetch the avatar. This endpoint is public and requires no authentication, because chat UIs reference the image directly.

Backend models#

GET /v1/backends#

List backends. Requires models:manage.

POST /v1/backends#

Create a backend.

json
{
  "display_name": "OpenAI GPT-4o (us-east)",
  "uri": "openai:gpt-4o",
  "provider_type": "openai",
  "connection_config": {"api_key": "sk-..."},
  "cost_input_per_mtok": "2.50",
  "cost_output_per_mtok": "10.00",
  "capabilities": ["chat", "audio_input"]
}

connection_config is encrypted at rest. Provider types: openai, anthropic, azure, mistral, qwen, google, scaiinfer, custom.
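
A client might check provider_type against the list above before sending the request. The sketch below assembles a request body for POST /v1/backends; backend_payload is a hypothetical helper, and treating the four positional fields as required is an assumption.

```python
# Provider types as listed above.
PROVIDER_TYPES = {
    "openai", "anthropic", "azure", "mistral",
    "qwen", "google", "scaiinfer", "custom",
}


def backend_payload(display_name, uri, provider_type, connection_config, **optional):
    """Assemble a POST /v1/backends body, rejecting unknown provider types."""
    if provider_type not in PROVIDER_TYPES:
        raise ValueError(f"unknown provider_type: {provider_type!r}")
    payload = {
        "display_name": display_name,
        "uri": uri,
        "provider_type": provider_type,
        "connection_config": connection_config,
    }
    # Optional fields such as cost_input_per_mtok or capabilities pass through as given.
    payload.update(optional)
    return payload
```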

GET /v1/backends/{backend_id}#

Get backend details.

PUT /v1/backends/{backend_id}#

Update backend config.

DELETE /v1/backends/{backend_id}#

Remove a backend. Its routing mappings are deleted along with it.

GET /v1/backends/{backend_id}/health#

Probe backend health. Returns one of healthy, degraded, unhealthy, or unavailable, together with diagnostic details.

Providers#

GET /v1/providers#

List configured providers.

POST /v1/providers#

Add a provider.

json
{
  "name": "Our OpenAI Account",
  "provider_type": "openai",
  "connection_config": {"api_key": "sk-..."}
}

GET /v1/providers/{provider_id}#

Get provider details.

PUT /v1/providers/{provider_id}#

Update provider config.

DELETE /v1/providers/{provider_id}#

Delete a provider. Fails if backends still reference it.

POST /v1/providers/{provider_id}/discover#

Query the provider for available models. Returns the discovered models, with context_window and max_output_tokens populated from the provider's API or from the internal catalog.

GET /v1/providers/{provider_id}/models#

List discovered models (cached).

POST /v1/providers/{provider_id}/models/{model_id}/configure#

Configure a discovered model as a backend in one call. This is a shortcut for POST /v1/backends with the correct uri and provider_type.

Routing#

GET /v1/routing/policies#

List routing policies.

POST /v1/routing/policies#

Create a routing policy.

json
{
  "name": "GPT fallback chain",
  "type": "weighted",
  "config": {"retry_on_circuit_open": true}
}

GET /v1/routing/policies/{policy_id}#

Get a routing policy.

PUT /v1/routing/policies/{policy_id}#

Update a routing policy.

DELETE /v1/routing/policies/{policy_id}#

Delete a routing policy. Frontend models that use it fall back to their first-mapped backend.

GET /v1/routing/mappings/{frontend_id}#

List backend mappings for a frontend model.

POST /v1/routing/mappings#

Create a mapping.

json
{
  "frontend_id": "fm_poolnoodle_omni",
  "backend_id": "be_openai_gpt4o",
  "weight": 100,
  "priority": 1,
  "status": "active"
}
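
This reference does not define how weight and priority interact. As an illustration only, the sketch below shows one plausible reading: among active mappings, the lowest priority number wins, and traffic within that tier is split in proportion to weight. The pick_backend function and this selection rule are assumptions, not the service's documented algorithm.

```python
import random


def pick_backend(mappings, rng=random):
    """Pick a backend_id from a list of mapping dicts (assumed semantics)."""
    # Only active mappings with a positive weight are eligible.
    active = [m for m in mappings if m["status"] == "active" and m["weight"] > 0]
    if not active:
        return None
    # Assumption: a lower priority number is preferred.
    top = min(m["priority"] for m in active)
    tier = [m for m in active if m["priority"] == top]
    # Weighted random choice within the preferred tier.
    chosen = rng.choices(tier, weights=[m["weight"] for m in tier], k=1)[0]
    return chosen["backend_id"]
```

Passing a seeded random.Random instance as rng makes the choice reproducible in tests.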

PUT /v1/routing/mappings/{frontend_id}/{backend_id}#

Update a mapping's weight, priority, or status.

DELETE /v1/routing/mappings/{frontend_id}/{backend_id}#

Remove a mapping.

Model access#

GET /v1/model-access#

List access policies for the tenant.

POST /v1/model-access#

Create a policy.

json
{
  "scope_type": "tenant",
  "scope_id": "tenant_acme",
  "model_slug": "openai/gpt-4o",
  "enabled": false
}

Or, for a group-based policy:

json
{
  "scope_type": "tenant",
  "scope_id": "tenant_acme",
  "model_group_id": "grp_premium_models",
  "enabled": true,
  "rate_limit": 100
}
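
The two bodies above differ only in whether they target a single model or a model group. A small builder can make that choice explicit. The sketch assumes that exactly one of model_slug and model_group_id should be set, which the examples suggest but this reference does not state; access_policy is a hypothetical helper.

```python
def access_policy(scope_type, scope_id, enabled,
                  model_slug=None, model_group_id=None, rate_limit=None):
    """Assemble a POST /v1/model-access body for a model or a model group."""
    # Assumption: a policy targets either one model or one group, never both.
    if (model_slug is None) == (model_group_id is None):
        raise ValueError("set exactly one of model_slug or model_group_id")
    body = {"scope_type": scope_type, "scope_id": scope_id, "enabled": enabled}
    if model_slug is not None:
        body["model_slug"] = model_slug
    else:
        body["model_group_id"] = model_group_id
    if rate_limit is not None:
        body["rate_limit"] = rate_limit
    return body
```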

GET /v1/model-access/{access_id}#