
Errors

ScaiGrid error responses are structured and stable. Your code should branch on error.code, never on the HTTP status alone and never by string-matching the message.

Error envelope

Every error response has the same shape:

json
{
  "status": "error",
  "error": {
    "code": "BACKEND_RATE_LIMITED",
    "message": "Backend rate limit exceeded — please retry later",
    "retry_after": 30
  },
  "meta": {
    "request_id": "req_abc123"
  }
}
  • status — always "error" for error responses.
  • error.code — machine-readable code from a stable vocabulary. Branch on this.
  • error.message — human-readable description. Display or log as-is; don't parse.
  • error.retry_after — optional, present on rate-limit and some timeout errors. Seconds to wait.
  • error.details — optional, present on validation errors. Array of {field, message} for each problem.
  • meta.request_id — unique request ID. Include in any support ticket.

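The HTTP status code in the response line matches the error class (400-level for client errors, 500-level for server/gateway errors), but error.code is more specific and should drive your handling. For example, RATE_LIMITED, QUOTA_EXCEEDED, and BUDGET_EXCEEDED all return 429, yet only RATE_LIMITED is worth retrying.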
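As a minimal sketch of reading this envelope in Python, the snippet below parses a response body into a small dataclass. The names `ErrorInfo` and `parse_error` and the `"UNKNOWN"` placeholder are illustrative assumptions, not part of any ScaiGrid SDK; only the field names come from the envelope described above.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ErrorInfo:
    """Hypothetical container for the fields of the error envelope."""
    code: str
    message: str
    retry_after: Optional[float] = None
    details: list = field(default_factory=list)
    request_id: Optional[str] = None


def parse_error(body: dict) -> Optional[ErrorInfo]:
    """Return an ErrorInfo if body is an error envelope, else None."""
    if body.get("status") != "error":
        return None
    err = body.get("error", {})
    return ErrorInfo(
        code=err.get("code", "UNKNOWN"),  # placeholder chosen for this sketch
        message=err.get("message", ""),
        retry_after=err.get("retry_after"),      # optional field
        details=err.get("details") or [],        # optional field
        request_id=body.get("meta", {}).get("request_id"),
    )


sample = json.loads("""
{
  "status": "error",
  "error": {
    "code": "BACKEND_RATE_LIMITED",
    "message": "Backend rate limit exceeded",
    "retry_after": 30
  },
  "meta": {"request_id": "req_abc123"}
}
""")
info = parse_error(sample)
print(info.code, info.retry_after, info.request_id)
```

Treating the optional fields as absent by default keeps callers from special-casing responses that omit retry_after or details.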

Streaming error frames

For streaming endpoints (SSE), mid-stream errors arrive as a distinct event:

tsql
event: error
data: {"code": "BACKEND_ERROR", "message": "..."}

data: [DONE]

Listen for the error event type in your SSE client, not just data events. The stream always ends with data: [DONE] whether it succeeded or failed.
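The frame handling described above can be sketched with a small, library-free parser. This is an illustration under stated assumptions: `iter_sse_events` and `consume` are hypothetical helpers, the `{"delta": ...}` payload is invented for the demo, and field parsing is simplified compared with the full SSE specification. In practice you would feed it the decoded lines from your HTTP client's streaming response.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_type, data) pairs from decoded SSE lines.

    Events without an explicit "event:" field default to "message".
    """
    event_type, data_parts = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates one event
            if data_parts:
                yield event_type, "\n".join(data_parts)
            event_type, data_parts = "message", []
        elif line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
    if data_parts:  # flush a final event that lacks a trailing blank line
        yield event_type, "\n".join(data_parts)


def consume(lines: Iterable[str]):
    """Collect data payloads; return (chunks, error), where error may be None."""
    chunks, error = [], None
    for event_type, data in iter_sse_events(lines):
        if data == "[DONE]":  # terminator, sent on success and on failure
            break
        if event_type == "error":
            error = json.loads(data)
        else:
            chunks.append(data)
    return chunks, error


# Demo stream: one made-up data chunk, then a mid-stream error, then [DONE].
stream = [
    'data: {"delta": "Hel"}', "",
    "event: error",
    'data: {"code": "BACKEND_ERROR", "message": "upstream failed"}', "",
    "data: [DONE]", "",
]
chunks, error = consume(stream)
print(chunks, error["code"])
```

Because [DONE] arrives in both cases, the caller decides success or failure by checking whether `error` is set, not by whether the stream terminated.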

Retry classification

| Code | Retry? | Notes |
| --- | --- | --- |
| AUTH_TOKEN_INVALID | No | Fix credentials |
| AUTH_TOKEN_MISSING | No | Add Authorization header |
| AUTHZ_PERMISSION_DENIED | No | User lacks permission |
| VALIDATION_ERROR | No | Malformed request body |
| MODEL_NOT_FOUND | No | Model slug doesn't exist |
| MODEL_ACCESS_DENIED | No | Model not enabled for tenant |
| MODEL_UNAVAILABLE | Sometimes | All backends are unhealthy; retry after a delay |
| BACKEND_RATE_LIMITED | Yes | Honor retry_after |
| BACKEND_TIMEOUT | Yes | Back off exponentially |
| BACKEND_ERROR | Once | Upstream transient failure |
| UPSTREAM_SHAPE_MISMATCH | No | Gateway integration bug; file a support ticket |
| BUDGET_EXCEEDED | No | Wait for budget period to roll over |
| RATE_LIMITED | Yes | Per-key/user/tenant limit; honor retry_after |
| QUOTA_EXCEEDED | No | Hard quota reached |
| SERVICE_DRAINING | Yes | Gateway is rolling; retry shortly |
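One way to encode the table above is a function that returns how long to wait before the next retry, or None when the code should not be retried. This is a sketch, not ScaiGrid's prescribed policy: the function name `next_delay`, the 1-second base, and the 60-second cap are assumptions, and "Sometimes" for MODEL_UNAVAILABLE is interpreted here as "retry with backoff".

```python
from typing import Optional

# Policy transcribed from the retry classification table.
ALWAYS_RETRY = {
    "BACKEND_RATE_LIMITED", "BACKEND_TIMEOUT",
    "RATE_LIMITED", "SERVICE_DRAINING",
}
RETRY_ONCE = {"BACKEND_ERROR"}
RETRY_AFTER_DELAY = {"MODEL_UNAVAILABLE"}  # "Sometimes" in the table


def next_delay(code: str, attempt: int, retry_after: Optional[float] = None,
               base: float = 1.0, cap: float = 60.0) -> Optional[float]:
    """Seconds to wait before retry number `attempt` (1-based), or None to give up."""
    if code in RETRY_ONCE and attempt > 1:
        return None  # the single allowed retry has already been used
    if code not in ALWAYS_RETRY | RETRY_ONCE | RETRY_AFTER_DELAY:
        return None  # not retryable: fix the request or surface the error
    if retry_after is not None:
        return float(retry_after)  # the server's hint takes precedence
    return min(cap, base * 2 ** (attempt - 1))  # capped exponential backoff


print(next_delay("RATE_LIMITED", 1, retry_after=30))
print(next_delay("BACKEND_TIMEOUT", 3))
print(next_delay("BACKEND_ERROR", 2))
print(next_delay("VALIDATION_ERROR", 1))
```

Keeping the policy in data (the three sets) rather than in branching logic makes it easy to update when new codes are added to the vocabulary.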

Canonical error codes

Authentication / authorization

  • AUTH_TOKEN_MISSING (401) — no Authorization header
  • AUTH_TOKEN_INVALID (401) — token couldn't be validated
  • AUTH_INSUFFICIENT_SCOPE (403) — token lacks required scope
  • AUTHZ_PERMISSION_DENIED (403) — user lacks permission
  • SESSION_EXPIRED (401) — refresh your token

Validation

  • VALIDATION_ERROR (422) — request body didn't match schema. Check error.details.
  • SLUG_CONFLICT (409) — another resource already uses this slug
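For VALIDATION_ERROR, the error.details array is usually easier to present once grouped by field. The helper below is a small sketch; `group_details` is a hypothetical name, and the field names and messages in the sample are invented for illustration. Only the {field, message} shape comes from the envelope description.

```python
from collections import defaultdict


def group_details(error: dict) -> dict:
    """Map each field name to the list of messages reported for it."""
    grouped = defaultdict(list)
    for item in error.get("details") or []:  # details is optional
        grouped[item.get("field", "")].append(item.get("message", ""))
    return dict(grouped)


# Invented sample payload; real field names and messages will differ.
error = {
    "code": "VALIDATION_ERROR",
    "message": "Request body failed validation",
    "details": [
        {"field": "model", "message": "field is required"},
        {"field": "temperature", "message": "must be a number"},
        {"field": "temperature", "message": "must be <= 2"},
    ],
}
problems = group_details(error)
for field_name, messages in problems.items():
    print(f"{field_name}: {'; '.join(messages)}")
```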

Models and backends

  • MODEL_NOT_FOUND (404) — no frontend model with that slug
  • MODEL_ACCESS_DENIED (403) — model exists but not enabled for this tenant
  • MODEL_UNAVAILABLE (503) — all backends for this model are unhealthy or circuit-broken
  • BACKEND_NOT_FOUND (404) — backend doesn't exist
  • BACKEND_ERROR (502) — upstream provider returned an error
  • BACKEND_TIMEOUT (504) — upstream didn't respond in time
  • BACKEND_RATE_LIMITED (429) — upstream rate-limited us; includes retry_after
  • UPSTREAM_SHAPE_MISMATCH (502) — upstream sent a response our parsers didn't accept

Rate limits and quotas

  • RATE_LIMITED (429) — ScaiGrid's own rate limiter triggered
  • QUOTA_EXCEEDED (429) — request quota exceeded
  • BUDGET_EXCEEDED (429) — usage budget exceeded

Tenant / partner

  • PARTNER_NOT_FOUND (404)
  • TENANT_NOT_FOUND (404)
  • TENANT_SUSPENDED (403) — tenant is administratively suspended
  • PARTNER_SUSPENDED (403)

Modules

  • MODULE_NOT_FOUND (404)
  • MODULE_NOT_ENABLED (403) — module isn't enabled for this tenant
  • MODULE_DEPENDENCY_UNAVAILABLE (424) — required upstream module isn't available

Resources

  • USER_NOT_FOUND (404)
  • API_KEY_NOT_FOUND (404)
  • GROUP_NOT_FOUND (404)
  • ROOM_NOT_FOUND (404)
  • SESSION_NOT_FOUND (404)
  • ROUTING_POLICY_NOT_FOUND (404)
  • BUDGET_NOT_FOUND (404)
  • WEBHOOK_NOT_FOUND (404)
  • BATCH_NOT_FOUND (404)
  • CHECKPOINT_NOT_FOUND (404)

Service

  • SERVICE_DRAINING (503) — gateway is gracefully shutting down
  • SERVICE_UNAVAILABLE (503) — a dependency (Redis, MariaDB) is down

Module-specific codes are listed in each module's reference page and follow the convention {MODULE}_{NAME} (e.g., SCAIQUEUE_MESSAGE_NOT_FOUND).
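If you route handling per module, the {MODULE}_{NAME} convention can be split with a prefix match against the modules you actually use. This is a sketch: `split_module_code` is a hypothetical helper, and the module set must be supplied by you, because core codes such as BACKEND_TIMEOUT also contain underscores and cannot be told apart by shape alone.

```python
from typing import Iterable, Optional, Tuple


def split_module_code(code: str, modules: Iterable[str]) -> Tuple[Optional[str], str]:
    """Split a code into (module, name); module is None for core codes."""
    # Try longer module names first so one prefix cannot shadow another.
    for module in sorted(modules, key=len, reverse=True):
        prefix = module + "_"
        if code.startswith(prefix):
            return module, code[len(prefix):]
    return None, code


KNOWN_MODULES = {"SCAIQUEUE"}  # assumption: list the modules enabled for your tenant
print(split_module_code("SCAIQUEUE_MESSAGE_NOT_FOUND", KNOWN_MODULES))
print(split_module_code("BACKEND_TIMEOUT", KNOWN_MODULES))
```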

Full reference

See the Error Codes Reference for the complete list.

Sample error handler

A minimal Python handler that classifies errors by code and retries only the retryable ones:

python
import time

import httpx

# Codes worth retrying, per the retry classification table.
RETRYABLE = {
    "BACKEND_RATE_LIMITED", "BACKEND_TIMEOUT", "BACKEND_ERROR",
    "MODEL_UNAVAILABLE", "RATE_LIMITED", "SERVICE_DRAINING",
}
# Codes the table marks as "retry once".
RETRY_ONCE = {"BACKEND_ERROR"}

class ScaiGridError(Exception):
    def __init__(self, code, message, retry_after=None, request_id=None):
        self.code = code
        self.message = message
        self.retry_after = retry_after
        self.request_id = request_id
        super().__init__(f"{code}: {message} (request_id={request_id})")

def call_with_retry(method, url, *, max_attempts=3, **kwargs):
    for attempt in range(max_attempts):
        resp = httpx.request(method, url, **kwargs)
        body = resp.json()
        if body.get("status") != "error":
            return body["data"]
        err = body["error"]
        # Prefer the request ID from the envelope; fall back to the header.
        rid = (body.get("meta", {}).get("request_id")
               or resp.headers.get("X-Scaigrid-Request-Id"))
        last_attempt = attempt == max_attempts - 1
        # A retry-once code that fails again after its retry is final.
        retry_once_spent = err["code"] in RETRY_ONCE and attempt >= 1
        if err["code"] not in RETRYABLE or last_attempt or retry_once_spent:
            raise ScaiGridError(err["code"], err["message"],
                                err.get("retry_after"), rid)
        # Honor the server's hint if present, else back off exponentially.
        time.sleep(err.get("retry_after") or (2 ** attempt))
    raise RuntimeError("unreachable")


Updated 2026-05-18 15:01:29, rev 17