Troubleshooting

A short list of things that go wrong and how to fix them. If none of these match, check the request id in the response envelope and grep the ScaiGrid logs.

/speak returns 502 SCAISPEAK_BACKEND_UNAVAILABLE

The selected backend isn't reachable.

  • Backend A: no ScaiInfer node has the configured TTS engine loaded. Check /v1/infra/nodes for online nodes carrying the engine.
  • Backend B: the managed TTS relay is unreachable or the deployment's relay credentials are missing. Check the readiness endpoint at /v1/modules/scaispeak/readyz.
  • Tenant locked to a backend that's down: check GET /admin/policy. If allowed_backends is ["A"] and no A node is online, every call returns 502. Add B to the allowed set or wait for A to come back.
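
The tenant-lock case can be checked mechanically. The following is a minimal sketch, assuming the policy response is JSON carrying an `allowed_backends` list (as in the example above) and that you have already worked out which backends are reachable from the node list and the readiness endpoint. The function name and the `reachable` argument are hypothetical, not part of the ScaiSpeak API.

```python
from typing import Iterable, List


def usable_backends(policy: dict, reachable: Iterable[str]) -> List[str]:
    """Return the backends that are both allowed by tenant policy and reachable.

    `policy` is assumed to look like {"allowed_backends": ["A", "B"]}.
    An empty result means every /speak call for this tenant will 502.
    """
    allowed = policy.get("allowed_backends", [])
    reachable_set = set(reachable)
    # Keep the policy's ordering so the result reads like the policy itself.
    return [backend for backend in allowed if backend in reachable_set]


if __name__ == "__main__":
    # Tenant locked to A while only B is up: nothing usable.
    print(usable_backends({"allowed_backends": ["A"]}, reachable=["B"]))
```

An empty list points at the policy rather than the infrastructure: widening `allowed_backends` is the faster fix if the other backend is healthy.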

Preflight rejects the reference clip

A SCAISPEAK_VOICE_PREFLIGHT_FAILED error includes the preflight block; read it. Common causes:

  • Too short (fail_reasons: ["duration_below_minimum"]). 5 seconds is the floor — record more speech.
  • Too long (duration_above_maximum). Trim the clip to 25 seconds or less.
  • Mostly silence (low_voice_activity_ratio). Re-record with continuous speech rather than long pauses or breathing.
  • Clipping (peak_dbfs_above_threshold). Lower the recording gain.
  • Noisy (snr_below_threshold). Too much background noise. Re-record in a quieter room or with closer mic placement.
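
If a client surfaces these failures to end users, the codes can be translated into hints. This is a sketch only: it assumes the preflight block is a JSON object with a `fail_reasons` list of string codes, as in the first bullet above. The helper name and hint wording are illustrative.

```python
from typing import List

# Remediation hints keyed by the fail_reasons codes listed above.
PREFLIGHT_HINTS = {
    "duration_below_minimum": "Clip is under 5 seconds; record more speech.",
    "duration_above_maximum": "Clip is too long; trim it to 25 seconds.",
    "low_voice_activity_ratio": "Mostly silence; re-record with continuous speech.",
    "peak_dbfs_above_threshold": "Clipping; lower the recording gain.",
    "snr_below_threshold": "Too noisy; use a quieter room or move the mic closer.",
}


def explain_preflight(preflight: dict) -> List[str]:
    """Turn a preflight block into human-readable hints.

    Unknown codes are passed through rather than dropped, so a new
    server-side reason still reaches the user.
    """
    return [
        PREFLIGHT_HINTS.get(code, f"Unrecognised fail reason: {code}")
        for code in preflight.get("fail_reasons", [])
    ]
```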

Voice stuck at embedding_status: processing

This is a legacy state from before the zero-shot cloning engine. New voices created via the current intake pipeline land at ready directly. If you have rows stuck in processing:

  • Voice older than the current engine. Run POST /voices/{id}/repromote to re-run intake. The row should flip to ready (or failed with a clearer reason).
  • Stale row from a failed intake. Delete the voice and re-create it.

Voice ends at embedding_status: failed

The intake pipeline rejected the voice. Common embedding_status_reason values:

  • reference_too_short — reference clip is under the 5 s minimum.
  • reference_unavailable — the reference WAV isn't readable from object storage. Re-upload the voice.
  • tokeniser_unavailable — legacy state from the previous engine; should not occur on new voices. repromote fixes it.
  • ecapa_cosine_below_threshold:<x> — legacy quality gate failure from the previous engine. New voices skip this gate.
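
One of these reasons carries a value after a colon, so a client that branches on the reason needs to split it first. The sketch below assumes `embedding_status_reason` arrives as a plain string in the forms listed above. The function names are hypothetical, and the `0.41` in the docstring is an invented sample value.

```python
from typing import Optional, Tuple


def parse_status_reason(reason: str) -> Tuple[str, Optional[str]]:
    """Split a reason such as 'ecapa_cosine_below_threshold:0.41' into (code, detail).

    Reasons without a colon return None as the detail.
    """
    code, sep, detail = reason.partition(":")
    return code, (detail if sep else None)


# Next steps taken from the list above, keyed by the bare reason code.
NEXT_STEPS = {
    "reference_too_short": "Record a reference clip of at least 5 s.",
    "reference_unavailable": "Re-upload the voice.",
    "tokeniser_unavailable": "Legacy state; run repromote.",
    "ecapa_cosine_below_threshold": "Legacy quality gate; new voices skip it.",
}


def next_step(reason: str) -> str:
    """Look up the documented next step for a failed voice."""
    code, _detail = parse_status_reason(reason)
    return NEXT_STEPS.get(code, f"Unrecognised reason: {code}")
```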

save_to returns SCAISPEAK_SAVE_TO_REQUIRES_JWT

You called /speak with an sgk_ API key plus a save_to block. API keys can't perform ScaiKey token exchange. Re-authenticate with a JWT (OAuth flow against ScaiKey) and retry, or drop save_to and let the audio come back inline.
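
A client that supports both credential types can avoid this error by dropping save_to before sending. The following sketch assumes API keys are recognisable by their sgk_ prefix, as described above, and that anything else is a JWT. The helper names and the request fields other than save_to are hypothetical.

```python
def can_use_save_to(credential: str) -> bool:
    """sgk_ API keys cannot perform token exchange, so save_to needs a JWT."""
    return not credential.startswith("sgk_")


def prepare_speak_body(body: dict, credential: str) -> dict:
    """Return a copy of the /speak body that is safe for this credential.

    With an sgk_ key, save_to is removed so the audio comes back inline
    instead of failing with SCAISPEAK_SAVE_TO_REQUIRES_JWT.
    The caller's dict is never mutated.
    """
    if can_use_save_to(credential):
        return dict(body)
    return {key: value for key, value in body.items() if key != "save_to"}


if __name__ == "__main__":
    # "text" and "path" are placeholder field names for illustration.
    request = {"text": "hello", "save_to": {"path": "greeting.wav"}}
    print(prepare_speak_body(request, "sgk_example"))
```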

save_to returns SCAISPEAK_SAVE_TO_EXCHANGE_FAILED

ScaiKey rejected the token exchange against the ScaiDrive audience. Causes:

  • ScaiDrive application isn't in token_exchange_allowed. Operator fix on the ScaiKey side.
  • JWT scope doesn't grant both files:write and files:read. Re-authenticate with the right scopes.
  • The caller's JWT expired between request and exchange. Retry with a fresh token.
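
The last two causes can be ruled out on the client before retrying. This is a minimal sketch that decodes the JWT payload without verifying the signature. It assumes ScaiKey places granted scopes in a space-separated `scope` claim and uses the standard `exp` claim; neither is confirmed here, so check a real token first. The function names and the 30-second leeway are illustrative.

```python
import base64
import json
import time
from typing import List, Optional

REQUIRED_SCOPES = {"files:write", "files:read"}


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT. Does NOT verify the signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are unpadded base64url; restore padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def exchange_problems(token: str, now: Optional[float] = None,
                      leeway: float = 30.0) -> List[str]:
    """List client-visible reasons the ScaiDrive token exchange might fail."""
    claims = decode_jwt_payload(token)
    now = time.time() if now is None else now
    problems = []

    # Assumption: scopes live in a space-separated "scope" claim.
    granted = set(str(claims.get("scope", "")).split())
    missing = REQUIRED_SCOPES - granted
    if missing:
        problems.append("missing scopes: " + " ".join(sorted(missing)))

    # Treat a token that expires within `leeway` seconds as already stale,
    # since it may lapse between the request and the exchange.
    exp = claims.get("exp")
    if exp is not None and exp - now < leeway:
        problems.append("token expired or about to expire")
    return problems
```

An empty result does not guarantee success: the token_exchange_allowed check happens on the ScaiKey side and is not visible in the token.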

save_to returns SCAISPEAK_SCAIDRIVE_CONFLICT

A file with the same name exists in the destination and overwrite is false. Rename the existing file, pass overwrite: true, or save under a different filename.
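
When overwriting is not acceptable, a client can retry under a derived filename. The helper below is a sketch of one naming scheme (a numeric suffix before the extension). The scheme and the function name are illustrative choices, not something ScaiSpeak or ScaiDrive prescribes.

```python
from pathlib import PurePosixPath


def numbered_name(path: str, n: int) -> str:
    """Insert '-<n>' before the file extension, keeping any directory part."""
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}-{n}{p.suffix}"))


if __name__ == "__main__":
    # Candidate names to try after a SCAISPEAK_SCAIDRIVE_CONFLICT response.
    for attempt in range(1, 4):
        print(numbered_name("recordings/greeting.wav", attempt))
```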

WebSocket closes with 4502 immediately after open

Same root cause as the HTTP 502 — backend resolution failed before the first audio frame. Check tenant policy and node availability.

WebSocket frames arrive but audio is garbled

You're decoding the wrong codec. The binary frames carry the output.codec you sent in open, or opus if you omitted it. Decode with that codec, not whatever your client guessed from the URL.
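
To keep the decoder and the request in sync, derive the codec from the same open message the client sent. This sketch assumes the open message is a JSON object with the codec nested at `output.codec`, matching the field path named above. The function name and the sample codec value are hypothetical.

```python
DEFAULT_CODEC = "opus"  # default stated in the text above


def effective_codec(open_message: dict) -> str:
    """Return the codec the server's binary frames will carry.

    Falls back to the default when `output` or `output.codec` is
    missing or empty in the open message.
    """
    output = open_message.get("output") or {}
    return output.get("codec") or DEFAULT_CODEC


if __name__ == "__main__":
    print(effective_codec({}))  # no output block, so the default applies
    # "pcm_s16le" is a placeholder codec name for illustration only.
    print(effective_codec({"output": {"codec": "pcm_s16le"}}))
```

Selecting the decoder from this one function removes the chance that the request path and the playback path disagree.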

POST /voices/{id}/warm returns 409 SCAISPEAK_VOICE_NOT_READY_FOR_WARMING

Warming is a legacy operation from the previous-generation cloning architecture. On the current zero-shot engine, the endpoint is a no-op: the reference clip is shipped with each synth request directly. If you're getting 409 on a voice that's ready, you're hitting a stale handler — file a bug. For new code, you can drop the /warm call entirely; voices become usable as soon as intake clears.

WebRTC endpoints return 501 SCAISPEAK_WEBRTC_UNAVAILABLE

The deployment doesn't have the aiortc and av Python packages installed, and WebRTC requires both. Operators can install them with pip install aiortc av, then restart the API process.

Synth job stuck in queued

The process_synth_job arq worker isn't running. Same diagnostic path as the tokenisation stuck case — check arq workers, restart the pool, retry the job.

Erasure left some warm replicas behind

DELETE /voices/{id} returns error_summary when individual EvictVoice RPCs fail. The Redis registry is always cleared (the registry represents intent, and intent is now "no") but a node that was offline at deletion time might still have the prefix tokens cached locally. The erasure-reconciler worker reconciles those when the node comes back; you can also call POST /voices/{id}/evict again after the node recovers.

ScaiDrive shares endpoint returns 404 SCAISPEAK_SCAIDRIVE_NOT_AVAILABLE

The deployment's scaispeak_scaidrive_url setting is empty. The synth page falls back to a "ScaiDrive not configured" state — operator fix only.

Listing voices shows nothing

  • Check your token has scaispeak:voice.read.
  • Check there are global voices in the deployment (scope=global filter). A fresh tenant on a deployment with no global voices and no own cloned voices sees an empty list.
  • Check you didn't filter to embedding_status=ready on a tenant whose only voices are processing.
Updated 2026-05-22 14:27:31, rev 13