Troubleshooting

A short list of things that go wrong and how to fix them. If none of these match, check the request id in the response envelope and grep the ScaiGrid logs.

Open your browser's developer console. The widget logs an obvious error if it can't initialise.
Check that the URL in the script's src attribute actually loads (look in the Network tab). When it doesn't, the cause is usually a CSP script-src restriction.
Check data-bot-id is a real bot id (not the slug).
Check data-token is a fresh embed token or data-token-url points at a working endpoint.

The chat POST is being blocked or is returning an error status.

403 — token expired or revoked. Check token TTL; mint a new one.
404 — wrong bot id or the bot's status is archived.
502 / 504 — ScaiGrid is upstream-unhealthy; check /health/ready.
CSP error in console — connect-src must allow https://scaigrid.scailabs.ai.
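
Both CSP failures mentioned so far (script-src for the widget script, connect-src for the chat POST) are governed by the same response header. As a rough sketch, assuming your site sets the header from Python, it could be assembled like this. The API origin is the one named above; `WIDGET_SCRIPT_ORIGIN` is a placeholder for whatever origin your script src points at, and `build_csp` is a hypothetical helper, not part of ScaiGrid.

```python
# Sketch: assemble a Content-Security-Policy value that lets the widget
# script load and lets it call the ScaiGrid API.
SCAIGRID_API_ORIGIN = "https://scaigrid.scailabs.ai"  # from the connect-src note above
WIDGET_SCRIPT_ORIGIN = "https://widget.example.com"   # ASSUMPTION: use the origin in your script src


def build_csp(script_origin: str, api_origin: str) -> str:
    """Return a CSP header value allowing the widget script and its API calls."""
    directives = {
        "default-src": ["'self'"],
        "script-src": ["'self'", script_origin],
        "connect-src": ["'self'", api_origin],
    }
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )


csp_header = build_csp(WIDGET_SCRIPT_ORIGIN, SCAIGRID_API_ORIGIN)
```

Send the resulting string as the Content-Security-Policy header on the page that embeds the widget, merged with whatever directives your site already needs.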

Bot replies with "I don't have that information"

Knowledge retrieval returned nothing relevant. Causes:

Documents haven't indexed yet. Check GET /bots/{id}/documents — all should show status: indexed.
Knowledge disabled. knowledge_enabled is false on the bot.
Wrong collection. In linked mode, knowledge_collection_id points at a collection the bot's tenant doesn't have access to.
Score threshold too high. Drop knowledge_settings.score_threshold (default 0.3 — try 0.2 if your corpus is small).
Query mismatch. Visitor's wording doesn't match the document's vocabulary; teach the bot synonyms via terminology on the tone.
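
For the first cause, a small filter over the response of GET /bots/{id}/documents makes stragglers easy to spot. This is only a sketch: it assumes the endpoint returns a JSON list of objects with a `status` field, and the `id` field and the non-indexed status values in the sample are hypothetical.

```python
from typing import Iterable, List


def unindexed_documents(documents: Iterable[dict]) -> List[dict]:
    """Return documents whose status is anything other than 'indexed'."""
    return [doc for doc in documents if doc.get("status") != "indexed"]


# Example payload; field names other than `status` are hypothetical.
sample = [
    {"id": "doc_1", "status": "indexed"},
    {"id": "doc_2", "status": "processing"},
    {"id": "doc_3", "status": "failed"},
]
pending = unindexed_documents(sample)
```

An empty result means indexing is not the problem, and you can move on to the other causes.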

Bot confabulates

The bot confidently states things that aren't in the docs.

Make sure knowledge_enabled is on. (Sounds silly — easy to forget on a re-created bot.)
Tighten knowledge_settings.score_threshold (raise it: 0.4–0.5).
Add a custom_instructions line: "Answer only from the provided context. If the context doesn't contain the answer, say you don't know."
Add a confidence-trigger escalation as a safety net.

Citations point at the wrong document

The usual cause is two near-duplicate documents in the index, typically multiple versions of the same PDF.

Delete the duplicate(s).
Set max_chunks_per_doc: 1 to force diversity.
Confirm deduplicate is true (it is the default, but double-check).
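
To see what a per-document cap does, here is a toy model of it. It is not ScaiGrid's retrieval code; the chunk shape, the scores, and the `cap_chunks_per_doc` function are all illustrative assumptions.

```python
from collections import defaultdict


def cap_chunks_per_doc(chunks, max_chunks_per_doc):
    """Keep at most `max_chunks_per_doc` chunks per document, best score first."""
    kept, seen = [], defaultdict(int)
    for chunk in sorted(chunks, key=lambda c: c["score"], reverse=True):
        if seen[chunk["doc_id"]] < max_chunks_per_doc:
            kept.append(chunk)
            seen[chunk["doc_id"]] += 1
    return kept


# Hypothetical retrieval result: two versions of the same manual plus an FAQ.
retrieved = [
    {"doc_id": "manual_v2.pdf", "score": 0.82},
    {"doc_id": "manual_v2.pdf", "score": 0.80},
    {"doc_id": "manual_v1.pdf", "score": 0.79},
    {"doc_id": "faq.md", "score": 0.55},
]
diverse = cap_chunks_per_doc(retrieved, max_chunks_per_doc=1)
```

In this model the cap frees a slot for faq.md, but the stale manual_v1.pdf still gets through, which is why deleting the duplicates comes first.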

Escalation rule never fires

In priority order:

Keyword: check match_mode (substring vs regex). Substring is case-sensitive; regex needs to be valid Python regex.
Intent: lower the threshold if the rule never fires on messages it should catch, or raise it if it fires too readily. Try the live preview at POST /bots/{id}/escalations/preview to see how the classifier scores your visitor messages.
Sentiment: the threshold is on a -1 to +1 scale; -0.6 means "noticeably negative." For "annoyed but polite" set -0.3.
Confidence: the bot's self-reported confidence is heuristic. Combine with consecutive_low_turns: 2 to avoid one-off misfires.
Explicit: language-specific. If your visitors write in a language other than the bot's language, set strict_language: false so detection still works.
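
The keyword pitfalls are easy to reproduce locally. The sketch below models the two match_mode values with plain Python. Whether ScaiGrid searches anywhere in the message (as `re.search` does) or anchors at the start is an assumption here, and both function names are hypothetical.

```python
import re


def keyword_matches(message: str, keyword: str, match_mode: str) -> bool:
    """Model the two match modes: case-sensitive substring, or Python regex search."""
    if match_mode == "substring":
        return keyword in message
    if match_mode == "regex":
        return re.search(keyword, message) is not None
    raise ValueError(f"unknown match_mode: {match_mode}")


def is_valid_regex(pattern: str) -> bool:
    """Check that a pattern compiles before saving it in a rule."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


message = "I want a Refund now"
substring_hit = keyword_matches(message, "refund", "substring")   # capital R does not match
regex_hit = keyword_matches(message, r"(?i)\brefund\b", "regex")  # (?i) ignores case
broken_pattern_ok = is_valid_regex(r"refund(")                    # unbalanced parenthesis
```

Here the substring rule misses "Refund" because of the capital letter, the case-insensitive regex catches it, and the malformed pattern is rejected before it can silently never match.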

Escalation rule fires too often

Usually the keyword list is too broad ("help" is a terrible escalation trigger) or the sentiment threshold is too lax.

Reduce keyword list to terms unambiguous to your domain ("refund" yes, "money" no).
Lower the sentiment threshold (further from zero, toward -1, is stricter, so fewer messages trigger).
Move the rule down the priority list so cheaper / more specific rules match first.
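
The direction of the sentiment threshold is easy to get backwards. The sketch below assumes a sentiment rule fires when a message scores at or below the threshold, which matches -0.6 reading as "noticeably negative" on the -1 to +1 scale. The comparison and the sample scores are assumptions, not documented ScaiGrid behaviour.

```python
def sentiment_rule_fires(score: float, threshold: float) -> bool:
    """ASSUMPTION: the rule fires when sentiment is at or below the threshold."""
    return score <= threshold


# Hypothetical sentiment scores for five visitor messages.
scores = [-0.9, -0.65, -0.4, -0.2, 0.3]

fired_at_minus_03 = sum(sentiment_rule_fires(s, -0.3) for s in scores)
fired_at_minus_06 = sum(sentiment_rule_fires(s, -0.6) for s in scores)
```

Under that assumption a threshold of -0.3 fires on three of the five messages and -0.6 on two, so moving the threshold away from zero is what cuts down on triggers.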

Document upload returns 500 / fails to index

Check the file isn't password-protected (encrypted PDFs are rejected).
Check the document has selectable text. Scanned PDFs are OCR'd but quality varies; if OCR fails, the document is marked failed.
Try a smaller document — if a 30 MB PDF fails, the timeout may be the culprit. Split and re-upload.
Check the document content — corrupted PDFs from old scanners are surprisingly common.
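
Some of these checks can be run on a file before uploading it. The following is a heuristic sketch using only the standard library, not a real PDF parser: it looks for the `%PDF-` header, an `/Encrypt` entry, and a trailing `%%EOF` marker. The `max_bytes` default is an assumption taken from the 30 MB example above, not a documented limit, and the function name is hypothetical.

```python
def pdf_precheck(data: bytes, max_bytes: int = 30 * 1024 * 1024) -> list:
    """Return a list of likely problems; an empty list means nothing obvious was found."""
    problems = []
    if not data.startswith(b"%PDF-"):
        problems.append("missing %PDF- header (not a PDF, or corrupted)")
    if b"/Encrypt" in data:
        problems.append("contains /Encrypt (likely password-protected)")
    if b"%%EOF" not in data[-1024:]:
        problems.append("no %%EOF marker near the end (possibly truncated)")
    if len(data) > max_bytes:
        problems.append("larger than max_bytes (consider splitting)")
    return problems


# Tiny synthetic byte strings standing in for real files.
clean = pdf_precheck(b"%PDF-1.7\n...content...\n%%EOF\n")
suspect = pdf_precheck(b"%PDF-1.7\n/Encrypt 5 0 R\n")
```

In practice you would pass `open(path, "rb").read()`. A clean result does not guarantee that indexing succeeds, since scanned pages can still fail OCR.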

Tokens spent per conversation feel high

Check max_context_messages — default 20 is a lot of history. Drop to 10 if your conversations are usually short.
Check max_tokens_per_response — 500 is generous. Drop to 300 if responses don't need to be long.
Check knowledge_settings.top_k — each retrieved chunk costs tokens. Drop from 5 to 3 if your chunks are large.
Verbose tone burns tokens. verbosity: "concise" is the default for a reason.
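
A back-of-the-envelope estimate shows how these settings add up. The sketch below uses the values mentioned above; the average tokens per history message and per chunk are made-up assumptions, and it ignores the system prompt and tone instructions, so treat the numbers as relative rather than exact.

```python
def estimate_turn_tokens(max_context_messages, top_k, max_tokens_per_response,
                         avg_message_tokens=80, avg_chunk_tokens=400):
    """Rough worst-case tokens for one turn: history + retrieved chunks + reply."""
    history = max_context_messages * avg_message_tokens
    knowledge = top_k * avg_chunk_tokens
    return history + knowledge + max_tokens_per_response


before = estimate_turn_tokens(20, 5, 500)  # 1600 + 2000 + 500
after = estimate_turn_tokens(10, 3, 300)   # 800 + 1200 + 300
```

With these assumed averages the trimmed settings come to 2300 tokens per turn against 4100, and retrieved chunks are the largest single term, so top_k is often the first knob worth turning when chunks are large.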

Conversations not attributing to logged-in users

The widget's data-user-* attributes are advisory. Mint embed tokens server-side with visitor_id baked in.
Check the token's payload (decode it client-side in dev tools) — the visitor id should be there.
The widget caches its conversation id per visitor. If a logged-in user clears cookies, they'll start a new conversation under a new visitor id; expected behaviour.
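
If you would rather inspect the token from a script than from dev tools, the sketch below decodes the payload. It assumes the embed token is a standard three-part JWT, which this page does not confirm, and it does not verify the signature, so use it for debugging only. The demo token is built locally, and apart from visitor_id every claim and helper name is hypothetical.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a three-part JWT")
    segment = parts[1]
    padded = segment + "=" * (-len(segment) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (test helper)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Throwaway unsigned token used only to exercise the decoder.
demo_token = ".".join(
    [_b64url({"alg": "none"}), _b64url({"visitor_id": "user_123"}), "sig"]
)
payload = decode_jwt_payload(demo_token)
```

If visitor_id is missing from the decoded payload of a real token, the problem is in how your server mints it rather than in the widget.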

Webhook action fires but downstream never receives

ScaiGrid retries 3 times with exponential backoff. After that, the escalation event is marked delivery_failed in the conversation log.
Check your endpoint accepts POSTs with Content-Type: application/json and HMAC headers.
Test signature verification with a fixed timestamp + body — sometimes the body bytes differ between what ScaiGrid signs and what your framework receives (Content-Length differences from middleware).
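
The body-bytes problem can be reproduced with a fixed timestamp and body, as suggested above. The signing scheme in this sketch (hex HMAC-SHA256 over the timestamp, a dot, and the raw body) is an assumption for illustration; check the ScaiGrid webhook reference for the real header names and string-to-sign.

```python
import hashlib
import hmac
import json


def sign(secret: bytes, timestamp: str, raw_body: bytes) -> str:
    """ASSUMED scheme: hex HMAC-SHA256 over '<timestamp>.<raw body bytes>'."""
    message = timestamp.encode() + b"." + raw_body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp: str, raw_body: bytes, signature: str) -> bool:
    """Compare in constant time against the signature sent with the request."""
    return hmac.compare_digest(sign(secret, timestamp, raw_body), signature)


secret = b"test-secret"
timestamp = "1700000000"                      # fixed timestamp for a repeatable test
raw_body = b'{"event":"escalation","id":1}'   # exact bytes as they arrived
signature = sign(secret, timestamp, raw_body)

# A framework that parses and re-serialises JSON changes the bytes (added spaces).
reserialised = json.dumps(json.loads(raw_body)).encode()

ok_raw = verify(secret, timestamp, raw_body, signature)
ok_reserialised = verify(secret, timestamp, reserialised, signature)
```

Verification passes on the raw bytes and fails on the re-serialised body even though the JSON is equivalent, so always verify against the request body exactly as received, before any parsing middleware touches it.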

Conversations dropping after a few minutes of idleness

By design — conversations time out after 30 minutes of inactivity and the visitor sees the welcome message on their next interaction. Override per-bot with conversation_idle_timeout_seconds.