Platform
ScaiWave ScaiGrid ScaiCore ScaiBot ScaiDrive ScaiKey Models Tools & Services
Solutions
Organisations Developers Internet Service Providers Managed Service Providers AI-in-a-Box
Resources
Support Documentation Blog Downloads
Company
About Research Careers Investment Opportunities Contact
Log in

Architecture

ScaiBot is a thin product layer over ScaiGrid's existing primitives — inference, sessions, accounting, and the ScaiMatrix knowledge engine. There is no separate "bot engine"; a bot is a configuration that orchestrates these primitives.

Components#

flowchart LR V[Visitor] YS["Your site<br/>&lt;script&gt;..."] BC["Bot config<br/>Tone<br/>Knowledge<br/>Conv. log<br/>Escalation"] INF["ScaiGrid inference<br/>+ ScaiMatrix retrieval"] V -- HTTP page --> YS YS -- widget.js --> V V <-- chat messages (SSE stream) --> YS YS <-- WS/SSE --> BC BC --> INF subgraph SG ["ScaiGrid &mdash; /v1/modules/scaibot/..."] BC INF end

There's no separate ScaiBot deployment. ScaiBot is a ScaiGrid module — it runs in the same FastAPI process, behind the same auth, accounted against the same budgets.

Request flow for one chat turn#

  1. Widget sends POST /v1/modules/scaibot/chat with the bot id, conversation id (or a fresh one on the first turn), and the visitor's message.
  2. Auth validates the embed token, resolves it to the bot and visitor identity.
  3. Bot config + tone are loaded; together they produce the system prompt.
  4. Knowledge retrieval. If knowledge_enabled, the user's message is embedded and matched against the bot's knowledge collection (managed or linked). Top-K chunks are gathered.
  5. Escalation pre-check. Keyword and explicit-request rules fire here (before tokens are spent) — if matched, the bot returns the escalation message instead of generating.
  6. Inference. ScaiGrid's chat completion endpoint is called with the system prompt + retrieved chunks + recent conversation history. The response is streamed back to the widget over SSE.
  7. Escalation post-check. Intent, sentiment, and confidence rules run on the generated response — if matched, the action fires (email/webhook/Slack/queue).
  8. Accounting. Tokens, latency, retrieval count, and escalation status are recorded.
  9. Conversation store. The full turn (user message, retrieved chunks, generated response, escalation outcome) is persisted.

State#

  • Bots, tones, escalation rules, knowledge collections — in ScaiGrid's MariaDB.
  • Knowledge chunks + embeddings — in ScaiMatrix (Weaviate underneath, but you talk to ScaiMatrix's API).
  • Conversations + messages — partitioned tables in MariaDB; pruned by tenant retention policy.
  • Embed tokens — short-lived, signed; not persisted longer than necessary.
  • No client-side state matters — the widget only holds a conversation id in a cookie. Losing it just starts a new conversation.

How it differs from calling inference directly#

A direct ScaiGrid chat-completion call gives you tokens-out. ScaiBot adds:

Concern Direct call ScaiBot
System prompt You write it Generated from tone config
Knowledge retrieval You orchestrate it Built-in; toggled with a boolean
Conversation persistence You build it Built-in
Escalation You wire it Rule-based, built-in
Embeddable widget You build it Single <script> tag
Analytics You instrument it Built-in dashboards

For a one-shot completion or a custom UI, call inference directly. For a chatbot product, use ScaiBot — the savings are most of the surface area.

Where the trust boundary is#

The embed token authenticates the bot, not the visitor. The widget runs in the visitor's browser and is fully observable; do not embed admin credentials in it. Visitor identity (if any) is optional and is sent via data-user-id / data-user-email attributes, with the script tag's token as the only authority. For authenticated-visitor flows where you trust the visitor's identity, generate tokens server-side per visitor with their identity baked in.

Updated 2026-05-18 15:01:26 View source (.md) rev 17