Tune AI response mode

The two modes:

  • Streaming: tokens appear in one growing message as the AI types. Good for short replies, code blocks, anything you read top-to-bottom.
  • Conversational: the reply is split at paragraph boundaries and each paragraph arrives as a separate message. Good for long-form replies; the AI feels like someone typing follow-ups rather than dumping a wall of text.

The default mode for new rooms is conversational. You can override it globally, per room, or per AI within a room.

Where to change it

Per (room × AI)

Chat header → mode toggle (one click for 1-AI rooms; per-AI dropdown for multi-AI rooms).

Via slash command:

```bash
/mode chat            # → conversational
/mode continuous      # → streaming
/mode <ai-name> chat  # multi-AI per-AI form
```
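The keyword-to-mode mapping in these commands can be sketched as a small parser. This is an illustration only: `parse_mode_command` and `ModeCommand` are hypothetical names, and the sketch assumes an AI name is a single token and that the per-AI form accepts `continuous` as well as `chat`.

```python
from typing import NamedTuple, Optional

# Keyword-to-mode mapping taken from the slash commands above.
MODE_KEYWORDS = {"chat": "conversational", "continuous": "streaming"}


class ModeCommand(NamedTuple):
    ai_name: Optional[str]  # None when the command names no AI
    mode: str               # "conversational" or "streaming"


def parse_mode_command(text: str) -> ModeCommand:
    """Parse '/mode <keyword>' or '/mode <ai-name> <keyword>' (hypothetical helper)."""
    parts = text.split()
    if len(parts) not in (2, 3) or parts[0] != "/mode":
        raise ValueError(f"not a /mode command: {text!r}")
    keyword = parts[-1]
    if keyword not in MODE_KEYWORDS:
        raise ValueError(f"unknown mode keyword: {keyword!r}")
    # Assumption: an AI name is a single whitespace-free token.
    ai_name = parts[1] if len(parts) == 3 else None
    return ModeCommand(ai_name, MODE_KEYWORDS[keyword])
```

For example, `parse_mode_command("/mode continuous")` yields a command with no AI name and the mode `streaming`.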

Via the API:

```bash
PUT /v1/rooms/{room_id}/ai/response-mode
{ "mode": "streaming", "participant_id": "<ai-id-or-null>" }
```

Setting participant_id to null flips every AI in the room.
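As a rough sketch, the same call can be prepared from Python with the standard library. The host in `BASE_URL`, the bearer-token header, and the helper name `build_response_mode_request` are assumptions, not part of the documented API; the mode value `"conversational"` is also assumed, since only `"streaming"` appears in the request above.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://chat.example.com"  # placeholder host (assumption)


def build_response_mode_request(
    room_id: str, mode: str, participant_id: Optional[str], token: str
) -> urllib.request.Request:
    """Build (but do not send) the PUT request for a room's response mode."""
    # "conversational" as a wire value is an assumption; only "streaming" is shown.
    if mode not in ("streaming", "conversational"):
        raise ValueError(f"unknown mode: {mode!r}")
    # participant_id=None serializes to JSON null, i.e. a room-wide flip.
    body = json.dumps({"mode": mode, "participant_id": participant_id})
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/rooms/{room_id}/ai/response-mode",
        data=body.encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # auth scheme is an assumption
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; building the request separately keeps the sketch inspectable without network access.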

Global user default

Settings → AI → Default response mode. Affects every new room you join. Existing rooms keep their per-room setting.

When room-wide flips persist to your profile default

A room-wide mode flip (one with no participant_id) also updates your profile default. Per-AI flips do not: tweaking one AI in one room shouldn't change your global preference. This is deliberate.
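The persistence rule can be modelled in a few lines. This is a toy model of the behaviour described here, not the server's implementation; `Profile` and `flip_mode` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Profile:
    default_mode: str = "conversational"
    # room id -> AI id -> mode (hypothetical in-memory layout)
    room_modes: Dict[str, Dict[str, str]] = field(default_factory=dict)


def flip_mode(
    profile: Profile, room_id: str, mode: str, participant_id: Optional[str] = None
) -> None:
    """Apply a mode flip and decide whether it reaches the profile default."""
    ais = profile.room_modes.setdefault(room_id, {})
    if participant_id is None:
        # Room-wide flip: every AI in the room changes, and so does the default.
        for ai_id in ais:
            ais[ai_id] = mode
        profile.default_mode = mode
    else:
        # Per-AI flip: only this AI changes; the profile default is untouched.
        ais[participant_id] = mode
```

Flipping a single AI to streaming leaves `default_mode` at its previous value, while a flip without a participant changes both the room and the default.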

Heuristic for picking

Pick streaming when:

  • Replies are typically short (< 200 words).
  • You want to read top-to-bottom as it generates.
  • The reply is mostly code — you want to see the code form line by line.

Pick conversational when:

  • Replies are long and structured.
  • You want to interrupt mid-flow — easier to react to message 2 of 5 than to a single growing wall.
  • The AI is doing iterative reasoning ("first I'll …, then I'll …").
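The two checklists above can be condensed into a small decision function. `suggest_mode` is a hypothetical helper, and the order in which the criteria are applied is an assumption, since the checklists do not rank them.

```python
def suggest_mode(
    expected_words: int,
    mostly_code: bool = False,
    want_to_interrupt: bool = False,
    iterative_reasoning: bool = False,
) -> str:
    """Return "streaming" or "conversational" for a room's typical reply."""
    # Precedence is an assumption: code first, then interaction style, then length.
    if mostly_code:
        return "streaming"
    if want_to_interrupt or iterative_reasoning:
        return "conversational"
    # The 200-word threshold comes from the checklist above.
    return "streaming" if expected_words < 200 else "conversational"
```

Treat the result as a starting point; the mode toggle makes it cheap to try the other option.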

How "paragraph boundary" detection works

In conversational mode, the streaming response is buffered. When the text crosses a paragraph boundary (a blank line, the end of a fenced code block, or the end of a list item followed by a blank line), the buffer flushes as a new message. There is also a maximum-length safeguard, so a single very long paragraph is still sent rather than held in the buffer indefinitely.

If the model decides to stop mid-paragraph, that final buffer flushes regardless.
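The buffering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's implementation: `ParagraphBuffer` and the 2000-character default are hypothetical, a list item followed by a blank line is handled by the blank-line rule, and the length safeguard is skipped inside a code fence so a block is not split in two.

```python
from typing import List

FENCE = "`" * 3  # marker that opens and closes a fenced code block


class ParagraphBuffer:
    """Buffers streamed text and releases it one paragraph at a time."""

    def __init__(self, max_chars: int = 2000) -> None:
        self.max_chars = max_chars   # hypothetical safety limit
        self._lines: List[str] = []  # complete lines of the pending message
        self._partial = ""           # text received after the last newline
        self._in_fence = False

    def feed(self, chunk: str) -> List[str]:
        """Add streamed text; return the messages that are ready to send."""
        out: List[str] = []
        self._partial += chunk
        while "\n" in self._partial:
            line, self._partial = self._partial.split("\n", 1)
            if line.strip().startswith(FENCE):
                self._in_fence = not self._in_fence
                self._lines.append(line)
                if not self._in_fence:  # the fence just closed
                    self._flush(out)
            elif not line.strip() and not self._in_fence:
                self._flush(out)        # a blank line ends the paragraph
            else:
                self._lines.append(line)
        # Safeguard: never hold an over-long paragraph indefinitely.
        pending = sum(len(l) + 1 for l in self._lines) + len(self._partial)
        if not self._in_fence and pending >= self.max_chars:
            self._lines.append(self._partial)
            self._partial = ""
            self._flush(out)
        return out

    def finish(self) -> List[str]:
        """Flush whatever is left when the model stops, even mid-paragraph."""
        out: List[str] = []
        if self._partial:
            self._lines.append(self._partial)
            self._partial = ""
        self._in_fence = False
        self._flush(out)
        return out

    def _flush(self, out: List[str]) -> None:
        text = "\n".join(self._lines).strip("\n")
        self._lines = []
        if text:
            out.append(text)
```

Feeding chunks as they arrive and calling `finish()` at the end of the stream yields one message per paragraph, with a fenced block kept in one piece along with any text that directly precedes it.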

Sampling parameters

Mode is one knob; the others (temperature, top_p, max_tokens, frequency_penalty, presence_penalty, reasoning_effort) are set per room too. Change them under Engagement → Sampling or via the API:

```bash
PUT /v1/rooms/{room_id}/ai/temperature
{ "value": 0.7, "participant_id": null }
```

The prompt studio is the easiest way to dial these in, because you can preview the effect before applying it.
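For scripted changes, the sampling request can be assembled in Python. Only the temperature path is documented here; this sketch assumes the other parameters use the same `/ai/{name}` pattern, which has not been confirmed. `sampling_request` is a hypothetical helper that returns the request parts without sending anything.

```python
import json
from typing import Any, Dict, Optional

# Parameter names as listed in the text above.
SAMPLING_PARAMS = {
    "temperature",
    "top_p",
    "max_tokens",
    "frequency_penalty",
    "presence_penalty",
    "reasoning_effort",
}


def sampling_request(
    room_id: str, name: str, value: Any, participant_id: Optional[str] = None
) -> Dict[str, str]:
    """Describe the PUT request that sets one sampling parameter."""
    if name not in SAMPLING_PARAMS:
        raise ValueError(f"unknown sampling parameter: {name!r}")
    return {
        "method": "PUT",
        # Assumption: parameters other than temperature share this path pattern.
        "path": f"/v1/rooms/{room_id}/ai/{name}",
        # participant_id=None becomes JSON null, as in the request above.
        "body": json.dumps({"value": value, "participant_id": participant_id}),
    }
```

Calling `sampling_request("room-1", "temperature", 0.7)` reproduces the documented request for a room with the id `room-1`.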

Updated 2026-05-18 12:07:10 (rev 2)