Tune AI response mode
The two modes:
- Streaming: tokens appear in one growing message as the AI types. Good for short replies, code blocks, anything you read top-to-bottom.
- Conversational: paragraphs flush as separate messages on boundary detection. Good for long-form replies; the AI feels like someone typing follow-ups rather than dumping a wall.
The default mode for new rooms is conversational. You can override it
globally, per room, or per AI within a room.
Where to change it
Per (room × AI)
Chat header → mode toggle (one click for 1-AI rooms; per-AI dropdown for multi-AI rooms).
Via slash command:
Via the API:
participant_id=null flips every AI in the room.
Global user default
Settings → AI → Default response mode. Affects every new room you join. Existing rooms keep their per-room setting.
When room-wide flips persist to your profile default
A room-wide mode flip (no participant_id) updates your profile
default. Per-AI flips do not — tweaking one AI in one room
shouldn't change your global preference. This is deliberate.
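The rule above can be modeled in a few lines. This is a minimal sketch, not the product's implementation: the `Profile` and `Room` classes and the `flip_mode` function are hypothetical names invented for illustration. Only `participant_id` and the two mode names come from this page.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

MODES = {"streaming", "conversational"}


@dataclass
class Profile:
    # Hypothetical stand-in for the user's global default.
    default_mode: str = "conversational"


@dataclass
class Room:
    # Hypothetical stand-in: maps each AI participant id to its mode.
    ai_modes: Dict[str, str] = field(default_factory=dict)


def flip_mode(profile: Profile, room: Room, mode: str,
              participant_id: Optional[str] = None) -> None:
    """Apply a mode flip following the persistence rule described above."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if participant_id is None:
        # Room-wide flip: every AI in the room changes, and the
        # profile default follows.
        for ai in room.ai_modes:
            room.ai_modes[ai] = mode
        profile.default_mode = mode
    else:
        # Per-AI flip: only this AI changes; the profile is untouched.
        if participant_id not in room.ai_modes:
            raise KeyError(participant_id)
        room.ai_modes[participant_id] = mode
```

Calling `flip_mode(profile, room, "streaming", participant_id="a")` changes only AI `a`, while `flip_mode(profile, room, "streaming")` changes every AI in the room and the profile default.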
Heuristic for picking
Pick streaming when:
- Replies are typically short (< 200 words).
- You want to read top-to-bottom as it generates.
- The reply is mostly code — you want to see the code form line by line.
Pick conversational when:
- Replies are long and structured.
- You want to interrupt mid-flow — easier to react to message 2 of 5 than to a single growing wall.
- The AI is doing iterative reasoning ("first I'll …, then I'll …").
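The two lists above can be condensed into a small decision helper. This is only an illustrative sketch: `suggest_mode` and its parameters are hypothetical, and the tie-breaking order (conversational signals win over streaming signals) is an assumption, not something this page specifies.

```python
def suggest_mode(expected_words: int,
                 mostly_code: bool = False,
                 want_to_interrupt: bool = False,
                 iterative_reasoning: bool = False) -> str:
    """Return "streaming" or "conversational" from the heuristics above."""
    # Assumption: wanting to interrupt, or step-by-step reasoning,
    # outweighs the streaming signals.
    if want_to_interrupt or iterative_reasoning:
        return "conversational"
    # Short replies (< 200 words) and code-heavy replies read well
    # as one growing message.
    if mostly_code or expected_words < 200:
        return "streaming"
    # Long, structured replies default to separate messages.
    return "conversational"
```

For example, `suggest_mode(50)` picks streaming, while `suggest_mode(800, iterative_reasoning=True)` picks conversational.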
How "paragraph boundary" detection works
In conversational mode, the streamed response is buffered. When the text crosses a paragraph boundary (a blank line, the end of a fenced code block, or the end of a list item followed by a blank line), the buffer flushes as a new message. A maximum-length safeguard also forces a flush, so a single very long paragraph still gets sent.
If the model decides to stop mid-paragraph, that final buffer flushes regardless.
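The buffering behavior described above might look roughly like the following. This is a simplified sketch under stated assumptions, not the actual detector: the `ParagraphBuffer` class, its method names, and the 2000-character default are all hypothetical. It treats a list item followed by a blank line as an ordinary blank-line boundary, and it skips the length check while inside a fenced code block so a block is never split.

```python
from typing import List


class ParagraphBuffer:
    """Buffer streamed text and emit one message per paragraph boundary."""

    def __init__(self, max_chars: int = 2000) -> None:
        self.max_chars = max_chars      # assumed safety limit
        self._lines: List[str] = []     # complete lines of the current message
        self._partial = ""              # trailing text with no newline yet
        self._in_fence = False          # inside a ``` fenced block?

    def _flush(self) -> List[str]:
        text = "".join(self._lines).strip("\n")
        self._lines = []
        return [text] if text.strip() else []

    def feed(self, chunk: str) -> List[str]:
        """Add streamed text; return any messages that are ready to send."""
        out: List[str] = []
        self._partial += chunk
        while "\n" in self._partial:
            line, self._partial = self._partial.split("\n", 1)
            if line.lstrip().startswith("```"):
                self._in_fence = not self._in_fence
                self._lines.append(line + "\n")
                if not self._in_fence:
                    out += self._flush()    # closing fence ends the message
            elif not self._in_fence and line.strip() == "":
                out += self._flush()        # blank line ends the paragraph
            else:
                self._lines.append(line + "\n")
        # Max-length safety: send an over-long paragraph as-is.
        size = sum(len(l) for l in self._lines) + len(self._partial)
        if not self._in_fence and size >= self.max_chars:
            self._lines.append(self._partial)
            self._partial = ""
            out += self._flush()
        return out

    def finish(self) -> List[str]:
        """The model stopped: flush whatever is left, even mid-paragraph."""
        self._lines.append(self._partial)
        self._partial = ""
        self._in_fence = False
        return self._flush()
```

A caller would pass each streamed chunk to `feed()` and post every returned string as its own message, then call `finish()` once the stream ends to send the remainder.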
Sampling parameters
Mode is one knob; the others (temperature, top_p, max_tokens,
frequency_penalty, presence_penalty, reasoning_effort) are also set
per room. Set them under Engagement → Sampling or via:
The prompt studio is the easiest way to dial these in — preview before applying.
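As a rough illustration of how the per-room sampling knobs could be grouped in client code, the sketch below collects them in one object and keeps only the values that were explicitly set. The `SamplingOverrides` class and its `to_payload` method are hypothetical. Only the six field names come from this page, and the value types are assumptions.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class SamplingOverrides:
    # Field names are from the list above; the types are assumed.
    temperature: Optional[float] = None
    top_p: Optional[float] = None
    max_tokens: Optional[int] = None
    frequency_penalty: Optional[float] = None
    presence_penalty: Optional[float] = None
    reasoning_effort: Optional[str] = None

    def to_payload(self) -> Dict[str, Any]:
        """Keep only the knobs that were set, leaving the rest at room defaults."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

With this shape, `SamplingOverrides(temperature=0.2, max_tokens=512).to_payload()` contains just those two keys, so unset parameters are not overwritten.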