Tutorial: Multi-model flow
Use a cheap model for triage/classification and a smart model for response generation, all in one flow.
Why
LLM-driven classification on every inbound message gets expensive fast if you point your premium model at it. A smaller model (GPT-4o-mini, Haiku, etc.) is usually accurate enough for routing intent + extracting fields, freeing the premium model for the steps that actually need its reasoning.
ScaiFlow's model registry makes this a two-line change.
Steps
1. Open or create a flow
Use any flow with at least one LLM block. For this tutorial, the Customer Support example already uses two roles — open it via Catalog… → Examples → Customer Support.
2. Declare two roles in the registry
In the Flow properties panel, expand Models:
- The first row is pinned as `primary`. Pick a smart model, e.g. `openai/gpt-4o` from your ScaiGrid catalog.
- Click + add model. A new row appears with role `fast`. Pick a cheap model: `openai/gpt-4o-mini` or whatever your catalog lists.
The resulting registry on the wire:
```json
"models": [
  { "role": "primary", "ref": "scaigrid", "model": "openai/gpt-4o",
    "modalities": ["text", "structured_output"] },
  { "role": "fast", "ref": "scaigrid", "model": "openai/gpt-4o-mini",
    "modalities": ["text"] }
]
```
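To make the registry shape concrete, here is a minimal Python sketch that parses the two entries from this step and indexes them by role. The helper name `index_by_role` is hypothetical and not part of ScaiFlow; it only illustrates that role names act as unique keys into the registry.

```python
import json

# The registry fragment from this step, wrapped in braces so it parses as JSON.
REGISTRY_JSON = """
{
  "models": [
    {"role": "primary", "ref": "scaigrid", "model": "openai/gpt-4o",
     "modalities": ["text", "structured_output"]},
    {"role": "fast", "ref": "scaigrid", "model": "openai/gpt-4o-mini",
     "modalities": ["text"]}
  ]
}
"""


def index_by_role(models):
    """Map role name -> registry entry (hypothetical helper, not a ScaiFlow API)."""
    by_role = {}
    for entry in models:
        role = entry["role"]
        if role in by_role:
            raise ValueError(f"duplicate role in registry: {role!r}")
        by_role[role] = entry
    return by_role


registry = index_by_role(json.loads(REGISTRY_JSON)["models"])
print(registry["fast"]["model"])     # openai/gpt-4o-mini
print(registry["primary"]["model"])  # openai/gpt-4o
```

Rejecting duplicate role names is a choice made for this sketch; the tutorial does not say how the registry editor handles them.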
3. Pick the right role per LLM block
Click each LLM node and change its Model role dropdown:
- Classifier node (Flexible Prompt): role `fast`. Quick intent extraction; cheap.
- Response generator node (Guarded Prompt): role `primary`. Quality matters; cost is amortized over the value of the response.
The dropdown is reactive — adding/renaming a role in the registry immediately updates every LLM node's dropdown.
4. (Optional) Add a fallback chain
For mission-critical roles, configure a fallback in the registry. Click the role's row → Advanced → Fallback chain → + add fallback. Pick a backup model. In this example the first fallback goes direct to OpenAI and the second routes a different model through ScaiGrid:
```json
{ "role": "primary",
  "ref": "scaigrid",
  "model": "openai/gpt-4o",
  "fallback": [
    { "ref": "openai", "model": "gpt-4-turbo" },
    { "ref": "scaigrid", "model": "anthropic/claude-3-opus" }
  ] }
```
The runtime tries the primary first; on provider errors it walks the fallback list in order.
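The try-primary-then-walk-the-list behaviour can be sketched in Python as follows. This is an illustration under stated assumptions, not ScaiCore's implementation: `ProviderError`, `call_with_fallback`, and the `call_model` callback are hypothetical names, and treating only `ProviderError` as a "provider error" is an assumption of the sketch.

```python
class ProviderError(Exception):
    """Hypothetical stand-in for an upstream provider failure."""


def call_with_fallback(role_entry, prompt, call_model):
    """Try the role's own model first, then each fallback in listed order."""
    candidates = [{"ref": role_entry["ref"], "model": role_entry["model"]}]
    candidates += role_entry.get("fallback", [])
    errors = []
    for candidate in candidates:
        try:
            return call_model(candidate["ref"], candidate["model"], prompt)
        except ProviderError as exc:
            # Record the failure and move on to the next candidate.
            errors.append(f"{candidate['ref']}:{candidate['model']}: {exc}")
    raise ProviderError("all candidates failed: " + "; ".join(errors))


primary = {
    "role": "primary", "ref": "scaigrid", "model": "openai/gpt-4o",
    "fallback": [
        {"ref": "openai", "model": "gpt-4-turbo"},
        {"ref": "scaigrid", "model": "anthropic/claude-3-opus"},
    ],
}

attempts = []


def fake_call(ref, model, prompt):
    """Test double: simulates an outage on the ScaiGrid-hosted gpt-4o only."""
    attempts.append((ref, model))
    if (ref, model) == ("scaigrid", "openai/gpt-4o"):
        raise ProviderError("simulated outage")
    return f"[{ref}/{model}] reply to: {prompt}"


result = call_with_fallback(primary, "Where is my order?", fake_call)
print(result)    # the reply comes from the first fallback
print(attempts)  # primary was tried first, then openai/gpt-4-turbo
```

Because the first fallback succeeds, the second one is never contacted, which matches the "in order" wording above.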
5. (Optional) Set per-role temperature
Different roles often want different temperatures. Click a role → Advanced → Temperature:
- Classifier (`fast`): `0.2` (deterministic-ish; you don't want intent labels to swing).
- Response generator (`primary`): `0.7` (default; a bit of variation in phrasing).
- Critic (if you add one): `0.1` (near-deterministic verdicts).
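As a rough sketch of how these settings might resolve at call time, the Python below looks up a per-role temperature and falls back to the default when a role has no override. The `temperature` key, the `critic` role's model, and the `temperature_for` helper are assumptions made for illustration; the tutorial does not show the stored field name.

```python
DEFAULT_TEMPERATURE = 0.7  # the value this step lists as the default

# Assumption: each registry entry carries an optional "temperature" key.
registry = {
    "fast": {"model": "openai/gpt-4o-mini", "temperature": 0.2},
    "primary": {"model": "openai/gpt-4o"},  # no override, so the default applies
    "critic": {"model": "openai/gpt-4o-mini", "temperature": 0.1},  # placeholder model
}


def temperature_for(role):
    """Return the role's temperature, or the default if none is set."""
    return registry[role].get("temperature", DEFAULT_TEMPERATURE)


for role in registry:
    print(role, temperature_for(role))
```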
6. Compile + verify
In the live preview pane (right sidebar, bottom), the YAML manifest now carries both roles under `core.scaiflow_meta.models`. The IR shape (via `scaiflow compile flow.json --format ir`) puts them under `manifest.models[]`.
7. Deploy
Standard deploy. ScaiCore's runtime reads `manifest.models`, registers both as `@models` declarations, and routes each LLM block's call to the right one based on its `llm_role`.
Compile-time safety
If you remove a role from the registry that a node still references, compilation fails.
The canvas surfaces this inline too — the LLM node's role dropdown renders the orphan role in select-error styling with a pointer back to Flow → Models.
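The check itself is simple to picture. The Python sketch below finds nodes whose role is no longer declared; it is not the ScaiFlow compiler's code, and the node shape (`id` plus `llm_role`), the `find_orphan_roles` helper, and the printed message are all hypothetical.

```python
def find_orphan_roles(registry_roles, nodes):
    """Return (node_id, role) pairs whose role is missing from the registry."""
    known = set(registry_roles)
    return [(n["id"], n["llm_role"]) for n in nodes if n["llm_role"] not in known]


nodes = [
    {"id": "classifier", "llm_role": "fast"},
    {"id": "response_generator", "llm_role": "primary"},
]

# Simulate deleting the "fast" role while the classifier still references it.
orphans = find_orphan_roles(["primary"], nodes)
for node_id, role in orphans:
    # Illustrative message only; the real compiler output is not reproduced here.
    print(f"node {node_id!r} references undeclared role {role!r}")
```

Running the same check before every compile is what lets the canvas flag the problem on the node rather than at deploy time.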
What you learned
- The model registry lives at flow level; LLM nodes reference roles by name.
- The first registry entry is the implicit `primary` default.
- The catalog picker pulls from your tenant's ScaiGrid frontend models; "Custom" is the escape hatch for legacy/uncatalogued slugs.
- Fallback chains + per-role temperature + per-role system context are all optional knobs in the registry editor's Advanced section.
Next
- Concepts: Models and the registry — the full registry shape, IR field names, YAML round-trip.