Tutorial: Multi-model flow

Use a cheap model for triage/classification and a smart model for response generation, all in one flow.

Why

LLM-driven classification on every inbound message gets expensive fast if you point your premium model at it. A smaller model (GPT-4o-mini, Haiku, etc.) is usually accurate enough for intent routing and field extraction, which frees the premium model for the steps that actually need its reasoning.

ScaiFlow's model registry makes this a two-line change.

Steps

1. Open or create a flow

Use any flow with at least one LLM block. For this tutorial, the Customer Support example already uses two roles — open it via Catalog… → Examples → Customer Support.

2. Declare two roles in the registry

In the Flow properties panel, expand Models:

The first row is pinned as primary. Pick a smart model — e.g. openai/gpt-4o from your ScaiGrid catalog.
Click + add model. A new row appears with role fast. Pick a cheap model — openai/gpt-4o-mini or whatever your catalog lists.

The resulting registry on the wire:

jsonc

"models": [
  { "role": "primary", "ref": "scaigrid", "model": "openai/gpt-4o",
    "modalities": ["text", "structured_output"] },
  { "role": "fast", "ref": "scaigrid", "model": "openai/gpt-4o-mini",
    "modalities": ["text"] }
]
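
As a rough illustration, the registry above can be sanity-checked outside the editor. The sketch below is plain Python, not ScaiFlow code: `check_registry` is a hypothetical helper, and the rules it enforces (first entry named primary, unique role names, the three keys shown above present) are assumptions drawn from this tutorial rather than a documented schema.

```python
import json

# Registry as shown above, wrapped in braces so it parses as plain JSON.
REGISTRY_JSON = """
{
  "models": [
    { "role": "primary", "ref": "scaigrid", "model": "openai/gpt-4o",
      "modalities": ["text", "structured_output"] },
    { "role": "fast", "ref": "scaigrid", "model": "openai/gpt-4o-mini",
      "modalities": ["text"] }
  ]
}
"""


def check_registry(models):
    """Return a list of problems found in a registry (hypothetical helper)."""
    problems = []
    roles = [entry.get("role") for entry in models]
    # Assumption from this tutorial: the first row is pinned as primary.
    if not roles or roles[0] != "primary":
        problems.append("first entry should have role 'primary'")
    duplicates = sorted(
        {r for r in roles if r is not None and roles.count(r) > 1}
    )
    if duplicates:
        problems.append(f"duplicate roles: {duplicates}")
    for entry in models:
        missing = [key for key in ("role", "ref", "model") if key not in entry]
        if missing:
            problems.append(f"entry {entry!r} is missing {missing}")
    return problems


models = json.loads(REGISTRY_JSON)["models"]
problems = check_registry(models)
print(problems or "registry looks consistent")
```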

3. Pick the right role per LLM block

Click each LLM node and change its Model role dropdown:

Classifier node (Flexible Prompt): role fast. Quick intent extraction; cheap.
Response generator node (Guarded Prompt): role primary. Quality matters here, and the higher per-call cost is justified by the value of the response.

The dropdown is reactive — adding/renaming a role in the registry immediately updates every LLM node's dropdown.
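
Conceptually, each LLM node stores only a role name, and the model behind that name is looked up in the registry. A minimal sketch of that indirection, assuming hypothetical node records (only the `llm_role` field name comes from this tutorial; the node ids and `resolve_model` are made up for illustration):

```python
# Registry entries keyed by role name, mirroring the "models" array from step 2.
registry = {
    "primary": {"ref": "scaigrid", "model": "openai/gpt-4o"},
    "fast": {"ref": "scaigrid", "model": "openai/gpt-4o-mini"},
}

# Hypothetical node records; the ids are illustrative only.
nodes = [
    {"id": "classifier", "llm_role": "fast"},
    {"id": "response_generator", "llm_role": "primary"},
]


def resolve_model(node, registry):
    """Look up the model slug that a node's role points at."""
    return registry[node["llm_role"]]["model"]


routing = {node["id"]: resolve_model(node, registry) for node in nodes}
print(routing)
```

Because nodes hold a role rather than a model slug, swapping the model behind fast is a registry edit and leaves every node untouched.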

4. (Optional) Add a fallback chain

For mission-critical roles, configure a fallback in the registry. Click the role's row → Advanced → Fallback chain → + add fallback. Pick a backup model:

jsonc

{ "role": "primary",
  "ref": "scaigrid",
  "model": "openai/gpt-4o",
  "fallback": [
    { "ref": "openai", "model": "gpt-4-turbo" },   // direct-to-OpenAI as backup
    { "ref": "scaigrid", "model": "anthropic/claude-3-opus" }
  ] }

The runtime tries the primary first; on provider errors it walks the fallback list in order.
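
The ordering described above can be sketched as a simple loop. This is an illustration of the behaviour, not the ScaiCore runtime: `ProviderError`, `call_with_fallback`, and `fake_call` are hypothetical names, and which errors actually count as provider errors is not specified here.

```python
class ProviderError(Exception):
    """Stand-in for whatever error a provider call raises (hypothetical)."""


def call_with_fallback(role_entry, call_model):
    """Try the role's own model first, then each fallback in listed order."""
    candidates = [{"ref": role_entry["ref"], "model": role_entry["model"]}]
    candidates += role_entry.get("fallback", [])
    errors = []
    for candidate in candidates:
        try:
            return call_model(candidate["ref"], candidate["model"])
        except ProviderError as exc:
            errors.append(f"{candidate['ref']}:{candidate['model']}: {exc}")
    raise ProviderError("all candidates failed: " + "; ".join(errors))


primary = {
    "role": "primary",
    "ref": "scaigrid",
    "model": "openai/gpt-4o",
    "fallback": [
        {"ref": "openai", "model": "gpt-4-turbo"},
        {"ref": "scaigrid", "model": "anthropic/claude-3-opus"},
    ],
}


def fake_call(ref, model):
    """Simulated provider in which the first-choice model is unavailable."""
    if model == "openai/gpt-4o":
        raise ProviderError("simulated outage")
    return f"answered by {ref}/{model}"


result = call_with_fallback(primary, fake_call)
print(result)
```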

5. (Optional) Set per-role temperature

Different roles often want different temperatures. Click a role → Advanced → Temperature:

Classifier (fast): 0.2 (deterministic-ish; don't want intent labels to swing).
Response generator (primary): 0.7 (default; bit of variation in phrasing).
Critic (if you add one): 0.1 (deterministic verdicts).
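
One way to picture these settings is as per-role overrides on top of a default. The sketch below assumes a `temperature` key and a `critic` role name purely for illustration; the actual field name in the registry is not shown in this tutorial.

```python
DEFAULT_TEMPERATURE = 0.7  # the default quoted above for the response generator

# Hypothetical per-role overrides; the "temperature" key name is an assumption.
role_settings = {
    "fast": {"temperature": 0.2},    # classifier: stable intent labels
    "primary": {},                   # response generator: keeps the default
    "critic": {"temperature": 0.1},  # optional critic: stable verdicts
}


def temperature_for(role):
    """Return the role's override, or the default when none is set."""
    return role_settings.get(role, {}).get("temperature", DEFAULT_TEMPERATURE)


for role in role_settings:
    print(role, temperature_for(role))
```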

6. Compile + verify

In the live preview pane (right sidebar, bottom), the YAML manifest now carries both roles under core.scaiflow_meta.models. The IR shape (via scaiflow compile flow.json --format ir) puts them under manifest.models[]:

jsonc

{
  "manifest": {
    "models": [
      {"role": "primary", "provider": "scaigrid", "model": "openai/gpt-4o", ...},
      {"role": "fast", "provider": "scaigrid", "model": "openai/gpt-4o-mini", ...}
    ],
    ...
  }
}
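
To verify the compiled output in a script rather than by eye, the roles can be read back out of the IR. This sketch assumes the IR is JSON shaped as shown above; `roles_in_ir` is a hypothetical helper, and `sample_ir` is a trimmed stand-in for real compiler output.

```python
import json


def roles_in_ir(ir):
    """Map each declared role to its model slug, reading manifest.models[]."""
    return {entry["role"]: entry["model"] for entry in ir["manifest"]["models"]}


# Trimmed stand-in for compiler output. With a real flow, the dict could come
# from json.load() on a file holding the IR (assuming the IR is emitted as JSON).
sample_ir = json.loads("""
{
  "manifest": {
    "models": [
      {"role": "primary", "provider": "scaigrid", "model": "openai/gpt-4o"},
      {"role": "fast", "provider": "scaigrid", "model": "openai/gpt-4o-mini"}
    ]
  }
}
""")

declared = roles_in_ir(sample_ir)
missing = {"primary", "fast"} - declared.keys()
print(declared)
print("missing roles:", sorted(missing))
```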

7. Deploy

Standard deploy. ScaiCore's runtime reads manifest.models, registers both as @models declarations, and routes each LLM block's call to the right one based on its llm_role.

Compile-time safety

If you remove a role from the registry that a node references, compile fails with:

text

llm_role validation failed
  node 'node_xyz': llm_role='fast' is not declared in
  flow.config.core_identity.models (declared: ['primary'])

The canvas surfaces this inline too — the LLM node's role dropdown renders the orphan role in select-error styling with a pointer back to Flow → Models.
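
The compile-time check can be pictured as a membership test of each node's role against the declared roles. The sketch below reproduces the message format shown above but is not the compiler's implementation; `validate_llm_roles` and the node records are hypothetical.

```python
def validate_llm_roles(nodes, declared_roles):
    """Raise ValueError for the first node whose role is not declared."""
    declared = list(declared_roles)
    for node in nodes:
        role = node["llm_role"]
        if role not in declared:
            raise ValueError(
                "llm_role validation failed\n"
                f"  node '{node['id']}': llm_role='{role}' is not declared in\n"
                f"  flow.config.core_identity.models (declared: {declared})"
            )


# A flow where the 'fast' role was removed but one node still references it.
nodes = [
    {"id": "node_abc", "llm_role": "primary"},
    {"id": "node_xyz", "llm_role": "fast"},
]

try:
    validate_llm_roles(nodes, ["primary"])
    message = ""
except ValueError as exc:
    message = str(exc)

print(message)
```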

What you learned

The model registry lives at flow level; LLM nodes reference roles by name.
The first registry entry is the implicit primary default.
The catalog picker pulls from your tenant's ScaiGrid frontend models; "Custom" is the escape hatch for legacy/uncatalogued slugs.
Fallback chains, per-role temperature, and per-role system context are all optional knobs in the registry editor's Advanced section.

Next

Concepts: Models and the registry — the full registry shape, IR field names, YAML round-trip.