Models and the registry

A flow declares its model registry once, at the flow level, under core_identity.models. Each entry is keyed by a role alias (primary, fast, voice, critic, and so on) that LLM blocks reference by name via llm_role.

Shape

jsonc

"core_identity": {
  "name": "My Core",
  "models": [
    {
      "role": "primary",
      "ref": "scaigrid",
      "model": "openai/gpt-4o",
      "modalities": ["text", "structured_output"],
      "temperature": 0.5
    },
    {
      "role": "fast",
      "ref": "scaigrid",
      "model": "openai/gpt-4o-mini",
      "modalities": ["text"],
      "fallback": [{ "ref": "openai", "model": "gpt-3.5-turbo" }]
    }
  ]
}

Required fields

role — lowercase identifier matching ^[a-z][a-z0-9_]*$. Must be unique within the registry. LLM nodes reference this via llm_role (or the legacy model alias on node.config, which the compiler still accepts).
ref — provider key. Common values: scaigrid (the ScaiLabs aggregator; pick this for anything in your tenant's ScaiGrid catalog), scaiinfer (legacy/direct ScaiInfer endpoint), openai, partner-specific keys.
model — model identifier within the provider. For ref: "scaigrid", this is the slug ScaiGrid uses (openai/gpt-4o, scailabs/poolnoodle-omni, tenant/{slug}/...).
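
The role rules above (pattern plus uniqueness) are easy to check before a flow ever reaches the compiler. The following is a minimal sketch, not the compiler's actual validator: check_registry and its message wording are hypothetical names chosen for illustration, and only the regex and the three required field names come from this page.

```python
import re

# Pattern quoted in the "role" field description above.
ROLE_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")
REQUIRED_FIELDS = ("role", "ref", "model")


def check_registry(models):
    """Return a list of problems; an empty list means the registry passed.

    Hypothetical helper: the real schema validation may report differently.
    """
    problems = []
    seen = set()
    for index, entry in enumerate(models):
        missing = [field for field in REQUIRED_FIELDS if not entry.get(field)]
        if missing:
            problems.append(f"models[{index}]: missing {', '.join(missing)}")
        role = entry.get("role")
        if not isinstance(role, str):
            continue
        if not ROLE_PATTERN.fullmatch(role):
            problems.append(f"models[{index}]: role {role!r} does not match the pattern")
        elif role in seen:
            problems.append(f"models[{index}]: duplicate role {role!r}")
        seen.add(role)
    return problems
```

Returning a list rather than raising on the first problem lets a caller show every issue in one pass, which is one reasonable design for an editor-side check.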

Optional fields

temperature — 0.0 – 2.0. For text modalities.
modalities — list from text, structured_output, vision, tts, stt, embedding, image_generation. Default ["text"]. Tells the runtime which capabilities to expect; planned future filtering in the picker.
system_context — system prompt override (text modalities only).
fallback — ordered list of {ref, model} pairs to try if the primary is unavailable.
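
To make the fallback ordering concrete, here is a small sketch of how a caller might walk the primary target and then each fallback in turn. It rests on assumptions: candidate_targets, call_with_fallback, and the invoke callback are hypothetical, and treating "unavailable" as a ConnectionError is a stand-in for whatever the runtime really uses.

```python
def candidate_targets(entry):
    """Return (ref, model) pairs in the order they would be tried."""
    targets = [(entry["ref"], entry["model"])]
    for alt in entry.get("fallback", []):
        targets.append((alt["ref"], alt["model"]))
    return targets


def call_with_fallback(entry, invoke):
    """Try each target in order and return the first successful result.

    `invoke(ref, model)` is a caller-supplied function (hypothetical).
    Assumption: an unavailable target raises ConnectionError.
    """
    last_error = None
    for ref, model in candidate_targets(entry):
        try:
            return invoke(ref, model)
        except ConnectionError as exc:
            last_error = exc
    raise RuntimeError(f"no model available for role {entry['role']!r}") from last_error
```

For the fast role in the Shape example, candidate_targets yields the scaigrid target first and the openai fallback second.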

The implicit default

The first entry in models[] is the implicit primary default for any LLM node that doesn't pin its own role. The schema enforces minItems: 1, so there's always at least one. The canvas pins the first row's role name to "primary" (you can't rename or delete it; you can change its ref/model).

How LLM nodes reference roles

Each llm_* node carries node.config.llm_role, which is the canonical key. The codegen still accepts the legacy node.config.model alias as a fallback for older flows. The canvas writes both keys, so imported flows that carry only one still work.
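
Taken together, the lookup order described here and in the implicit-default section can be sketched as a single function. This is an illustration under assumptions, not the codegen itself: resolve_role is a hypothetical name, and it takes the node's config and the registry as plain dicts and lists.

```python
def resolve_role(node_config, models):
    """Pick the role an LLM node will use.

    Order: canonical llm_role, then the legacy model alias,
    then the first registry entry as the implicit default.
    """
    role = node_config.get("llm_role") or node_config.get("model")
    if role:
        return role
    # The schema enforces minItems: 1, so models[0] always exists.
    return models[0]["role"]
```

A node with an empty config therefore resolves to whatever role sits in the first registry row.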

The compiler validates references at compile time. If an LLM node says llm_role: "fastt" (typo) and the registry doesn't declare a role named fastt, the compiler refuses with:

text

llm_role validation failed
  node 'node_xyz': llm_role='fastt' is not declared in
  flow.config.core_identity.models (declared: ['fast', 'primary'])

The canvas surfaces the same problem inline — a role select for an undeclared role renders in select-error styling with a "declare it in Flow → Models" pointer.
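
As a rough illustration of the compile-time check, the sketch below reproduces the message format quoted above. It is not the compiler's code: validate_llm_roles is a hypothetical name, nodes are modelled as a plain dict of node id to config, and raising ValueError is an assumption.

```python
def validate_llm_roles(nodes, models):
    """Raise ValueError if any node references an undeclared role.

    `nodes` maps node id -> node.config dict (a simplification).
    """
    declared = sorted(entry["role"] for entry in models)
    errors = []
    for node_id, config in nodes.items():
        role = config.get("llm_role") or config.get("model")
        if role is not None and role not in declared:
            errors.append(
                f"  node '{node_id}': llm_role='{role}' is not declared in\n"
                f"  flow.config.core_identity.models (declared: {declared})"
            )
    if errors:
        raise ValueError("llm_role validation failed\n" + "\n".join(errors))
```

Nodes that pin no role are skipped, since they fall back to the implicit default rather than referencing a name.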

The picker

The canvas Model registry editor (Flow → Models) exposes a picker that reads from your tenant's ScaiGrid frontend-model catalog via GET /v1/scaigrid/models. Results are grouped by slug prefix (openai/, scailabs/, tenant/{slug}/) and labelled with display name + context window where ScaiGrid provides them.

A "Custom" escape hatch keeps a free-text ref + model pair available for legacy slugs (scaiinfer/poolnoodle/babynoodle, partner-scoped slugs) and any model ScaiGrid doesn't yet catalog. Switching between catalog and custom preserves the existing values so legacy entries don't get silently rewritten.
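
The grouping by slug prefix can be pictured with a short sketch. It works on bare slug strings because the response shape of the catalog endpoint is not described here; group_key and group_by_prefix are hypothetical names, and tenant/acme/support-bot is an invented slug used only to show the tenant case.

```python
from collections import defaultdict


def group_key(slug):
    """Return the display group for a slug, e.g. 'openai/' or 'tenant/acme/'."""
    parts = slug.split("/")
    if parts[0] == "tenant" and len(parts) >= 3:
        return f"tenant/{parts[1]}/"
    if len(parts) >= 2:
        return f"{parts[0]}/"
    return ""  # slug without a prefix


def group_by_prefix(slugs):
    """Bucket slugs by group key, keeping the input order within each group."""
    groups = defaultdict(list)
    for slug in slugs:
        groups[group_key(slug)].append(slug)
    return dict(groups)


catalog = [
    "openai/gpt-4o",
    "openai/gpt-4o-mini",
    "scailabs/poolnoodle-omni",
    "tenant/acme/support-bot",  # hypothetical tenant slug
]
grouped = group_by_prefix(catalog)
```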

YAML round-trip

The compiled YAML manifest keeps core.model: "<ref>/<model>" as a single string for the implicit primary (the form ScaiGrid's YAML loader accepts today) and also surfaces the full registry under core.scaiflow_meta.models. The importer prefers the meta block when present, so a canvas → YAML → canvas round trip is lossless.

If you author YAML by hand without the meta block, the importer lifts the single core.model into a one-entry registry, models: [{role: 'primary', ref, model}]. That path is lossy, since only one role comes back, but the result still compiles.
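
The two importer paths can be sketched on the already-parsed core mapping. This is a guess at the logic rather than the importer's code: export_core and import_models are hypothetical names, and splitting core.model on the first slash is an assumption based on slugs such as openai/gpt-4o containing slashes themselves.

```python
def export_core(models):
    """Build the `core` mapping: single-string primary plus the full registry."""
    first = models[0]
    return {
        "model": f"{first['ref']}/{first['model']}",
        "scaiflow_meta": {"models": list(models)},
    }


def import_models(core):
    """Recover the registry, preferring the meta block when it is present."""
    meta = core.get("scaiflow_meta") or {}
    if meta.get("models"):
        return list(meta["models"])
    # Assumption: ref is everything before the first "/" in core.model.
    ref, _, model = core["model"].partition("/")
    return [{"role": "primary", "ref": ref, "model": model}]
```

With the meta block, import_models(export_core(models)) returns the registry unchanged. A hand-written core of {"model": "scaigrid/openai/gpt-4o"} comes back as a single primary entry.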

What it looks like in ScaiCore source

The registry corresponds to ScaiCore's @models { ... } block. A flow with two roles emits:

scaicore
@models {
    primary = {
        provider = "scaigrid"
        model = "openai/gpt-4o"
        temperature = 0.5
        modalities = [:text, :structured_output]
    }
    fast = {
        provider = "scaigrid"
        model = "openai/gpt-4o-mini"
        modalities = [:text]
        fallback += { provider = "openai", model = "gpt-3.5-turbo" }
    }
}

LLM blocks then reference roles by name:

scaicore
classify = @flexible {
    model = "fast"
    goal = "Classify the customer's intent."
    output: Classification
}

respond = @guarded {
    model = "primary"
    goal = "Generate the response."
    guard: confidence > 0.7
    validate: result.citations.length > 0
}

IR shape

The IR codegen emits one IRModelDecl per role on IRModule.models:

python
{
  "role": "primary",
  "provider": "scaigrid",          # IR field is `provider`, schema's is `ref`
  "model": "openai/gpt-4o",
  "modalities": ["text", "structured_output"],
  "temperature": 0.5,
  "system_context": "...",         # optional
  "fallback": [{ "provider": "openai", "model": "gpt-3.5-turbo" }]  # optional
}

See SCAICORE-COMPILER-IR.md §8.2 for the canonical IRModelDecl shape.
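
The main difference between a registry entry and its IR declaration is the ref to provider rename, which also applies inside fallback. A minimal sketch of that mapping follows; to_ir_model_decl is a hypothetical name, and filling in the ["text"] default for modalities at this stage is an assumption rather than documented behavior.

```python
def to_ir_model_decl(entry):
    """Map one registry entry (schema shape) to an IRModelDecl-like dict."""
    decl = {
        "role": entry["role"],
        "provider": entry["ref"],  # schema `ref` becomes IR `provider`
        "model": entry["model"],
        # Assumption: the schema default is applied when modalities is omitted.
        "modalities": list(entry.get("modalities", ["text"])),
    }
    for key in ("temperature", "system_context"):
        if key in entry:
            decl[key] = entry[key]
    if "fallback" in entry:
        decl["fallback"] = [
            {"provider": alt["ref"], "model": alt["model"]}
            for alt in entry["fallback"]
        ]
    return decl
```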

Patterns

One small model for triage, one big model for the reasoning

jsonc

"models": [
  { "role": "primary", "ref": "scaigrid", "model": "openai/gpt-4o" },
  { "role": "fast", "ref": "scaigrid", "model": "openai/gpt-4o-mini" }
]

Use fast on the intent-classification node (Flexible Prompt with llm_role: "fast") and primary on the response-generation node. The fast role doesn't need a separate provider; just declare it in the registry once.

Embeddings + chat in one flow

jsonc

"models": [
  { "role": "primary", "ref": "scaigrid", "model": "openai/gpt-4o",
    "modalities": ["text"] },
  { "role": "embedder", "ref": "scaigrid", "model": "scai-embed-1",
    "modalities": ["embedding"] }
]

The embedder role is for future @model_call blocks (non-text modality calls). LLM nodes (@flexible/@guarded) only support text/structured-output modalities today; pick a role whose modality list includes those.
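
One way to see which roles an LLM node can safely pin is to filter the registry by modality. The sketch below is illustrative only: roles_usable_by_llm_nodes is a hypothetical helper, and it treats a role as usable when its modality list contains text or structured_output, with the documented ["text"] default applied when the field is omitted.

```python
# Modalities that @flexible/@guarded nodes support today, per the text above.
LLM_NODE_MODALITIES = {"text", "structured_output"}


def roles_usable_by_llm_nodes(models):
    """Return role names whose modalities overlap the LLM-node set."""
    usable = []
    for entry in models:
        modalities = set(entry.get("modalities", ["text"]))
        if modalities & LLM_NODE_MODALITIES:
            usable.append(entry["role"])
    return usable


registry = [
    {"role": "primary", "ref": "scaigrid", "model": "openai/gpt-4o",
     "modalities": ["text"]},
    {"role": "embedder", "ref": "scaigrid", "model": "scai-embed-1",
     "modalities": ["embedding"]},
]
usable_roles = roles_usable_by_llm_nodes(registry)
```

For the registry in this pattern, only primary passes the filter; embedder is left for the non-text call path.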

Critic role for self-review

jsonc

"models": [
  { "role": "primary", "ref": "scaigrid", "model": "openai/gpt-4o" },
  { "role": "critic", "ref": "scaigrid", "model": "openai/gpt-4o",
    "temperature": 0.2 }
]

Same model, different temperature. The critic role goes on the validate/review step; the primary handles the main reasoning.