IR specification
Canonical source. This page is now the source of truth for the ScaiCore IR specification. The earlier copy at docs/SCAICORE-COMPILER-IR.md in the repo is a historical snapshot kept for archaeological reference; edits should land here.
Version: 0.1
Status: Design
Resolves: Gap #10 (Compiler IR Specification)
Relates to: All other specs (this is the contract between compiler and runtime)
1. Purpose and Design Level
The IR (Intermediate Representation) is the output of the ScaiCore compiler
and the input to the ScaiCore runtime. It is the formal contract that
allows the compiler and runtime to be built, tested, and evolved independently.
1.1 Abstraction Level
The IR is a high-level tree representation — closer to an AST than to
bytecode. This is intentional:
- ScaiCore's performance bottleneck is LLM calls (seconds), not loop iteration
(microseconds). A tree-walking interpreter is perfectly adequate.
- The IR must be human-inspectable for debugging (scaicore compile --emit-ir).
- The IR must be serializable for deployment (shipped as a .scaicore-ir bundle).
- Optimization opportunities are at the orchestration level (parallel LLM calls,
batch plugin requests), not the instruction level.
The runtime executes the IR via a tree-walking evaluator — it traverses
IR nodes and dispatches to handlers. There is no bytecode, no register
allocation, no JIT.
1.2 Compilation Pipeline
flowchart TD
Source([.scaicore source files])
Lexer[Lexer]
Parser[Parser]
Resolver[Resolver]
TypeChecker[Type Checker]
IRBuilder[IR Builder]
Verifier[Verifier]
Serializer[Serializer]
Bundle([.scaicore-ir bundle])
Source --> Lexer
Lexer -->|Token stream| Parser
Parser -->|Concrete Syntax Tree| Resolver
Resolver -->|Name + import resolution| TypeChecker
TypeChecker -->|Types validated + inferred| IRBuilder
IRBuilder -->|IRModule| Verifier
Verifier -->|Static analysis passes| Serializer
Serializer -->|MessagePack| Bundle
2. IRModule — The Top-Level Container
An IRModule is the compiled representation of one @core. It is the
complete, self-contained, deployable unit.
| @dataclass
class IRModule:
"""A compiled ScaiCore Core. One per @core declaration."""
# ── Identity ──────────────────────────────────────────────
name: str # Core name (e.g., "InvoiceProcessor")
version: str # Core version (semver)
schema_version: int # IR format version (currently 1)
compiled_at: datetime # When this was compiled
source_hash: str # SHA-256 of source files (for cache invalidation)
# ── Instance Configuration ────────────────────────────────
instance_mode: InstanceMode # How this Core is instantiated
# ── Declarations ──────────────────────────────────────────
types: dict[str, IRType] # Named type definitions
plugins: dict[str, IRPluginDecl] # Plugin declarations
models: dict[str, IRModelDecl] # Model provider declarations (@models / @llm)
memory_schema: dict[str, IRType] # Memory field declarations
reference_schema: dict[str, IRType] # Reference data field declarations
config: list[IRConfigParam] # Configuration parameters
constraints: IRConstraints # Core-level constraints
identity: IRIdentity | None # Core identity/personality
conversation_policy: IRConversationPolicy | None
# ── Triggers & Events ─────────────────────────────────────
triggers: list[IRTrigger] # Inbound trigger declarations
event_subscriptions: list[IREventSubscription] # @on declarations
event_emissions: list[IREventEmission] # Declared event types
# ── Callable Units ────────────────────────────────────────
flows: dict[str, IRFlow] # @flow definitions
transformers: dict[str, IRTransformer] # @transformer definitions
evaluators: dict[str, IREvaluator] # @evaluator definitions
pipelines: dict[str, IRPipeline] # @pipeline definitions
tests: dict[str, IRTestFlow] # @test flow definitions
# ── Core Interface (public API) ───────────────────────────
core_interface: IRCoreInterface | None # What other Cores can call
# ── Manifest (for the Host) ───────────────────────────────
manifest: CoreManifest # Extracted deployment metadata
|
2.1 Instance Mode
| @dataclass
class InstanceMode:
mode: str # "stateless" | "entity" | "singleton"
entity_key: str | None # Field name for entity routing (if entity)
idle_timeout: int | None # Milliseconds (if entity/singleton)
max_concurrent: int # Max concurrent flows per instance
max_active_instances: int | None # Max simultaneous instances (if entity)
overflow: str | None # "deactivate_lru" | "reject" | "queue"
|
2.2 Core Manifest
The manifest is extracted from the IRModule for the Host to use without
loading the full IR. It contains everything the Host needs for routing,
scaling, and resource allocation:
| @dataclass
class CoreManifest:
"""Deployment metadata — extracted from IRModule for Host consumption."""
name: str
version: str
instance_mode: InstanceMode
triggers: list[TriggerManifestEntry]
event_subscriptions: list[EventSubscriptionEntry]
event_emissions: list[EventEmissionEntry]
required_cores: list[RequiredCoreEntry] # @core_call targets
required_plugins: list[RequiredPluginEntry]
config_schema: list[ConfigSchemaEntry] # For config validation UI
reference_data_sources: list[ReferenceDataEntry]
resource_hints: ResourceHints # Expected LLM usage, memory size, etc.
|
3. Type System IR
3.1 IRType
All ScaiCore types compile to a single discriminated union:
| @dataclass
class IRType:
"""A resolved ScaiCore type."""
kind: str
# kind values and their associated fields:
#
# "string" — no extra fields
# "int" — no extra fields
# "float" — no extra fields
# "bool" — no extra fields
# "null" — no extra fields
# "money" — no extra fields
# "date" — no extra fields
# "datetime" — no extra fields
# "duration" — no extra fields
# "email" — no extra fields (validated string subtype)
# "uuid" — no extra fields (validated string subtype)
#
# "array" → element_type: IRType
# "map" → key_type: IRType, value_type: IRType
# "object" → fields: dict[str, IRObjectField]
# "named" → name: str, resolved: IRType (the underlying type)
# "enum" → variants: list[str]
# "string_union" → values: list[str]
# "union" → members: list[IRType]
# "optional" → inner: IRType
# "function" → params: list[IRType], return_type: IRType
# Fields (set based on kind):
element_type: IRType | None = None
key_type: IRType | None = None
value_type: IRType | None = None
fields: dict[str, IRObjectField] | None = None
name: str | None = None
resolved: IRType | None = None
variants: list[str] | None = None
values: list[str] | None = None
members: list[IRType] | None = None
inner: IRType | None = None
params: list[IRType] | None = None
return_type: IRType | None = None
@dataclass
class IRObjectField:
name: str
type: IRType
optional: bool = False # field?: type
|
3.2 Type Resolution
All type references are resolved at compile time. The IR contains no
unresolved type names. For example:
| // Source
type Invoice = { total: money, items: array[LineItem] }
type LineItem = { description: string, amount: money }
|
Compiles to:
| IRType(kind="named", name="Invoice", resolved=IRType(kind="object", fields={
"total": IRObjectField(name="total", type=IRType(kind="money")),
"items": IRObjectField(name="items", type=IRType(kind="array",
element_type=IRType(kind="named", name="LineItem", resolved=IRType(kind="object", fields={
"description": IRObjectField(name="description", type=IRType(kind="string")),
"amount": IRObjectField(name="amount", type=IRType(kind="money")),
}))
))
}))
|
The name field is preserved for error messages and debugging.
The resolved field contains the fully expanded type.
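The expansion step can be sketched as a recursive walk over the declared types. This is a minimal illustration, not the compiler's actual resolver: the `IRType` below is a trimmed stand-in (object fields map straight to types, without `IRObjectField`), and `resolve`, `decls`, and the rejection of recursive types are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IRType:
    """Trimmed stand-in for the spec's IRType: only the fields this sketch needs."""
    kind: str
    name: Optional[str] = None
    resolved: Optional["IRType"] = None
    element_type: Optional["IRType"] = None
    fields: Optional[dict] = None  # simplified: field name -> IRType


def resolve(t: IRType, decls: dict, seen: tuple = ()) -> IRType:
    """Return a copy of t in which every named reference carries its expanded type."""
    if t.kind == "named":
        if t.name in seen:
            # Assumption: recursive type definitions are rejected.
            raise ValueError("recursive type: " + " -> ".join(seen + (t.name,)))
        if t.name not in decls:
            raise KeyError(f"unknown type: {t.name}")
        expanded = resolve(decls[t.name], decls, seen + (t.name,))
        return IRType(kind="named", name=t.name, resolved=expanded)
    if t.kind == "array":
        return IRType(kind="array", element_type=resolve(t.element_type, decls, seen))
    if t.kind == "object":
        return IRType(kind="object",
                      fields={k: resolve(v, decls, seen) for k, v in t.fields.items()})
    return t  # primitives need no resolution


# The Invoice / LineItem declarations from the source example above.
decls = {
    "LineItem": IRType(kind="object", fields={
        "description": IRType(kind="string"),
        "amount": IRType(kind="money"),
    }),
    "Invoice": IRType(kind="object", fields={
        "total": IRType(kind="money"),
        "items": IRType(kind="array",
                        element_type=IRType(kind="named", name="LineItem")),
    }),
}
invoice = resolve(IRType(kind="named", name="Invoice"), decls)
```

After resolution, `invoice.resolved` is the object type and the nested `LineItem` reference carries both its name and its expanded fields.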
4. Callable Units
4.1 IRFlow
| @dataclass
class IRFlow:
"""A compiled @flow."""
name: str
params: list[IRParam] # Declared parameters
return_type: IRType | None # Declared return type (None = void)
body: list[IRBlock] # Sequence of blocks to execute
budget: IRBudget | None # @budget declaration (if any)
is_internal: bool # @internal marker
is_public: bool # pub marker
source_location: SourceLocation # For error messages
@dataclass
class IRParam:
name: str
type: IRType
default: IRExpression | None = None # Default value expression
|
The body is a flat list of IRBlocks executed sequentially. This is the
list that block_index (from Checkpoint Serialization) indexes into.
4.2 IRTransformer
| @dataclass
class IRTransformer:
"""A compiled @transformer. Similar to IRFlow but semantically different."""
name: str
params: list[IRParam]
return_type: IRType
llm_role: str | None # Which @llm to use (if AI-driven)
body: list[IRBlock]
source_location: SourceLocation
|
4.3 IREvaluator
| @dataclass
class IREvaluator:
"""A compiled @evaluator. Returns assessment, never mutates input."""
name: str
params: list[IRParam]
return_type: IRType
body: list[IRBlock]
source_location: SourceLocation
|
4.4 IRPipeline
| @dataclass
class IRPipeline:
"""A compiled @pipeline. Desugared to a sequence of calls."""
name: str
params: list[IRParam]
stages: list[IRPipelineStage] # Ordered stages
source_location: SourceLocation
@dataclass
class IRPipelineStage:
"""One stage of a pipeline."""
operator: str # "|>" | "|?>" | "|*>" | "|~>"
target: str # Name of flow/transformer/evaluator
condition: IRExpression | None # For |?> (conditional)
loop_condition: IRExpression | None # For |~> (loop-until)
max_iterations: int | None # For |~> (bounded)
|
5. Blocks — The Execution Units
Every block compiles to an IRBlock with a discriminated kind field.
The runtime dispatches on kind to the appropriate handler.
| @dataclass
class IRBlock:
"""A compiled execution block. The fundamental unit of the IR."""
kind: str # Discriminator (see below)
source_location: SourceLocation # For error messages and debugging
# ── Fields set based on kind ──────────────────────────────
# Each kind uses a different subset of these fields.
# Only the relevant fields are populated (others are None).
# @rigid
statements: list[IRStatement] | None = None
# @flexible
flexible: IRFlexibleBlock | None = None
# @guarded
guarded: IRGuardedBlock | None = None
# @parallel
parallel: IRParallelBlock | None = None
# @foreach
foreach: IRForeachBlock | None = None
# @match
match: IRMatchBlock | None = None
# @while
while_block: IRWhileBlock | None = None
# @checkpoint
checkpoint: IRCheckpointBlock | None = None
# @core_call
core_call: IRCoreCallBlock | None = None
# @await_responses
await_responses: IRAwaitBlock | None = None
# @try / catch
try_catch: IRTryCatchBlock | None = None
# @budget (wraps inner blocks)
budget: IRBudgetBlock | None = None
# @debug
debug: IRDebugBlock | None = None
# @model_call
model_call_block: IRModelCallBlock | None = None
# Assignment target (most blocks assign their result to a variable)
result_binding: str | None = None # Variable name to assign result to
|
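Because every payload field is optional, a loader or test harness may want to confirm that a block populates exactly the field its kind calls for. The sketch below is one way to do that. The kind strings follow the runtime dispatch table in §11.3, while `PAYLOAD_FIELD`, `check_block`, and the reduced `Block` class are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional

# kind -> payload field name, following the IRBlock field names above.
PAYLOAD_FIELD = {
    "rigid": "statements", "flexible": "flexible", "guarded": "guarded",
    "parallel": "parallel", "foreach": "foreach", "match": "match",
    "while": "while_block", "checkpoint": "checkpoint",
    "core_call": "core_call", "await_responses": "await_responses",
    "try_catch": "try_catch", "budget": "budget", "debug": "debug",
    "model_call": "model_call_block",
}


@dataclass
class Block:
    """Hypothetical reduced IRBlock with only three payload fields."""
    kind: str
    statements: Optional[list] = None
    parallel: Optional[Any] = None
    while_block: Optional[Any] = None


def check_block(block) -> list[str]:
    """Return a list of problems; an empty list means the block is well-formed."""
    expected = PAYLOAD_FIELD.get(block.kind)
    if expected is None:
        return [f"unknown block kind: {block.kind!r}"]
    errors = []
    if getattr(block, expected, None) is None:
        errors.append(f"kind {block.kind!r} requires field {expected!r}")
    # Any other populated payload field contradicts the discriminator.
    for other in sorted(set(PAYLOAD_FIELD.values()) - {expected}):
        if getattr(block, other, None) is not None:
            errors.append(f"kind {block.kind!r} must not set field {other!r}")
    return errors


good = check_block(Block(kind="rigid", statements=[]))
bad = check_block(Block(kind="while", statements=[]))
unknown = check_block(Block(kind="nope"))
```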
5.1 @rigid
| # kind = "rigid"
# Uses: statements
# @rigid compiles to a flat list of IRStatements.
# The verifier has already confirmed:
# - No LLM calls
# - No @flexible or @guarded blocks
# - Only deterministic operations
# - Plugin calls allowed (side effects, but deterministic dispatch)
|
5.2 @flexible
| @dataclass
class IRFlexibleBlock:
goal: IRExpression # String expression: the LLM's objective
output_type: IRType # Expected output structure
llm_role: str # Which @llm config to use (e.g., "primary", "fast")
input_bindings: dict[str, IRExpression] # Named inputs
context_bindings: dict[str, IRExpression] # Named context
guidance: IRExpression | None # Additional instructions
identity_ref: bool # Whether to include core.identity
constraints: IRBlockConstraints | None
examples: list[IRExample] | None
on_failure: dict[str, IRFailureHandler] | None
budget: IRBudget | None
|
5.3 @guarded
| @dataclass
class IRGuardedBlock:
"""Like @flexible but with hard validation boundaries."""
goal: IRExpression
output_type: IRType
llm_role: str
input_bindings: dict[str, IRExpression]
context_bindings: dict[str, IRExpression]
guidance: IRExpression | None
constraints: IRBlockConstraints | None
# Guarded-specific:
validate: list[IRExpression] # Post-conditions (must all be true)
on_validation_failure: list[IRBlock] | None # Blocks to execute on failure
|
5.4 @parallel
| @dataclass
class IRParallelBlock:
branches: list[IRBlock] # Blocks to execute concurrently
max_concurrent: int | None # Limit on simultaneous branches
fail_fast: bool # Cancel remaining on first failure
|
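One plausible reading of `max_concurrent` and `fail_fast`, sketched with asyncio. The spec does not prescribe this implementation: `run_parallel` and the demo coroutines are hypothetical, and the choice to return exceptions in place of results when `fail_fast` is false is an assumption.

```python
import asyncio


async def run_parallel(branches, max_concurrent=None, fail_fast=True):
    """Run zero-argument coroutine functions concurrently, results in branch order."""
    semaphore = asyncio.Semaphore(max_concurrent) if max_concurrent else None

    async def run_one(branch):
        if semaphore is None:
            return await branch()
        async with semaphore:  # at most max_concurrent branches run at once
            return await branch()

    tasks = [asyncio.ensure_future(run_one(b)) for b in branches]
    if not fail_fast:
        # Assumption: failures are collected alongside successful results.
        return await asyncio.gather(*tasks, return_exceptions=True)
    try:
        return await asyncio.gather(*tasks)
    except Exception:
        # First failure: cancel the remaining branches, wait for them, re-raise.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
        raise


# Demo branches (hypothetical).
async def ok(value):
    await asyncio.sleep(0)
    return value


async def boom():
    raise RuntimeError("branch failed")


async def slow():
    await asyncio.sleep(30)
    return "late"


limited = asyncio.run(run_parallel([lambda: ok(1), lambda: ok(2)], max_concurrent=1))
collected = asyncio.run(run_parallel([boom, lambda: ok(3)], fail_fast=False))
```

With `fail_fast=True`, a call such as `run_parallel([boom, slow])` raises the `RuntimeError` promptly because the slow branch is cancelled instead of awaited to completion.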
5.5 @foreach
| @dataclass
class IRForeachBlock:
iterator_var: str # Variable name for current item
collection: IRExpression # Expression that evaluates to iterable
body: list[IRBlock] # Blocks to execute per item
yield_expression: IRExpression | None # What to yield per iteration
|
5.6 @match
| @dataclass
class IRMatchBlock:
subject: IRExpression # Expression to match against
arms: list[IRMatchArm]
@dataclass
class IRMatchArm:
pattern: IRPattern # What to match
guard: IRExpression | None # Additional condition (if guard)
body: list[IRBlock] | None # Blocks to execute (or single expression)
expression: IRExpression | None # Single expression result
@dataclass
class IRPattern:
kind: str
# "literal" → value: IRLiteral
# "enum" → symbol: str
# "binding" → name: str (captures the value)
# "wildcard" → _ (matches anything)
# "type" → type: IRType (type check)
value: Any | None = None
symbol: str | None = None
name: str | None = None
type: IRType | None = None
|
5.7 @while
| @dataclass
class IRWhileBlock:
condition: IRExpression # Loop condition
max_iterations: int # Compile-time enforced bound
body: list[IRBlock] # Loop body
|
5.8 @checkpoint
| @dataclass
class IRCheckpointBlock:
checkpoint_type: IRExpression # "approval", "review", etc.
assignee: IRExpression | None
timeout: IRExpression | None # Duration expression
on_timeout: str | None # "fail" | "escalate" | "auto_approve" | "skip"
presentation: dict[str, IRExpression] # Data to show the resolver
options: IRExpression | None # Available choices
on_response: list[IRMatchArm] | None # Response handlers
|
5.9 @core_call
| @dataclass
class IRCoreCallBlock:
target: IRExpression # core:// URI
instance_key: IRExpression | None # Entity key for target instance
version: str | None # Semver range
flow_name: str | None # Specific flow to call (if any)
input_bindings: dict[str, IRExpression]
timeout: IRExpression | None
on_timeout: list[IRBlock] | None # Fallback blocks
is_async: bool # async @core_call
|
5.10 @await_responses
| @dataclass
class IRAwaitBlock:
refs: IRExpression # Expression evaluating to list of refs
strategy: str # "all" | "any" | "at_least" | "majority"
strategy_param: int | None # N for "at_least(N)"
timeout: IRExpression | None
on_timeout: str | None # "escalate" | "fail" | etc.
|
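The four strategies reduce to a simple predicate over how many responses have arrived. The sketch below is illustrative only: `is_satisfied` is a hypothetical helper, and treating "majority" as strictly more than half of the refs is an assumption the spec does not state.

```python
from typing import Optional


def is_satisfied(strategy: str, responded: int, total: int,
                 strategy_param: Optional[int] = None) -> bool:
    """Decide whether an @await_responses block may resume."""
    if strategy == "all":
        return responded >= total
    if strategy == "any":
        return responded >= 1
    if strategy == "at_least":
        if strategy_param is None:
            raise ValueError("at_least requires strategy_param (N)")
        return responded >= strategy_param
    if strategy == "majority":
        # Assumption: strict majority, so 2 of 4 is not enough.
        return responded > total / 2
    raise ValueError(f"unknown strategy: {strategy!r}")


half = is_satisfied("majority", responded=2, total=4)
three_of_four = is_satisfied("majority", responded=3, total=4)
quorum = is_satisfied("at_least", responded=2, total=5, strategy_param=2)
```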
5.11 @try / catch
| @dataclass
class IRTryCatchBlock:
try_body: list[IRBlock]
catch_clauses: list[IRCatchClause]
@dataclass
class IRCatchClause:
error_type: str # Error type name (e.g., "ValidationError")
binding: str # Variable name (e.g., "e")
body: list[IRBlock]
|
5.12 @budget
| @dataclass
class IRBudgetBlock:
max_duration_ms: int | None
max_llm_calls: int | None
max_plugin_calls: int | None
max_memory_writes: int | None
max_retries: int | None
on_exceeded: str # "fail" | "warn"
body: list[IRBlock] # Wrapped blocks
|
5.13 @debug
| @dataclass
class IRDebugBlock:
"""Compiled out in production builds. Included in debug builds."""
body: list[IRStatement]
compile_mode: str # "debug" | "always" (for rare cases)
|
5.14 @model_call
| @dataclass
class IRModelCallBlock:
"""
Non-text AI modality invocation (TTS, STT, embedding, image gen).
Bypasses the Enforcement Engine — no prompt construction or constraints.
"""
model_role: str # References @models declaration (e.g., "voice")
modality: str # "tts" | "stt" | "embedding" | "image_generation" | "audio_generation"
input_bindings: dict[str, IRExpression] # Modality-specific input fields
output_type: IRType # Expected output type
timeout: IRExpression | None
|
6. Statements
Statements appear inside @rigid blocks and as parts of other constructs.
| @dataclass
class IRStatement:
kind: str # Discriminator
source_location: SourceLocation
# "assign" → target: str, value: IRExpression
# "return" → value: IRExpression
# "yield" → value: IRExpression
# "break" → (no extra fields)
# "continue" → (no extra fields)
# "emit" → event_name: str, fields: dict[str, IRExpression]
# "if" → condition: IRExpression, then_body: list[IRStatement],
# else_body: list[IRStatement] | None
# "call" → target_flow: str, args: list[IRExpression],
# result_binding: str | None
# "plugin_call" → plugin: str, method: str, args: dict[str, IRExpression],
# result_binding: str | None
# "memory_op" → (see §6.1)
# "log" → level: str, message: IRExpression
# "expression" → expression: IRExpression (for side-effecting expressions)
target: str | None = None
value: IRExpression | None = None
event_name: str | None = None
fields: dict[str, IRExpression] | None = None
condition: IRExpression | None = None
then_body: list | None = None # list[IRStatement]
else_body: list | None = None # list[IRStatement] | None
target_flow: str | None = None
args: list | None = None # list[IRExpression]
result_binding: str | None = None
plugin: str | None = None
method: str | None = None
level: str | None = None
message: IRExpression | None = None
expression: IRExpression | None = None
memory_op: IRMemoryOp | None = None
|
6.1 Memory Operations
| @dataclass
class IRMemoryOp:
"""A compiled memory.* operation."""
op: str
# "get" → namespace: str, key: IRExpression
# "set" → namespace: str, key: IRExpression, value: IRExpression
# "update" → namespace: str, key: IRExpression, value: IRExpression
# "delete" → namespace: str, key: IRExpression
# "add" → namespace: str, value: IRExpression
# "list" → namespace: str, prefix: IRExpression | None,
# limit: IRExpression | None, order: str | None
# "search" → namespace: str, query: IRExpression,
# limit: IRExpression | None, min_similarity: IRExpression | None
# "last" → namespace: str, count: IRExpression
namespace: str = ""
key: IRExpression | None = None
value: IRExpression | None = None
prefix: IRExpression | None = None
query: IRExpression | None = None
limit: IRExpression | None = None
min_similarity: IRExpression | None = None
count: IRExpression | None = None
order: str | None = None
|
7. Expressions
| @dataclass
class IRExpression:
kind: str # Discriminator
source_location: SourceLocation
resolved_type: IRType # Type after resolution/inference
# "literal" → value: Any, literal_type: str
# "variable" → name: str
# "member" → object_expr: IRExpression, field: str
# "index" → object_expr: IRExpression, index_expr: IRExpression
# "call" → callee: str, args: list[IRExpression]
# "method_call" → object_expr: IRExpression, method: str, args: list[IRExpression]
# "binary" → left: IRExpression, op: str, right: IRExpression
# "unary" → op: str, operand: IRExpression
# "object" → fields_map: dict[str, IRExpression]
# "array" → elements: list[IRExpression]
# "lambda" → params: list[IRParam], body: IRExpression
# "ternary" → condition: IRExpression, then_expr: IRExpression,
# else_expr: IRExpression
# "null_coalesce" → left: IRExpression, right: IRExpression (the ?? operator)
# "string_interp" → parts: list[IRExpression] (template literal parts)
# "reference" → field: str (reference.* access)
# "config" → field: str (config.* access)
# "execution" → field: str (execution.* access)
# Fields (set based on kind):
value: Any | None = None
literal_type: str | None = None
name: str | None = None
object_expr: IRExpression | None = None # "object" conflicts with Python
field: str | None = None
index_expr: IRExpression | None = None
callee: str | None = None
args: list | None = None
method: str | None = None
left: IRExpression | None = None
op: str | None = None
right: IRExpression | None = None
operand: IRExpression | None = None
fields_map: dict | None = None
elements: list | None = None
params: list | None = None
body: IRExpression | None = None
condition: IRExpression | None = None
then_expr: IRExpression | None = None
else_expr: IRExpression | None = None
parts: list | None = None
# Binary operators:
# + - * / % == != < > <= >= && || ..(range)
#
# Unary operators:
# ! - (negation)
|
7.1 Literal Types
| # literal_type values:
# "string" → value is str
# "int" → value is int
# "float" → value is float
# "bool" → value is bool
# "null" → value is None
# "money" → value is {"amount": float, "currency": str}
# "date" → value is str (ISO 8601)
# "datetime" → value is str (ISO 8601)
# "duration" → value is int (milliseconds)
# "enum" → value is str (symbol name, e.g., "approved")
|
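A runtime will typically convert these serialized literal values into native objects before evaluation. The following is a sketch under assumptions: `decode_literal` is a hypothetical helper, and representing money as a `(Decimal, currency)` pair is an illustrative choice, not something the spec mandates.

```python
from datetime import date, datetime, timedelta
from decimal import Decimal


def decode_literal(literal_type: str, value):
    """Map a (literal_type, value) pair from the IR to a native Python value."""
    if literal_type in ("string", "int", "float", "bool", "null", "enum"):
        return value  # already in native form
    if literal_type == "date":
        return date.fromisoformat(value)
    if literal_type == "datetime":
        # Note: a trailing "Z" is only accepted by fromisoformat on Python 3.11+.
        return datetime.fromisoformat(value)
    if literal_type == "duration":
        return timedelta(milliseconds=value)
    if literal_type == "money":
        # Assumption: go through str() so the float's shortest repr becomes the Decimal.
        return (Decimal(str(value["amount"])), value["currency"])
    raise ValueError(f"unknown literal_type: {literal_type!r}")


timeout = decode_literal("duration", 1500)
due = decode_literal("date", "2024-01-15")
total = decode_literal("money", {"amount": 12.5, "currency": "EUR"})
```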
8. Declarations
8.1 Plugin Declaration
| @dataclass
class IRPluginDecl:
alias: str # Local alias (e.g., "crm")
package: str # Package URI (e.g., "company/salesforce@1.0")
interface: IRPluginInterface | None # Resolved interface (if available at compile time)
@dataclass
class IRPluginInterface:
"""Compiled @plugin_interface."""
name: str
methods: dict[str, IRPluginMethod]
@dataclass
class IRPluginMethod:
name: str
params: list[IRParam]
return_type: IRType
latency: str | None # "fast" | "medium" | "slow"
idempotent: bool
readonly: bool
|
8.2 LLM Declaration
| @dataclass
class IRModelDecl:
role: str # Local alias (e.g., "primary", "fast", "voice")
provider: str # "scaigrid" | "openai"
model: str # Model identifier
temperature: float | None # For text modalities (None for non-text)
modalities: list[str] # ["text", "structured_output", "vision", "tts", ...]
system_context: str | None # System prompt override (text modalities only)
use_when: str | None # Documentation hint
fallback: list[IRModelFallback] | None # Fallback chain
@dataclass
class IRModelFallback:
provider: str
model: str
|
8.3 Config Parameter
| @dataclass
class IRConfigParam:
name: str
type: IRType
default: IRExpression | None # Default value expression
description: str | None
required: bool
validation: IRExpression | None # Validation expression (uses `value` binding)
hot_reload: bool # Can change without redeployment
runtime_configurable: bool # Can change via API at runtime
secret: bool # Stored in ScaiVault, not config backend
|
8.4 Constraints
| @dataclass
class IRConstraints:
never: list[str] # Natural language prohibition strings
always: list[str] # Natural language requirement strings
prefer: list[str] # Natural language preference strings
@dataclass
class IRBlockConstraints:
"""Block-level constraints (can override/extend Core-level)."""
never: list[str] | None
always: list[str] | None
prefer: list[str] | None
inherit: bool # Include Core-level constraints
|
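How block-level constraints combine with Core-level ones is not spelled out here. The sketch below shows one possible merge policy, stated as an assumption: when `inherit` is true the Core-level lists come first and block-level entries are appended without duplicates, and when it is false only the block-level entries apply. `effective_constraints` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IRConstraints:
    never: list
    always: list
    prefer: list


@dataclass
class IRBlockConstraints:
    never: Optional[list] = None
    always: Optional[list] = None
    prefer: Optional[list] = None
    inherit: bool = True


def effective_constraints(core: IRConstraints,
                          block: Optional[IRBlockConstraints]) -> IRConstraints:
    """Compute the constraint lists that apply to one block (assumed policy)."""
    if block is None:
        return core  # assumption: blocks without their own constraints use the Core's

    def merge(core_items, block_items):
        merged = list(core_items) if block.inherit else []
        for item in block_items or []:
            if item not in merged:
                merged.append(item)
        return merged

    return IRConstraints(
        never=merge(core.never, block.never),
        always=merge(core.always, block.always),
        prefer=merge(core.prefer, block.prefer),
    )


core = IRConstraints(never=["share PII"], always=["cite sources"], prefer=[])
block = IRBlockConstraints(never=["give legal advice"], prefer=["short answers"])
inherited = effective_constraints(core, block)
isolated = effective_constraints(
    core, IRBlockConstraints(never=["give legal advice"], inherit=False))
```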
8.5 Triggers
| @dataclass
class IRTrigger:
kind: str # "webhook" | "schedule" | "api"
name: str # Trigger name
target_flow: str # Flow to invoke
config: dict[str, Any] # Kind-specific config:
# webhook: { method: str, path: str, auth: str | None }
# schedule: { cron: str, timezone: str }
# api: { } (exposed via standard Core API)
|
8.6 Event Subscriptions & Emissions
| @dataclass
class IREventSubscription:
event_name: str # Event to subscribe to
source_core: str # core:// URI of the emitting Core
target_flow: str # Flow to invoke when event received
@dataclass
class IREventEmission:
"""Declared event type (from @core_interface)."""
event_name: str
fields: dict[str, IRType] # Event payload schema
|
9. Static Verification
The compiler performs these verifications on the IR before emitting it.
All are compile-time errors — the IR is never produced if verification fails.
9.1 @rigid Determinism
Every @rigid block is verified for determinism:
| Allowed in @rigid:
✓ Assignments (deterministic expressions)
✓ Arithmetic, string ops, comparisons
✓ Collection operations (.map, .filter, .sum, etc.)
✓ Plugin calls (side effects, but deterministic dispatch)
✓ Memory operations
✓ Reference data reads (reference.*)
✓ Config reads (config.*)
✓ @call to flows that are transitively @rigid-safe
✓ if/else, match (with deterministic conditions)
Forbidden in @rigid:
✗ @flexible blocks
✗ @guarded blocks
✗ LLM calls (direct or indirect)
✗ @call to flows containing @flexible/@guarded
✗ Non-deterministic built-ins (random, etc.)
|
The verifier builds a call graph and checks that @rigid blocks
never transitively reach an LLM call.
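The call-graph check can be illustrated with a small fixpoint computation. This is a sketch, not the verifier's actual algorithm: the graph is given as a plain dict from flow name to callee names, and `flows_reaching_llm` and `rigid_violations` are hypothetical names.

```python
def flows_reaching_llm(calls: dict, has_llm_block: set) -> set:
    """Return every flow that contains, or transitively calls, an LLM-backed block."""
    tainted = set(has_llm_block)
    changed = True
    while changed:  # iterate to a fixpoint so cycles in the call graph are safe
        changed = False
        for caller, callees in calls.items():
            if caller not in tainted and callees & tainted:
                tainted.add(caller)
                changed = True
    return tainted


def rigid_violations(rigid_calls: dict, calls: dict, has_llm_block: set) -> list:
    """List (rigid block, callee) pairs where a @rigid block reaches an LLM call."""
    tainted = flows_reaching_llm(calls, has_llm_block)
    return sorted((block, target)
                  for block, targets in rigid_calls.items()
                  for target in targets
                  if target in tainted)


# Flow-level call graph: process -> {validate, extract}, extract -> summarize.
calls = {
    "process": {"validate", "extract"},
    "extract": {"summarize"},
    "summarize": set(),
    "validate": set(),
}
has_llm_block = {"summarize"}  # contains a @flexible or @guarded block

tainted = flows_reaching_llm(calls, has_llm_block)
violations = rigid_violations(
    {"process[0]": {"validate"}, "process[3]": {"extract"}}, calls, has_llm_block)
```

Here `process[3]` is flagged because `extract` calls `summarize`, which is LLM-backed, while `process[0]` only calls the deterministic `validate`.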
9.2 @internal Reachability
Flows marked @internal must NOT be reachable from any trigger:
| For each @internal flow:
1. Check it's not referenced by any trigger's target_flow
2. Check it's not referenced by any @on subscription's target_flow
3. It CAN be referenced by @call from other flows (that's its purpose)
|
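The first two checks amount to set membership tests. A minimal sketch, with `internal_reachability_errors` as a hypothetical helper and plain sets of flow names standing in for the IR structures:

```python
def internal_reachability_errors(internal_flows: set,
                                 trigger_targets: set,
                                 subscription_targets: set) -> list:
    """Report @internal flows that a trigger or @on subscription points at."""
    errors = []
    for flow in sorted(internal_flows):
        if flow in trigger_targets:
            errors.append(f"@internal flow '{flow}' is the target of a trigger")
        if flow in subscription_targets:
            errors.append(f"@internal flow '{flow}' is the target of an @on subscription")
    return errors


errors = internal_reachability_errors(
    internal_flows={"normalize", "score"},
    trigger_targets={"process_invoice", "score"},
    subscription_targets={"on_paid"},
)
```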
9.3 Entity Key Enforcement
For :entity Cores, non-@internal flows must include the entity key
as a parameter:
| Core instance = :entity(key = "customer_id")
For each non-@internal flow:
1. Check that params include "customer_id: string"
2. The runtime uses this param for instance routing
For @internal flows:
No entity key required (already running within an instance)
|
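A sketch of the same rule in code. The representation is an assumption: each flow is reduced to an `(is_internal, params)` pair where `params` maps parameter names to type kinds, and `entity_key_errors` is a hypothetical helper.

```python
def entity_key_errors(entity_key: str, flows: dict, key_type: str = "string") -> list:
    """Check that every non-@internal flow declares the entity key parameter."""
    errors = []
    for name, (is_internal, params) in sorted(flows.items()):
        if is_internal:
            continue  # already running inside an instance, no routing needed
        if params.get(entity_key) != key_type:
            errors.append(
                f"flow '{name}' must declare parameter '{entity_key}: {key_type}'")
    return errors


flows = {
    "update_profile": (False, {"customer_id": "string", "email": "email"}),
    "nightly_report": (False, {"day": "date"}),
    "recompute_score": (True, {}),
}
errors = entity_key_errors("customer_id", flows)
```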
9.4 Type Checking
Standard type checking across all expressions:
- Binary operations have compatible operand types
- Function/flow calls match parameter types
- Return values match declared return types
- Memory operations match declared memory schema
- @flexible output types are valid object types
- Enum values are within declared variants
9.5 @checkpoint Placement
| @checkpoint is FORBIDDEN inside:
- @parallel branches (compile error E030)
@checkpoint is ALLOWED inside:
- @foreach (iterator state is captured; see Checkpoint Serialization spec §8.2)
- @try body (checkpoint suspend propagates through try)
- Nested @call targets (full frame stack is serialized)
|
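The placement rule can be checked with a recursive walk that remembers whether it is inside a @parallel branch. The sketch below uses plain dicts as stand-ins for IRBlock nodes and a hypothetical `checkpoint_errors` helper. It only inspects blocks nested lexically and does not follow @call targets.

```python
def checkpoint_errors(blocks: list, in_parallel: bool = False) -> list:
    """Return E030 messages for every @checkpoint nested under a @parallel branch."""
    errors = []
    for block in blocks:
        kind = block["kind"]
        if kind == "checkpoint" and in_parallel:
            errors.append(
                f"E030: @checkpoint inside @parallel at {block.get('loc', '?')}")
        elif kind == "parallel":
            errors += checkpoint_errors(block["branches"], True)
        else:
            # Descend into every container that can hold nested blocks.
            for key in ("body", "try_body"):
                errors += checkpoint_errors(block.get(key) or [], in_parallel)
            for clause in block.get("catch_clauses") or []:
                errors += checkpoint_errors(clause.get("body") or [], in_parallel)
            for arm in block.get("arms") or []:
                errors += checkpoint_errors(arm.get("body") or [], in_parallel)
    return errors


flow_body = [
    # Allowed: checkpoint inside @foreach.
    {"kind": "foreach", "body": [{"kind": "checkpoint", "loc": "a.scaicore:10"}]},
    # Forbidden: checkpoint inside a @try that sits in a @parallel branch.
    {"kind": "parallel", "branches": [
        {"kind": "rigid"},
        {"kind": "try_catch",
         "try_body": [{"kind": "checkpoint", "loc": "a.scaicore:20"}],
         "catch_clauses": []},
    ]},
]
errors = checkpoint_errors(flow_body)
```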
9.6 Emit Validation
For each emit statement, if the Core has a @core_interface with
event declarations, the emitted event's fields are checked against
the declared event type.
9.7 Custom Verification Rules (Extensible)
Platforms hosting ScaiCore programs (e.g., FidoBoard) may require
domain-specific verification beyond the built-in rules. The Verifier
supports custom rules via a plugin interface:
| from typing import Protocol


class CustomVerifierRule(Protocol):
    """A domain-specific verification rule applied after built-in checks."""

    name: str  # e.g., "fidoboard.settlement_path"
    description: str  # Human-readable explanation

    def check(self, module: IRModule) -> list[VerificationError]:
        """
        Inspect the IR and return any violations.
        Called after all built-in verification passes.
        """
        ...
|
Custom rules are registered in the compiler configuration (not in the
source file). They run after all standard verification passes and can
inspect the full IRModule.
Example use cases:
- "Every flow path must call platform.submit_to_escrow() or platform.record_event(type='session_failed')" (settlement path)
- "No flow may call more than 3 external plugins" (cost control)
- "Every @checkpoint must have on_timeout set" (session safety)
Custom rules produce the same VerificationError format as built-in
rules (error code, location, message). The compiler CLI supports
loading custom rules via --rules <path>.
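As an illustration, the "every @checkpoint must have on_timeout set" rule could look roughly like the class below. Everything beyond the `name`, `description`, and `check` members is an assumption: the `VerificationError` shape, the `X001` error code, and the `SimpleNamespace` stand-ins for IR nodes are hypothetical, and the sketch only scans top-level blocks of each flow.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class VerificationError:
    """Hypothetical shape; the spec only says errors carry code, location, message."""
    code: str
    location: str
    message: str


class CheckpointTimeoutRule:
    """Matches the CustomVerifierRule protocol structurally (name, description, check)."""
    name = "example.checkpoint_timeout"
    description = "Every @checkpoint must set on_timeout"

    def check(self, module) -> list:
        errors = []
        for flow_name, flow in module.flows.items():
            # Simplification: nested blocks are not visited in this sketch.
            for index, block in enumerate(flow.body):
                if block.kind == "checkpoint" and block.checkpoint.on_timeout is None:
                    errors.append(VerificationError(
                        code="X001",  # hypothetical code
                        location=f"{flow_name}[{index}]",
                        message="@checkpoint has no on_timeout",
                    ))
        return errors


# Stand-in module with one flow whose second block is a checkpoint without on_timeout.
module = SimpleNamespace(flows={
    "approve": SimpleNamespace(body=[
        SimpleNamespace(kind="rigid"),
        SimpleNamespace(kind="checkpoint",
                        checkpoint=SimpleNamespace(on_timeout=None)),
    ]),
})
errors = CheckpointTimeoutRule().check(module)
```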
10. IR Serialization
A compiled Core is shipped as a .scaicore-ir file — a MessagePack-encoded
IRModule:
| import struct

import msgpack  # third-party package


class IRSerializer:
    """Serializes/deserializes IRModule to/from bytes."""

    MAGIC = b"SCIR"  # Magic bytes for format detection
    FORMAT_VERSION = 1

    def serialize(self, module: IRModule) -> bytes:
        """
        Serialize an IRModule to a deployable bundle.
        Format: MAGIC (4B) + VERSION (4B, big-endian) + MSGPACK payload
        """
        # packb returns bytes; msgpack.pack writes to a stream instead.
        payload = msgpack.packb(self.to_dict(module))
        return self.MAGIC + struct.pack(">I", self.FORMAT_VERSION) + payload

    def deserialize(self, data: bytes) -> IRModule:
        """
        Deserialize a .scaicore-ir bundle.
        Validates magic bytes and format version.
        """
        # Raise rather than assert so the checks survive `python -O`.
        if len(data) < 8 or data[:4] != self.MAGIC:
            raise ValueError("Not a ScaiCore IR bundle")
        version = struct.unpack(">I", data[4:8])[0]
        if version > self.FORMAT_VERSION:
            raise ValueError(f"IR version {version} not supported")
        return self.from_dict(msgpack.unpackb(data[8:]))
|
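The 8-byte header can be exercised on its own. In the sketch below the payload is encoded with JSON purely so the example runs on the standard library; the actual format uses MessagePack as described above. `frame` and `unframe` are hypothetical helper names.

```python
import json
import struct

MAGIC = b"SCIR"
FORMAT_VERSION = 1


def frame(payload: bytes, version: int = FORMAT_VERSION) -> bytes:
    """Prefix a payload with the magic bytes and a big-endian 32-bit version."""
    return MAGIC + struct.pack(">I", version) + payload


def unframe(data: bytes) -> tuple:
    """Validate the header and return (version, payload)."""
    if len(data) < 8 or data[:4] != MAGIC:
        raise ValueError("Not a ScaiCore IR bundle")
    (version,) = struct.unpack(">I", data[4:8])
    if version > FORMAT_VERSION:
        raise ValueError(f"IR version {version} not supported")
    return version, data[8:]


# JSON stands in for MessagePack here (assumption for a stdlib-only demo).
payload = json.dumps({"name": "InvoiceProcessor", "schema_version": 1}).encode()
bundle = frame(payload)
version, body = unframe(bundle)
```

A bundle written by a newer compiler (for example `frame(payload, version=2)`) is rejected by `unframe`, which matches the `version <= FORMAT_VERSION` rule in the serializer.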
10.2 Bundle Contents
The .scaicore-ir bundle contains:
| bundle/
├── module.msgpack # The IRModule (this spec)
├── manifest.json # CoreManifest (human-readable, for Host tooling)
├── source_map.msgpack # Source locations → IR locations (for debugging)
└── reference/ # Reference data files (if bundled)
├── faq.json
├── catalog.json
└── ...
|
10.3 Source Map
| @dataclass
class SourceLocation:
file: str # Source file path
line: int # Line number (1-indexed)
column: int # Column number (1-indexed)
end_line: int | None = None
end_column: int | None = None
|
Every IR node carries a source_location so that runtime errors can
point back to the original source code.
11. Runtime Execution Contract
This section defines how the runtime uses the IR. This is the contract
the runtime must uphold.
11.1 Module Loading
| class IRLoader:
    def load(self, bundle_path: str) -> IRModule:
        """
        Load a .scaicore-ir bundle and prepare it for execution.
        Steps:
        1. Deserialize the IRModule from the bundle
        2. Validate schema_version compatibility
        3. Resolve plugin interfaces against available plugins
        4. Return the loaded module
        """
        ...
|
11.2 Flow Execution
The runtime executes a flow by walking its body: list[IRBlock]:
| async def execute_flow(self, flow: IRFlow, args: dict[str, Any]) -> Any:
    """
    Execute a compiled flow.
    1. Create a new Scope
    2. Bind arguments to parameters
    3. Walk flow.body sequentially
    4. For each IRBlock, dispatch to the appropriate handler
    5. Return the flow's return value
    """
    scope = Scope()
    # Bind by parameter name (not dict order); fall back to declared defaults.
    for param in flow.params:
        if param.name in args:
            scope.set(param.name, args[param.name])
        elif param.default is not None:
            scope.set(param.name, await self.evaluate(param.default, scope))
        else:
            raise TypeError(f"{flow.name}() missing argument: {param.name}")
    result = None
    for block_index, block in enumerate(flow.body):
        result = await self.execute_block(block, scope, block_index)
        if isinstance(result, ReturnSignal):
            return result.value
        if isinstance(result, SuspendSignal):
            # Checkpoint or @await_responses hit
            return result  # Propagate suspension up
    return result
|
11.3 Block Dispatch
| async def execute_block(self, block: IRBlock, scope: Scope, index: int) -> Any:
    """Dispatch to the appropriate block handler."""
    handlers = {
        "rigid": self.exec_rigid,
        "flexible": self.exec_flexible,
        "guarded": self.exec_guarded,
        "parallel": self.exec_parallel,
        "foreach": self.exec_foreach,
        "match": self.exec_match,
        "while": self.exec_while,
        "checkpoint": self.exec_checkpoint,
        "core_call": self.exec_core_call,
        "await_responses": self.exec_await_responses,
        "try_catch": self.exec_try_catch,
        "budget": self.exec_budget,
        "debug": self.exec_debug,
        "model_call": self.exec_model_call,
    }
    result = await handlers[block.kind](block, scope)
    # Assign result to binding if present
    if block.result_binding and result is not None:
        scope.set(block.result_binding, result)
    return result
|
11.4 Expression Evaluation
| async def evaluate(self, expr: IRExpression, scope: Scope) -> Any:
    """Evaluate an expression in the given scope."""
    match expr.kind:
        case "literal":
            return expr.value
        case "variable":
            return scope.get(expr.name)
        case "member":
            obj = await self.evaluate(expr.object_expr, scope)
            return getattr_or_index(obj, expr.field)
        case "binary":
            left = await self.evaluate(expr.left, scope)
            right = await self.evaluate(expr.right, scope)
            return apply_binary_op(expr.op, left, right)
        case "call":
            args = [await self.evaluate(a, scope) for a in expr.args]
            return await self.call_flow(expr.callee, args)
        case "reference":
            return self.reference_data[expr.field]
        case "config":
            return self.resolved_config[expr.field]
        case "null_coalesce":
            left = await self.evaluate(expr.left, scope)
            if left is not None:
                return left
            return await self.evaluate(expr.right, scope)
        # ... etc for all expression kinds
|
12. Compiler CLI Interface
| scaicore compile <source_dir>
Compile .scaicore files into a .scaicore-ir bundle.
Options:
--output, -o <path> Output bundle path (default: ./build/<core_name>.scaicore-ir)
--emit-ir Print human-readable IR to stdout (for debugging)
--emit-manifest Print CoreManifest JSON to stdout
--check Type-check and verify only (don't emit bundle)
--debug Include @debug blocks in output
--source-map Include source map in bundle (default: true)
--strict Treat warnings as errors
Exit codes:
0 — Success
1 — Compilation error (syntax, type, or verification error)
2 — File system error (can't read source, can't write output)
|
Example:
| $ scaicore compile ./invoice-processor/
Compiling InvoiceProcessor v1.0.0...
✓ Parsed 3 files (42 flows, 8 transformers, 3 evaluators)
✓ Resolved all imports and types
✓ Type-checked 53 callable units
✓ Verified 18 @rigid blocks for determinism
✓ Verified entity key enforcement (3 entry-point flows)
✓ Verified 2 @checkpoint placements
→ build/invoice-processor.scaicore-ir (128 KB)
$ scaicore compile --emit-ir ./invoice-processor/ | head -20
IRModule: InvoiceProcessor v1.0.0
instance_mode: stateless
plugins: [scaidrive, scaisend, ocr, erp]
models: [primary (scailabs/poolnoodle-omni), fast (scailabs/poolnoodle-mini)]
memory: {vendor_aliases: map[string, string], ...}
reference: {gl_code_catalog: map[string, GLCodeEntry], ...}
triggers: [email_invoice → process_invoice, ...]
flow process_invoice(raw: RawInvoice): ProcessingResult
[0] rigid: file_content = scaidrive.download(raw.file_id)
[1] rigid: ocr_result = ocr.extract_text(file_content)
[2] flexible: goal="Extract structured invoice data" → ExtractedInvoice
[3] call: validate_invoice(extracted, true)
[4] rigid: vendor_profile = memory.vendor_aliases.get(...)
...
|
13. Summary of Design Decisions
| Decision | Choice | Rationale |
|---|---|---|
| IR level | High-level tree (not bytecode) | LLM calls dominate execution time; tree-walking is adequate |
| Serialization | MessagePack | Compact, fast, same format as checkpoints |
| Type resolution | Fully resolved at compile time | No runtime type lookups, better error messages |
| Block representation | Discriminated union (kind field) | Simple dispatch, extensible for future block types |
| Source locations | On every node | Runtime errors point back to source |
| @debug blocks | Compiled out by default | Zero production overhead |
| Expression evaluation | Tree-walking | Simple, debuggable, sufficient performance |
| Plugin interfaces | Optionally resolved at compile time | Enables type checking against plugin methods |
| Bundle format | Self-contained file + manifest | Deployable unit, manifest readable without loading IR |
| IR version | schema_version field in IRModule | Forward compatibility as IR evolves |