Embed the runtime
The scaicore CLI is a thin convenience over the same runtime your host will embed. When you outgrow the CLI — because you need to plug in your own model provider, persist memory in a real database, handle checkpoints through a queue, or invoke flows from inside a larger Python service — you embed CoreEngine directly. This tutorial walks the embed path end-to-end with the human-approval flow as the example.
It assumes you've installed the scaicore package and have a compiled bundle on disk (scaicore compile greet.scaicore -o build/hello.scaicore-ir).
1. Load the bundle
A compiled .scaicore-ir file is a MessagePack blob with a magic-bytes header. The serializer roundtrips it to an IRModule:
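As a rough illustration of what a magic-bytes header buys you, the sketch below rejects anything that is not a bundle before a decoder ever sees it. The magic value `b"SCIR"` and the helper `split_bundle` are hypothetical, chosen only for this example; the real header bytes and the scaicore serializer API are not shown here.

```python
# Hypothetical magic value; the real scaicore header bytes are not stated in this tutorial.
MAGIC = b"SCIR"


def split_bundle(blob: bytes, magic: bytes = MAGIC) -> bytes:
    """Return the payload that follows the magic header, or raise ValueError."""
    if not blob.startswith(magic):
        raise ValueError("not a compiled bundle: magic header missing")
    return blob[len(magic):]


# The bytes after the header are the MessagePack encoding of {"name": "hello"}.
payload = split_bundle(MAGIC + b"\x81\xa4name\xa5hello")
```

Checking the header first means a truncated or wrong file fails with a clear error instead of a confusing decode failure deeper in the loader.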
The runtime treats IRModule as the immutable contract between compiler and executor. You don't construct it by hand; you load it.
2. Build the host environment
HostEnvironment is the dependency-injection boundary. Anything the runtime needs from outside — memory persistence, model providers, plugins, a checkpoint store, an event sink, a clock — lives on this object. The in-memory implementations shipped with the package are enough for a first run:
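To make the dependency-injection idea concrete, here is a minimal sketch of an environment object whose fields can be swapped independently. `SketchEnvironment`, `MemoryStore`, and `DictMemoryStore` are hypothetical stand-ins invented for this example; they are not the scaicore classes and only model two of the dependencies listed above.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Protocol


class MemoryStore(Protocol):
    """The interface the runtime would depend on (assumed shape)."""

    def get(self, key: str) -> Any: ...
    def put(self, key: str, value: Any) -> None: ...


class DictMemoryStore:
    """In-memory stand-in; a production host might back this with Redis."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}

    def get(self, key: str) -> Any:
        return self._data.get(key)

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value


@dataclass
class SketchEnvironment:
    """Hypothetical analogue of HostEnvironment, NOT the scaicore class."""

    memory: MemoryStore = field(default_factory=DictMemoryStore)
    clock: Callable[[], float] = time.time


# A test can inject a fixed clock without touching any code that uses the environment.
env = SketchEnvironment(clock=lambda: 0.0)
env.memory.put("greeting", "hello")
```

Because callers only see the `MemoryStore` interface, replacing the in-memory store with a persistent one is a change at construction time only.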
Production hosts swap each of these for real implementations: a Redis-backed memory store, a routing model provider that fans out to several vendors (Anthropic, OpenAI, etc.), a queue-backed checkpoint store, and a Kafka event sink.
3. Load the engine
CoreEngine.load(module, environment) validates the module against the environment (e.g., required plugins are available) and returns an engine ready to invoke. Auto-telemetry is on by default:
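The value of validating at load time is that a missing dependency surfaces before any flow runs. The sketch below shows that kind of check in isolation; `check_plugins` and the plugin names are hypothetical, and the real validation performed by CoreEngine.load may cover more than plugins.

```python
def check_plugins(required: list[str], available: set[str]) -> None:
    """Raise LookupError naming every required plugin that is not available."""
    missing = sorted(set(required) - available)
    if missing:
        raise LookupError("missing plugins: " + ", ".join(missing))


check_plugins(["http"], {"http", "sql"})  # everything is available, so no error

try:
    check_plugins(["http", "vector", "email"], {"http"})
except LookupError as exc:
    error_message = str(exc)
```

Reporting all missing names at once, rather than failing on the first, saves a round of fix-and-retry when wiring a new host.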
Pass auto_telemetry=False if you don't want the runtime firing invocation.* and block.* lifecycle events through your event sink. The default is True when an event sink is wired (see the changelog v1.2.0).
4. Invoke a flow
InvocationRequest carries the flow name, input dict, optional identity, and trigger context:
engine.invoke is sync — it drives the executor's async API through asyncio.get_event_loop().run_until_complete internally. If your host is already running an event loop, use engine.ainvoke (same signature, returns a coroutine).
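The sync/async split can be reproduced with the standard library alone. In the sketch below, `ainvoke` and `invoke` are stand-in functions, not the engine's methods, and the sync wrapper uses `asyncio.run` rather than the `get_event_loop().run_until_complete` call described above; the constraint it illustrates is the same: a blocking wrapper cannot be used from inside a loop that is already running.

```python
import asyncio


async def ainvoke(request: dict) -> dict:
    """Stand-in for an async entry point; not the scaicore implementation."""
    await asyncio.sleep(0)  # placeholder for real asynchronous work
    return {"flow": request["flow"], "status": "done"}


def invoke(request: dict) -> dict:
    """Sync wrapper: usable only when no event loop is running in this thread."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No running loop, so it is safe to drive the coroutine to completion.
        return asyncio.run(ainvoke(request))
    raise RuntimeError("event loop already running; await ainvoke() instead")


async def handler() -> dict:
    # Inside a running loop (for example an async web handler), await directly.
    return await ainvoke({"flow": "greet"})


sync_result = invoke({"flow": "greet"})
async_result = asyncio.run(handler())
```

A host built on an async framework would follow the `handler` pattern; a script or a worker thread without a loop can use the blocking form.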
5. Branch on the status
The three terminal states need three different handlers:
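A minimal sketch of the three-way branch follows. Only SUSPENDED is named in this tutorial; the `COMPLETED` and `FAILED` names, the `Result` dataclass, and its flattened `checkpoint_id` field are assumptions made for illustration and may not match the scaicore result type.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    # SUSPENDED comes from the text; the other two names are assumptions.
    COMPLETED = "completed"
    SUSPENDED = "suspended"
    FAILED = "failed"


@dataclass
class Result:
    """Hypothetical result shape, not the scaicore type."""

    status: Status
    output: Optional[dict] = None
    checkpoint_id: Optional[str] = None
    error: Optional[str] = None


def handle(result: Result, pending: dict) -> str:
    if result.status is Status.COMPLETED:
        return f"done: {result.output}"
    if result.status is Status.SUSPENDED:
        # Store the id so a later human decision can be routed back to this run.
        pending[result.checkpoint_id] = "awaiting decision"
        return f"parked: {result.checkpoint_id}"
    return f"failed: {result.error}"


pending: dict = {}
message = handle(Result(Status.SUSPENDED, checkpoint_id="cp-1"), pending)
```

In a real host the `pending` dict would be the queue or database table that holds checkpoint ids until a decision arrives.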
For the human-approval flow, the @checkpoint causes SUSPENDED. The result.checkpoint.checkpoint_id is a stable identifier the host stores and routes — typically into a queue or a database table — alongside whatever UI the human will use to decide.
6. Resume
When the human's decision is back, the host calls engine.resume with the same checkpoint id and a resolution dict. The runtime restores the flow's scope from the checkpoint, runs whatever on_response arms the @checkpoint declared (or binds the resolution directly), and continues from the block immediately after the @checkpoint:
The resolution dict keys are whatever the flow expects. For the human-approval tutorial the key is "decision"; the runtime extracts it for the @match block. Custom on_response shapes use the full dict.
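The suspend-then-resume round trip can be modeled in a few lines. This toy version is not the runtime: `start`, `resume`, the module-level `checkpoints` dict, and the `"approve"` value are all hypothetical. It only shows the shape of the exchange, with a saved scope keyed by checkpoint id and a resolution dict carrying a "decision" key.

```python
import uuid

# Saved flow scopes keyed by checkpoint id (an in-memory stand-in for a real store).
checkpoints: dict[str, dict] = {}


def start(request_text: str) -> str:
    """Run up to the approval step, save the scope, and return a checkpoint id."""
    checkpoint_id = uuid.uuid4().hex
    checkpoints[checkpoint_id] = {"request": request_text}
    return checkpoint_id


def resume(checkpoint_id: str, resolution: dict) -> str:
    """Restore the saved scope and continue with the human's decision."""
    # pop() raises KeyError if the id is unknown or was already resumed.
    scope = checkpoints.pop(checkpoint_id)
    if resolution["decision"] == "approve":
        return f"approved: {scope['request']}"
    return f"rejected: {scope['request']}"


cp = start("refund order 1001")
outcome = resume(cp, {"decision": "approve"})
```

Removing the checkpoint on resume makes a second resume with the same id fail loudly, which is one simple way to guard against a decision being applied twice.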
What you have
A host that loads a compiled Core, exposes a typed entry point per flow, persists checkpoints into a backend you control, and can resume any suspended flow by id. The same code shape powers ScaiFlow and ScaiGrid: the in-memory implementations swap out for production-grade ones, and the API stays identical.
Next moves a real host typically makes:
- Replace MockModelProvider with a routing provider that reads each flow's @llm role and dispatches accordingly.
- Replace InMemoryCheckpointBackend with a database-backed implementation that survives process restarts.
- Subscribe to the auto-telemetry events on the event sink for tracing and dashboards. See the changelog v1.2.0 for the event names and payload shapes.
- Implement a CoreDispatcher if your Cores call other Cores.
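As one possible starting point for the database-backed checkpoint step, the sketch below persists checkpoint scopes in SQLite using only the standard library. The class name `SqliteCheckpointStore` and its `save`/`load` methods are hypothetical; the interface a real checkpoint backend must implement is defined by scaicore and may differ.

```python
import json
import os
import sqlite3
import tempfile
from typing import Optional


class SqliteCheckpointStore:
    """Hypothetical store; the real backend interface may differ."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints "
            "(id TEXT PRIMARY KEY, scope TEXT NOT NULL)"
        )
        self._db.commit()

    def save(self, checkpoint_id: str, scope: dict) -> None:
        # Scopes are stored as JSON text, so they must be JSON-serializable.
        self._db.execute(
            "INSERT OR REPLACE INTO checkpoints (id, scope) VALUES (?, ?)",
            (checkpoint_id, json.dumps(scope)),
        )
        self._db.commit()

    def load(self, checkpoint_id: str) -> Optional[dict]:
        row = self._db.execute(
            "SELECT scope FROM checkpoints WHERE id = ?", (checkpoint_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def close(self) -> None:
        self._db.close()


path = os.path.join(tempfile.mkdtemp(), "checkpoints.db")
store = SqliteCheckpointStore(path)
store.save("cp-1", {"request": "refund order 1001"})
store.close()

# Opening a fresh connection stands in for a process restart.
reopened = SqliteCheckpointStore(path)
restored = reopened.load("cp-1")
reopened.close()
```

Because the data lives in a file rather than in process memory, a checkpoint written before a crash or deploy is still there when the host comes back up.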