---
summary: "The data model behind ScaiCore's human-in-the-loop pauses \u2014 what a\
  \ checkpoint is, who it's assigned to, how it expires, and how it resolves."
title: Checkpoints
path: modules/scaicore/concepts/checkpoints
status: published
---

A checkpoint is a paused execution inside a Core, persisted to MariaDB while it waits for a human (or another system) to make a decision. The IR-level details of *when* a checkpoint is created live in the language — see [/docs/scaicore](https://www.scailabs.ai/docs/scaicore). This page is about what ScaiGrid does with the checkpoint once it exists.

## The row

Every checkpoint is one row in `mod_scaicore_checkpoints`. The fields you care about as a wrapper user:

| Field | Purpose |
|---|---|
| `id` | UUID; also the correlation id used by the ScaiQueue HITL bridge. |
| `core_id`, `tenant_id` | Owning Core and tenant. |
| `execution_id`, `flow_name`, `block_index` | Where in the program execution suspended. |
| `instance_key` | For entity-mode Cores, the entity that suspended. |
| `checkpoint_type` | Kind of decision — `approval`, `choice`, `freeform`, etc. (set by the program). |
| `prompt` | The question shown to the assignee. |
| `options` | Optional JSON list of canonical answer choices. |
| `assignee_raw` | The raw assignee string from the program — e.g. `user:alice@acme`, `group:approvers`, `role:tenant_admin`. |
| `assignee_type` | Parsed type: `user`, `group`, `role`, `delegated_user`, or `unrouted`. |
| `assignee_resolved` | Parsed structure with the type + value. |
| `context` | Arbitrary JSON the program attached. Includes `resolved_skills` when bound. |
| `status` | `pending`, `resolved`, `expired`, `cancelled`. |
| `priority` | `low`, `normal`, `high`, `critical`. |
| `expires_at`, `expiry_action`, `escalation_target` | Lifecycle controls. |
| `notification_sent`, `reminder_count`, `reminder_interval_m` | Notification bookkeeping. |
| `resolution`, `resolved_by`, `resolved_at` | Populated when a human acts. |

The state blob (the captured execution state the engine needs to resume) is stored in S3, keyed by `state_s3_key`. The row is the index; S3 holds the bytes.

## Assignment

The program writes a raw assignee string. The wrapper parses the `<type>:<value>` prefix:

- `user:<email-or-id>` — one human.
- `group:<group-id>` — anyone in the group can act.
- `role:<role>` — anyone holding the role can act.
- (no prefix) — treated as `user:` with the raw value.

If the Core is delegated to a specific user, ScaiCore programs can also produce `delegated_user` checkpoints — they route to whatever human the delegation is currently scoped to.

When no assignee can be resolved (typical for offline-debugging Cores), the type is `unrouted`. Unrouted checkpoints accumulate in the admin UI's checkpoint queue for a tenant admin to claim manually.
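The prefix rules above can be sketched as a small parser. This is a minimal illustration, not the wrapper's actual code: `parse_assignee` and `Assignee` are hypothetical names, and two behaviours are assumptions, namely that an empty assignee maps to `unrouted` and that an unrecognised prefix falls back to `user:`.

```python
from dataclasses import dataclass

# Prefixes named in the list above.
KNOWN_TYPES = {"user", "group", "role"}


@dataclass(frozen=True)
class Assignee:
    type: str
    value: str


def parse_assignee(raw):
    """Parse a '<type>:<value>' assignee string (hypothetical helper)."""
    text = (raw or "").strip()
    if not text:
        # Assumption: nothing to resolve, so the checkpoint is unrouted.
        return Assignee("unrouted", "")
    prefix, sep, value = text.partition(":")
    if sep and prefix in KNOWN_TYPES and value:
        return Assignee(prefix, value)
    # No recognised prefix: treated as user: with the raw value.
    return Assignee("user", text)
```

For example, `parse_assignee("group:approvers")` yields `Assignee(type="group", value="approvers")`, while a bare `alice@acme` is treated as a user.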

## Notifications

If a notifier is configured (email transport, Slack, etc.), the wrapper sends a notification at create time and marks `notification_sent = true`. The `send_checkpoint_reminders` cron runs every 15 minutes and re-pings still-pending checkpoints whose `reminder_interval_m` is set, incrementing `reminder_count` each time.
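As a rough sketch, the reminder cron's selection rule amounts to a filter over rows. The function name `reminder_pass` and the dict-shaped rows are assumptions for illustration. The sketch also does not model whether the interval has elapsed since the last ping, because this page does not say how that is tracked.

```python
def reminder_pass(rows):
    """Return the ids re-pinged in one pass (hypothetical helper).

    Mutates reminder_count on each selected row.
    """
    pinged = []
    for row in rows:
        if row["status"] != "pending":
            continue  # only still-pending checkpoints are re-pinged
        if not row.get("reminder_interval_m"):
            continue  # reminders are opt-in via reminder_interval_m
        row["reminder_count"] = row.get("reminder_count", 0) + 1
        pinged.append(row["id"])
    return pinged
```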

Email recipients are extracted from the resolved assignee. Values that look like email addresses are passed straight through; resolving a group or role to a list of addresses is the notifier's responsibility.

## Expiry

The `expire_checkpoints` cron runs every 5 minutes and picks up rows where `expires_at <= now()` and `status = pending`. The `expiry_action` decides what happens:

- `cancel` — the row is marked `cancelled`. The program never resumes from this checkpoint.
- `default_option` — the row is marked `resolved` with `decision: "default"` and `auto_expired: true`. The runtime resumes as if a human had picked the default.
- `escalate` — the row's `assignee_raw` is replaced with `escalation_target`, the notifier is asked to escalate, and the row stays `pending` for the new assignee.
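The three expiry actions can be sketched as a dispatch over a row. This is an illustration under assumptions, not the cron's real code: `apply_expiry` is a hypothetical name, rows are plain dicts, and storing `decision` and `auto_expired` inside the `resolution` JSON is a guess at where those values live. The notifier call on escalation is omitted, and `expires_at` is left untouched because this page does not say how it is updated.

```python
from datetime import datetime, timezone


def apply_expiry(row, now=None):
    """Apply expiry_action to one checkpoint row (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    expires_at = row.get("expires_at")
    # Only pending rows whose deadline has passed are picked up.
    if row["status"] != "pending" or expires_at is None or expires_at > now:
        return row

    action = row["expiry_action"]
    if action == "cancel":
        row["status"] = "cancelled"
    elif action == "default_option":
        row["status"] = "resolved"
        # Assumption: these values are stored in the resolution JSON.
        row["resolution"] = {"decision": "default", "auto_expired": True}
    elif action == "escalate":
        # The row stays pending, now addressed to the escalation target.
        row["assignee_raw"] = row["escalation_target"]
    else:
        raise ValueError(f"unknown expiry_action: {action!r}")
    return row
```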

## Resolution

A human (or any caller with `scaicore:checkpoint_resolve`) posts to `/checkpoints/{id}/resolve`:

```json
{
  "decision": "approve",
  "response_data": { "amount_approved": 500 },
  "comment": "Within limits."
}
```

The row's `status` flips to `resolved`, the `resolution` JSON captures the decision + response + comment, and `resolved_by` / `resolved_at` get set. The runtime is responsible for picking up the resolution and resuming the suspended execution.
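A client call to the resolve endpoint might be assembled as below. This is a sketch using only the Python standard library. The base URL, the bearer-token header, and the helper name `build_resolve_request` are assumptions; only the path and the body fields come from this page.

```python
import json
import urllib.request


def build_resolve_request(base_url, checkpoint_id, token, decision,
                          response_data=None, comment=None):
    """Build (but do not send) a POST to /checkpoints/{id}/resolve."""
    body = {"decision": decision}
    if response_data is not None:
        body["response_data"] = response_data
    if comment is not None:
        body["comment"] = comment
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/checkpoints/{checkpoint_id}/resolve",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the API accepts a bearer token.
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the caller's token needs the `scaicore:checkpoint_resolve` permission.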

A checkpoint can also be **cancelled** (`POST /checkpoints/{id}/cancel`) without a decision, or **reassigned** (`POST /checkpoints/{id}/reassign`) to a different assignee — which re-runs the assignment parser and re-fires the notifier.

## Frozen skill versions

If the Core has bound ScaiSkills, the resolved skill set at checkpoint-creation time is frozen into the row's `context.resolved_skills`. On resume, the runtime reads those pinned versions back instead of re-resolving — so the resumed execution uses the exact same skill versions it suspended with, even if a new version has been published or yanked since.

Yanked pinned versions are allowed through with a warning (ScaiSkills ERRATA-v0.2 option 2). Missing pinned skill rows are also tolerated: a warning is logged and the resume proceeds.
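The resume-time behaviour can be sketched as follows. The shape of `context.resolved_skills` (a list of `name` / `version` pairs), the `skill_rows` lookup, and the function name `skills_for_resume` are all assumptions made for illustration. The point is only that pinned versions are returned unchanged and problems produce warnings rather than failures.

```python
import logging

log = logging.getLogger("checkpoint_resume")


def skills_for_resume(context, skill_rows):
    """Return (pinned skills, warnings) without re-resolving versions.

    skill_rows maps (name, version) to a dict such as {"yanked": bool};
    both shapes are hypothetical.
    """
    pinned = context.get("resolved_skills") or []
    warnings = []
    for pin in pinned:
        key = (pin["name"], pin["version"])
        row = skill_rows.get(key)
        if row is None:
            warnings.append(f"pinned skill {key} has no row; resuming anyway")
        elif row.get("yanked"):
            warnings.append(f"pinned skill {key} was yanked; resuming anyway")
    for message in warnings:
        log.warning(message)
    # The pinned list is used as-is, even when warnings were raised.
    return pinned, warnings
```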

## ScaiQueue HITL bridge

Programs can publish a checkpoint as a `hitl_request` message into ScaiQueue (with `hitl_message_id` recorded on the row). When the ScaiQueue message is completed, an in-process `scaiqueue.message.completed` event fires. The wrapper's handler — registered by the module — looks the checkpoint up by `correlation_id == checkpoint_id` and resolves it internally.

The handler is idempotent: a re-delivered completion event finds the checkpoint already resolved and returns cleanly. If the event's `correlation_id` does not match one of our checkpoints, the handler exits without touching anything.
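The idempotency described above could look roughly like this. The event payload shape, the in-memory `checkpoints` dict, and the name `on_message_completed` are assumptions for the sketch; the real handler works against the database and the module's event bus.

```python
def on_message_completed(event, checkpoints):
    """Handle a scaiqueue.message.completed event (hypothetical sketch).

    checkpoints maps checkpoint id to a row dict.
    """
    row = checkpoints.get(event.get("correlation_id"))
    if row is None:
        # Not one of our checkpoints: leave everything untouched.
        return "ignored"
    if row["status"] != "pending":
        # Re-delivered event, or the checkpoint was already closed.
        return "already_handled"
    row["status"] = "resolved"
    row["resolution"] = event.get("result", {})
    return "resolved"
```

Calling it twice with the same event returns `"resolved"` and then `"already_handled"`, and the row is only written once.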

## Audit history

`GET /checkpoints/{id}/history` returns a simplified event list — created, notification sent, resolved. It is not a full append-only audit log; for that, use ScaiGrid's audit-events pipeline filtered by `module=scaicore`.
