Webhooks reference
Operational details on ScaiControl's outbound webhook dispatcher. For the topic catalog see Events; for the conceptual overview see Concepts: webhooks.
Where the dispatcher lives#
backend/src/scaicontrol/workers/event_dispatcher.py — runs as an arq cron every 30 seconds.
Storage#
Two tables:
- webhook_subscriptions: one row per subscriber. Managed via /admin/webhook-subscriptions.
- event_outbox: durable log of every event AND every delivery attempt. Two row generations live here:
  - Source rows (subscription_id IS NULL): written by emit_event() inside the originating transaction. One per logical event.
  - Delivery rows (subscription_id set): created by the dispatcher when it fans the source row out to matching subscribers. Each carries its own retry state.
Dispatcher tick#
Each 30-second tick does two passes:
Pass 1 — fan-out. Claim up to 50 pending source rows; for each, look up subscribers whose topics[] glob matches the event_type, and INSERT one delivery row per match. Flip the source row to dispatched. The (subscription_id, idempotency_key) unique constraint catches double-fan-out on crash recovery.
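The subscriber lookup in pass 1 can be pictured with a small sketch. It assumes shell-style glob semantics as implemented by Python's fnmatch module; the dispatcher's actual matcher is not shown on this page and may differ. The function name, subscription ids, and sample topics are hypothetical.

```python
from fnmatch import fnmatchcase


def matching_subscriptions(event_type, subscriptions):
    """Return ids of subscriptions with at least one topic glob matching event_type.

    `subscriptions` maps a subscription id to its list of topic globs.
    Assumes fnmatch-style globs; the real dispatcher may use different rules.
    """
    return [
        sub_id
        for sub_id, topics in subscriptions.items()
        if any(fnmatchcase(event_type, pattern) for pattern in topics)
    ]


# Hypothetical subscribers, for illustration only.
subs = {
    "sub_a": ["subscription.*"],
    "sub_b": ["tenant.billing_updated"],
    "sub_c": ["*"],
}
print(matching_subscriptions("subscription.activated", subs))
```

Each id returned here would correspond to one delivery row inserted for the source event.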
Pass 2 — deliver. Claim up to 200 pending delivery rows that are ready (next_attempt_at <= now); build the canonical envelope, HMAC-sign, POST to target_url with a 10-second timeout. Update status per the response code:
| Outcome | Status | Next |
|---|---|---|
| 2xx | dispatched | terminal |
| 409 | dispatched (idempotent ack) | terminal |
| 4xx (except 409) | dead | terminal |
| 5xx / timeout / network | pending | backoff: 1m → 5m → 30m → 2h → 12h → 24h, then dead |
Both passes commit their own transaction. A crash mid-tick is safe — the next tick picks up where it left off.
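The outcome table can be read as a small classification function. This is a sketch, not the dispatcher's code: the function name is hypothetical, a timeout or network error is represented as None, and response codes the table does not list (such as 3xx) fall through to pending here, which is an assumption.

```python
from typing import Optional


def next_status(response_code: Optional[int]) -> str:
    """Map a delivery outcome to a row status, following the outcome table.

    `response_code` is None for a timeout or network error.
    """
    if response_code is None:
        return "pending"      # timeout / network: retry with backoff
    if 200 <= response_code < 300 or response_code == 409:
        return "dispatched"   # success, or idempotent ack
    if 400 <= response_code < 500:
        return "dead"         # any other 4xx is terminal
    return "pending"          # 5xx (and unlisted codes, by assumption): retry


for code in (200, 409, 404, 503, None):
    print(code, next_status(code))
```

A pending result is not final: the row still goes dead once the backoff schedule is exhausted.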
Backoff schedule#
Retryable failures (5xx, timeout, network error) are retried after 1m, 5m, 30m, 2h, 12h, and 24h.
After MAX_ATTEMPTS, the row goes dead and stops being retried. Dead rows are kept indefinitely for audit; clear them via a manual SQL purge if storage matters.
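The schedule can be sketched as a lookup from the number of failed attempts to the next retry time. The delays come from the outcome table; the names BACKOFF and next_attempt_at are hypothetical, and the sketch assumes the row goes dead once every delay has been used, since the exact value of MAX_ATTEMPTS is not shown on this page.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Delays from the outcome table: 1m, 5m, 30m, 2h, 12h, 24h.
BACKOFF = [
    timedelta(minutes=1),
    timedelta(minutes=5),
    timedelta(minutes=30),
    timedelta(hours=2),
    timedelta(hours=12),
    timedelta(hours=24),
]


def next_attempt_at(failed_attempts: int, now: datetime) -> Optional[datetime]:
    """Return when to retry after `failed_attempts` failures, or None if dead.

    Assumption: the row is dead once all delays in BACKOFF are used up.
    """
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be >= 1")
    if failed_attempts > len(BACKOFF):
        return None
    return now + BACKOFF[failed_attempts - 1]


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(next_attempt_at(3, now))  # third failure: retry 30 minutes later
print(next_attempt_at(7, now))  # schedule exhausted: None, row is dead
```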
Idempotency#
idempotency_key is constructed in domain code and shared across the source row + all delivery rows for the same logical event. Subscribers should use it as their inbox dedup key.
Format: <resource_type>:<resource_id>:<event_type>:<lifecycle_step>. Examples:
- subscription:sub_abc:activated:initial
- subscription:sub_abc:cancelled:reaper
- pack_subscription:pak_xyz:activated:initial
- tenant:tnt_xyz:billing_updated:2026-05-12T15:22:00
Because a (subscription_id, idempotency_key) pair can only be inserted once, re-fanning-out a source row after a dispatcher crash cannot create a second delivery row for the same subscriber and logical event. Retries of an existing delivery row reuse the same key, so subscribers should still dedup on it.
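On the subscriber side, an inbox keyed on idempotency_key is enough to make processing happen once. The sketch below uses an in-memory SQLite table purely for illustration; the table name inbox and the function record_once are hypothetical, and a real subscriber would use its own durable store.

```python
import sqlite3


def record_once(conn: sqlite3.Connection, idempotency_key: str) -> bool:
    """Return True the first time a key is seen, False on any repeat.

    The PRIMARY KEY on inbox.idempotency_key makes check-and-insert atomic.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO inbox (idempotency_key) VALUES (?)",
        (idempotency_key,),
    )
    conn.commit()
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inbox (idempotency_key TEXT PRIMARY KEY)")

key = "subscription:sub_abc:activated:initial"
print(record_once(conn, key))  # first delivery: process the event
print(record_once(conn, key))  # retry or replay: skip processing
```

A handler that gets False back can skip the work and still acknowledge the delivery, so the dispatcher stops retrying.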
Signing#
Each delivery is HMAC-signed over raw_body_bytes using the subscription's secret.
raw_body_bytes is the byte-exact body — re-serialising the parsed JSON will not produce the same signature.
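A subscriber-side verification might look like the sketch below. This page does not state the digest algorithm, the signature encoding, or the header that carries the signature, so HMAC-SHA256 with hex encoding is an assumption here, and the function names and sample values are hypothetical.

```python
import hashlib
import hmac


def sign(secret: bytes, raw_body_bytes: bytes) -> str:
    """Hex HMAC over the exact request bytes (SHA-256 is an assumption)."""
    return hmac.new(secret, raw_body_bytes, hashlib.sha256).hexdigest()


def verify(secret: bytes, raw_body_bytes: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign(secret, raw_body_bytes)
    return hmac.compare_digest(expected, received_signature)


secret = b"example-secret"
body = b'{"event_type": "subscription.activated"}'
signature = sign(secret, body)

# Verifying the bytes exactly as received succeeds.
print(verify(secret, body, signature))
# Changing whitespace, as a parse-and-dump round trip can, breaks the match.
print(verify(secret, body.replace(b": ", b":"), signature))
```

The second call shows why verification has to run on the raw request body before any JSON parsing.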
The secret resolves in this order:
- If webhook_subscriptions.secret_vault_path is set → look up in ScaiVault.
- Else use the inline secret column.
- If neither resolves → the delivery row goes dead with error_message="no secret resolved for subscription".
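The resolution order can be sketched as follows. The ScaiVault client is stood in for by a plain callable, and the names resolve_secret, NoSecretResolved, and the sample vault path are hypothetical. The sketch also assumes that a set vault path is never combined with a fallback to the inline column, which is one reading of the list above and may not match the dispatcher.

```python
from typing import Callable, Optional


class NoSecretResolved(Exception):
    """Neither the vault path nor the inline column produced a secret."""


def resolve_secret(
    secret_vault_path: Optional[str],
    inline_secret: Optional[str],
    vault_lookup: Callable[[str], Optional[str]],
) -> str:
    if secret_vault_path:
        # A vault path takes precedence over the inline column.
        secret = vault_lookup(secret_vault_path)
    else:
        secret = inline_secret
    if not secret:
        # The dispatcher would mark the delivery row dead with this message.
        raise NoSecretResolved("no secret resolved for subscription")
    return secret


# A dict stands in for ScaiVault in this sketch.
fake_vault = {"kv/hooks/example": "from-vault"}.get

print(resolve_secret("kv/hooks/example", "inline-value", fake_vault))
print(resolve_secret(None, "inline-value", fake_vault))
```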
Rotating a secret: edit the subscription via the admin UI (PATCH secret to the new value), confirm the subscriber accepts the new key, then delete the old one in their config. There's no overlap window built in; for zero-downtime rotation, run two subscriptions in parallel during the cutover.
Operational notes#
- No tenant filter. The outbox is platform-wide. Webhook subscribers see events for every tenant in the system; topic filters are the only mechanism for narrowing.
- No rate limiting. If you need to throttle a subscriber, do it on their side (queue inside their inbox handler and ack 200 immediately).
- Inspection. SELECT * FROM event_outbox WHERE event_type LIKE 'subscription.%' ORDER BY created_at DESC LIMIT 50 gives a recent timeline. Delivery rows have response_code + response_body_sample (first 512 chars) for debugging.
- Replays. To re-emit a logical event, INSERT a new source row with a fresh event_id but the same idempotency_key; subscribers that have already seen the key will dedup, those that haven't will receive it. Or just trigger the original domain action again if it's idempotent.
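The subscriber-side throttling mentioned under "No rate limiting" amounts to accepting the delivery, queueing it, and processing at the subscriber's own pace. The sketch below shows the shape with an in-memory queue and a worker thread; every name in it is hypothetical, and a production subscriber would want a durable queue so that acknowledged events survive a restart.

```python
import queue
import threading

inbox = queue.Queue()
processed = []


def handle_delivery(raw_body_bytes: bytes) -> int:
    """Accept the delivery immediately; the real work happens in the worker."""
    inbox.put(raw_body_bytes)
    return 200


def worker() -> None:
    """Drain the inbox at the subscriber's own pace."""
    while True:
        body = inbox.get()
        try:
            processed.append(body)  # placeholder for slow business logic
        finally:
            inbox.task_done()


threading.Thread(target=worker, daemon=True).start()

status = handle_delivery(b'{"event_type": "subscription.activated"}')
inbox.join()  # wait for the worker; only needed in this demo
print(status, len(processed))
```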
Adding a new topic#
- Define the payload shape in code (services/events/builders.py) and write a JSON Schema in docs/integrations/scaicontrol/events/.
- Call emit_event() at the appropriate domain mutation site.
- Add the topic to the Catalog docs page.
- No version bump is needed: adding topics is backwards-compatible.
- The dispatcher needs no changes — it's topic-agnostic.
Adding a subscriber#
POST /api/v1/admin/webhook-subscriptions:
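The request body sets the subscription's fields, such as target_url, topics[], and either secret or secret_vault_path.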
Or via the UI at /admin/webhook-subscriptions. Either way, deliveries start on the next dispatcher tick (within 30 seconds).
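A request to the endpoint could be assembled as in the sketch below. The exact request schema is not shown on this page: the body field names are guessed from the column names mentioned above, and the base URL, token, bearer-style Authorization header, and subscriber URL are all hypothetical placeholders.

```python
import json
import urllib.request


def build_create_request(base_url: str, token: str, subscription: dict) -> urllib.request.Request:
    """Build (but do not send) the POST that creates a webhook subscription."""
    body = json.dumps(subscription).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/admin/webhook-subscriptions",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # auth scheme is an assumption
        },
    )


request = build_create_request(
    "https://scaicontrol.example.com",  # hypothetical host
    "ADMIN_TOKEN",                      # hypothetical credential
    {
        # Field names are assumptions based on the columns named on this page.
        "target_url": "https://subscriber.example.com/hooks/scaicontrol",
        "topics": ["subscription.*"],
        "secret": "example-secret",
    },
)
print(request.get_method(), request.full_url)
# Sending is left out of the sketch: urllib.request.urlopen(request)
```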