
Architecture

ScaiQueue is a thin product layer over MariaDB and Redis. Messages are durable rows; the per-queue order and the claim locks live in Redis; system agents enforce visibility timeouts, retries, archival, and dead-lettering on a cron schedule.

Components

A producer publishes over HTTP. ScaiGrid's MessageService writes the row, updates Redis, and (optionally) invokes the routing engine. A consumer claims via HTTP; on completion ScaiGrid emits an event on its internal event bus. System agents do the housekeeping out-of-band.

flowchart LR
    P[Producer]
    C[Consumer]
    subgraph SG[ScaiGrid FastAPI process]
        MS[MessageService<br/>- row in MariaDB<br/>- ZADD pending ZSET<br/>- RoutingEngine.eval]
        CL[POST claim<br/>- ZPOPMIN pending<br/>- SET NX EX claim lock<br/>- UPDATE state=claimed]
        CO[POST complete / fail /<br/>release / extend]
        SA[System agents arq cron<br/>- visibility_timeout<br/>- expiry_enforcer<br/>- priority_aging<br/>- dead_letter_monitor<br/>- message_archiver]
    end
    P -- POST publish --> MS
    C -- POST claim --> CL
    CL --> C
    C -- complete/fail/release/extend --> CO

There is no separate ScaiQueue daemon. The HTTP endpoints, the routing engine, and the cron-driven system agents all run inside ScaiGrid's existing FastAPI and arq processes.

Lifecycle of one message

  1. Publish. MessageService.publish validates scope and queue state (paused, archived, or draining queues reject the publish), writes the message row with state=pending, computes a sort score based on the queue's ordering mode, and ZADDs the id into the per-queue pending sorted set. The routing engine then evaluates rules with the message_published trigger; a matching rule may route, transform, escalate, or archive the message.
  2. Claim. A consumer calls claim. The service ZPOPMINs the lowest-score id, takes a Redis SET NX EX lock keyed by the message id with a TTL of visibility_timeout_s, then updates the row to state=claimed, sets claimed_by_*, and increments attempts. The queue's depth_pending decrements and depth_claimed increments.
  3. Complete or fail. Complete moves the row to completed, releases the Redis lock, decrements depth_claimed, and emits a scaiqueue.message.completed event on ScaiGrid's event bus with the original correlation_id. Fail records failure_reason; if attempts < max_retries the message returns to pending with a backoff, otherwise it is moved to the scope's _dead_letter queue.
  4. Reclaim on crash. If the consumer never calls complete or fail, the claim lock expires. The visibility_timeout_enforcer cron job (runs every second) flips the row back to pending and re-adds it to the pending sorted set.
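The four steps above can be sketched as a small in-memory simulation. This is an illustration only, not ScaiQueue's code: a dict stands in for the MariaDB rows, a heap for the Redis pending sorted set, and a dict of expiry times for the claim locks. The class and method names, the `dead_lettered` state label, and the choice to re-queue with the original `created_at` score (backoff omitted) are all assumptions.

```python
import heapq
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Message:
    id: str
    created_at: float
    state: str = "pending"
    attempts: int = 0
    failure_reason: Optional[str] = None


@dataclass
class QueueSim:
    """Hypothetical in-memory stand-in for one fifo queue."""
    visibility_timeout_s: float = 30.0
    max_retries: int = 3
    rows: dict = field(default_factory=dict)         # message rows
    pending: list = field(default_factory=list)      # (score, id) heap
    locks: dict = field(default_factory=dict)        # id -> lock expiry
    dead_letter: list = field(default_factory=list)  # ids moved to _dead_letter

    def publish(self, msg_id, now):
        self.rows[msg_id] = Message(msg_id, created_at=now)
        heapq.heappush(self.pending, (now, msg_id))  # plays the role of ZADD

    def claim(self, now):
        if not self.pending:
            return None
        _, msg_id = heapq.heappop(self.pending)      # plays the role of ZPOPMIN
        # Plays the role of SET NX EX: the lock expires after the timeout.
        self.locks[msg_id] = now + self.visibility_timeout_s
        msg = self.rows[msg_id]
        msg.state = "claimed"
        msg.attempts += 1
        return msg

    def complete(self, msg_id):
        self.locks.pop(msg_id, None)
        self.rows[msg_id].state = "completed"

    def fail(self, msg_id, reason):
        msg = self.rows[msg_id]
        self.locks.pop(msg_id, None)
        msg.failure_reason = reason
        if msg.attempts < self.max_retries:
            msg.state = "pending"                    # retry; backoff omitted
            heapq.heappush(self.pending, (msg.created_at, msg_id))
        else:
            msg.state = "dead_lettered"              # assumed state label
            self.dead_letter.append(msg_id)

    def enforce_visibility_timeout(self, now):
        """What the cron job does: return expired claims to pending."""
        for msg_id, expires in list(self.locks.items()):
            if expires <= now:
                del self.locks[msg_id]
                msg = self.rows[msg_id]
                msg.state = "pending"
                heapq.heappush(self.pending, (msg.created_at, msg_id))


q = QueueSim(visibility_timeout_s=30, max_retries=2)
q.publish("m1", now=0.0)
q.claim(now=1.0)                       # consumer crashes after claiming
q.enforce_visibility_timeout(now=31.0) # lock expired, message is pending again
q.claim(now=32.0)                      # second attempt
q.fail("m1", "boom")                   # attempts == max_retries: dead-lettered
```

In this run the message ends in the dead-letter list after two attempts, because the crash counted as an attempt even though the consumer never reported a failure.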

Ordering modes

The score in the Redis sorted set depends on the queue's ordering:

  • fifo: created_at as epoch seconds. Lowest is oldest, so ZPOPMIN returns the oldest first.
  • priority: -(tier*1000 + score), then created_at. Higher tier-plus-score wins; ties broken by age.
  • deadline: deadline as epoch seconds. Earliest deadline wins.
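A minimal sketch of the three scoring rules, written as a Python sort key. The `Msg` class and its field names are hypothetical, and representing the priority tie-break as the second element of a tuple is an assumption; the text above does not say how ScaiQueue folds created_at into a single Redis score.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Msg:
    id: str
    created_at: float            # epoch seconds
    tier: int = 0
    score: int = 0
    deadline: Optional[float] = None


def sort_key(msg, ordering):
    """Lowest key is claimed first, mirroring ZPOPMIN."""
    if ordering == "fifo":
        return (msg.created_at,)
    if ordering == "priority":
        # Negated so a higher tier-plus-score sorts lower; age breaks ties.
        return (-(msg.tier * 1000 + msg.score), msg.created_at)
    if ordering == "deadline":
        if msg.deadline is None:
            raise ValueError("deadline ordering needs a deadline")
        return (msg.deadline,)
    raise ValueError(f"unknown ordering mode: {ordering}")


msgs = [
    Msg("a", created_at=100.0, tier=1, score=5, deadline=500.0),
    Msg("b", created_at=110.0, tier=2, score=0, deadline=900.0),
    Msg("c", created_at=95.0, tier=1, score=5, deadline=700.0),
]

fifo_order = [m.id for m in sorted(msgs, key=lambda m: sort_key(m, "fifo"))]
priority_order = [m.id for m in sorted(msgs, key=lambda m: sort_key(m, "priority"))]
deadline_order = [m.id for m in sorted(msgs, key=lambda m: sort_key(m, "deadline"))]
```

Here fifo yields c, a, b; priority yields b, c, a (b has the higher tier, and c is older than a at equal tier and score); deadline yields a, c, b.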

The consumer_mode field is independent: competing (default — one consumer claims each message), broadcast (every subscriber sees a copy), or other modes that may exist on system queues.

State

  • Scopes, queues, messages, routing rules, schemas, HITL patterns, ACLs, API keys, audit log, system-agent registry — all in MariaDB tables prefixed mod_scaiqueue_.
  • Pending order, claim locks, visibility timeouts, subscription dedup sets — Redis keys under sq:{tenant}:{scope}:....
  • Archived messages — moved from mod_scaiqueue_messages to mod_scaiqueue_message_archive by the message_archiver agent for terminal messages older than one hour.

If Redis is unavailable, MessageService falls back to a DB-only claim path that scans pending rows in created_at order. Slower; still correct.
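One way such a DB-only claim could work is an optimistic scan-then-update, sketched here with SQLite standing in for MariaDB. The `messages` table, its columns, and the function name are simplified assumptions, not the real mod_scaiqueue_messages schema; a MariaDB implementation might instead rely on row locking such as SELECT ... FOR UPDATE SKIP LOCKED.

```python
import sqlite3


def claim_db_only(conn, consumer_id):
    """Claim the oldest pending row without Redis; return its id or None."""
    while True:
        row = conn.execute(
            "SELECT id FROM messages WHERE state = 'pending' "
            "ORDER BY created_at, id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The state guard makes the update a no-op if another consumer
        # claimed the same row between our SELECT and UPDATE.
        cur = conn.execute(
            "UPDATE messages SET state = 'claimed', claimed_by = ?, "
            "attempts = attempts + 1 WHERE id = ? AND state = 'pending'",
            (consumer_id, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]
        # Lost the race: loop and try the next oldest pending row.


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id TEXT PRIMARY KEY, state TEXT, "
    "created_at REAL, claimed_by TEXT, attempts INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO messages (id, state, created_at) VALUES (?, 'pending', ?)",
    [("m2", 20.0), ("m1", 10.0)],
)
conn.commit()

claimed = [claim_db_only(conn, "worker-a") for _ in range(3)]
```

The third call returns None because both rows are already claimed. The scan costs a query per claim, which is consistent with the fallback being slower than the sorted-set path.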

System agents

These run inside ScaiGrid's arq workers on a cron schedule. You don't start them; they appear in GET /system-agents.

| Agent | Schedule | Role |
|---|---|---|
| visibility_timeout_enforcer | every second | Reclaims expired claims, returns messages to pending. |
| expiry_enforcer | every minute | Marks TTL- and deadline-expired pending messages as terminal. |
| message_archiver | every 5 minutes | Moves terminal messages older than 1 hour to the archive table. |
| dead_letter_monitor | every minute | Emits structured warnings when _dead_letter queues grow. |
| priority_aging_worker | every 5 minutes | Bumps priority_score on long-pending messages so low-priority work doesn't starve. |
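As an illustration of the idea behind priority_aging_worker, the sketch below bumps the score of messages that have been pending longer than a threshold. The threshold, the bump size, and the cap of 999 (chosen so an aged score never crosses into the next tier under the tier*1000 + score formula) are all assumed values, and the class and function names are hypothetical; the actual agent's parameters are not documented here.

```python
from dataclasses import dataclass


@dataclass
class PendingMessage:
    id: str
    created_at: float      # epoch seconds
    tier: int
    priority_score: int


def age_priorities(pending, now, age_threshold_s=300.0, bump=10, max_score=999):
    """Raise priority_score on long-pending messages; return the ids bumped."""
    bumped = []
    for msg in pending:
        waited = now - msg.created_at
        if waited >= age_threshold_s and msg.priority_score < max_score:
            # Cap the score so aging cannot promote a message past its tier.
            msg.priority_score = min(max_score, msg.priority_score + bump)
            bumped.append(msg.id)
    return bumped


msgs = [
    PendingMessage("old", created_at=0.0, tier=0, priority_score=5),
    PendingMessage("new", created_at=590.0, tier=0, priority_score=5),
    PendingMessage("capped", created_at=0.0, tier=0, priority_score=995),
]
bumped_ids = age_priorities(msgs, now=600.0)
```

In a queue backed by a sorted set, each bumped message would also need its score rewritten there so the new priority affects claim order.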

How it differs from rolling your own

Building this on raw Redis Streams or a hosted broker gives you a topic and a delivery guarantee. ScaiQueue adds the typed payload, the routing engine, the HITL spec, the dead-letter queue, the audit log, and the system agents — all wired into ScaiGrid's auth, accounting, and admin UI. If you only need fire-and-forget pub/sub, you don't need ScaiQueue. If you need durable, retryable, observable work distribution with optional human steps, you do.

Updated 2026-05-18 15:01:32 (rev 12)