Changelog
User-visible changes only. Internal refactors and infrastructure work omitted.
v0.0.1 — Initial rollout
The current phase ships the speech-to-text surface end-to-end. Module identity, permissions, admin pages, and the lifecycle hooks are all wired; the API mounts at /v1/modules/scaiecho/.
- Batch transcription via POST /transcribe. Inline for short audio, async via the arq worker pool for long audio with GET /transcribe/jobs/{id} for polling.
- Real-time WebSocket streaming at WS /stream/transcribe. Bearer auth from query or header, binary audio in, JSON delta frames out.
- WebRTC streaming under /stream/transcribe/webrtc/. Session create, SDP offer/answer, ICE trickle, control WebSocket for deltas, session teardown.
- Speaker library: list, get, enroll, warm-inspect, proactive warm, update, delete. Enrollment captures reference and consent audio with quality preflight and fan-out to every online pyannote node.
- Optional speaker diarization on streaming routes via diarize=true. Silently no-op on Backend B.
- Per-tenant backend policy at /tenant-policy. Lazy-provisioned from tier defaults.
- MCP tool scaiecho.transcribe registered in the catalog.
- Speaker erasure with immutable audit row and replica eviction.
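
As a rough illustration of how the routes listed for this release compose, the sketch below builds full URLs under the module mount point and polls an async batch job. It assumes every route sits under /v1/modules/scaiecho/. The host name, the "status" field and its values, and the injected fetch callable are hypothetical; this changelog does not document response shapes.

```python
import time
from typing import Callable
from urllib.parse import urlencode

MOUNT = "/v1/modules/scaiecho"  # mount point stated in the changelog


def endpoint(base: str, path: str, **query: str) -> str:
    """Join a host, the module mount point, and a route path."""
    url = f"{base.rstrip('/')}{MOUNT}/{path.lstrip('/')}"
    return f"{url}?{urlencode(query)}" if query else url


def poll_job(
    fetch: Callable[[str], dict],
    base: str,
    job_id: str,
    interval: float = 1.0,
    max_attempts: int = 30,
) -> dict:
    """Poll GET /transcribe/jobs/{id} until the job is no longer pending.

    `fetch` is a hypothetical callable that performs the GET and returns
    parsed JSON. The "status" key and the "queued"/"running" values are
    assumptions, not documented behavior.
    """
    url = endpoint(base, f"transcribe/jobs/{job_id}")
    for _ in range(max_attempts):
        job = fetch(url)
        if job.get("status") not in ("queued", "running"):
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still pending after {max_attempts} polls")
```

For example, endpoint("wss://example.invalid", "stream/transcribe", diarize="true") yields a streaming URL with diarization requested. Passing fetch in as a parameter keeps the sketch independent of any particular HTTP client and of how the bearer token is attached.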