
Changelog

User-visible changes only. Internal refactors and infrastructure work omitted.

v1.0 — Launch

First generally-available release.

  • REST submission for seven training types: SFT, LORA, QLORA, DPO, RLHF, CONTINUED_PRETRAIN, FULL_FINETUNE.
  • Five framework targets: HF_TRAINER, DEEPSPEED, FSDP, MEGATRON, CUSTOM.
  • Full lifecycle: submit, cancel, pause, resume, retry (with optional resource modification).
  • Point-in-time job metrics plus Server-Sent Event streams for live logs and metric snapshots.
  • Cluster operations: status, node listing, node drain (with optional force) and enable, queue depth.
  • Evaluation submission and retrieval, with multiple named benchmarks per run.
  • Data operations: pre-flight validation of training sources, coordinator-side dataset cache inspection.
  • Local job-state cache (mod_scaimind_jobs) auto-synced on list calls for fast dashboard reads.
  • Downstream token forwarding for ScaiDrive and ScaiAtlas, scoped per request.
  • gRPC error sanitisation — INTERNAL and UNKNOWN coordinator details are replaced with friendly messages and captured in structured logs.
  • Admin UI: Training Dashboard, Job Creator, Training Monitor, Evaluation Center, Hardware Monitor.
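
The gRPC error sanitisation item above can be pictured with a minimal sketch. It is not the shipped implementation: the function name `sanitise_grpc_error`, the `FRIENDLY_MESSAGES` wording, the JSON log shape, and the pass-through behaviour for other status codes are all assumptions made for illustration. Status codes are represented as plain strings so the example runs without a gRPC dependency.

```python
import json
import logging

logger = logging.getLogger("sanitiser_sketch")

# Hypothetical friendly messages; the changelog does not give the real wording.
FRIENDLY_MESSAGES = {
    "INTERNAL": "The training coordinator hit an internal error. Please try again.",
    "UNKNOWN": "The training coordinator returned an unexpected error.",
}


def sanitise_grpc_error(code: str, details: str) -> str:
    """Return a message that is safe to show to a user.

    For INTERNAL and UNKNOWN, the raw coordinator details are written to a
    structured (JSON) log line and replaced with a friendly message.
    Other status codes pass their details through unchanged (an assumption).
    """
    friendly = FRIENDLY_MESSAGES.get(code)
    if friendly is None:
        return details
    logger.error(
        json.dumps(
            {"event": "grpc_error_sanitised", "code": code, "details": details}
        )
    )
    return friendly


if __name__ == "__main__":
    logging.basicConfig(level=logging.ERROR)
    print(sanitise_grpc_error("INTERNAL", "worker 3 crashed at step 1200"))
    print(sanitise_grpc_error("NOT_FOUND", "job 42 not found"))
```

In this sketch the raw details remain available to operators through the log record, while the caller only ever sees the replacement text for the two sanitised codes.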
Updated 2026-05-18 15:01:31, rev 12