gRPC API

ScaiGrid exposes a gRPC API alongside its HTTP surface. It is used mainly for internal integrations (ScaiInfer nodes reporting heartbeats, ScaiMind coordinating training jobs, and high-throughput streaming) and is available to operators who need tighter integration than the HTTP API provides.

Default port: 50051
Protocol: gRPC over HTTP/2, protobuf serialization

When to use gRPC

  • High-throughput streaming. Multi-GB inference outputs, long-lived connections, low per-call overhead.
  • Internal ScaiLabs components. ScaiInfer nodes, ScaiMind cluster nodes, worker agents — all connect via gRPC.
  • Low-latency RPC. Binary protobuf encoding and persistent connections carry less per-call overhead than the HTTP API on hot paths.

For typical application integration, use the HTTP API. gRPC adds setup overhead that isn't justified for most use cases.

Services exposed

ScaiInfer bridge (scaiinfer.v1)

Inference nodes register and stream to the gateway via gRPC:

  • InferenceService.StreamInference — low-latency streaming LLM calls
  • InferenceService.ListModels — query a node for its loaded models
  • HeartbeatService.Heartbeat — node health and capacity reports

Used by the scaiinfer backend dispatcher. Not typically called directly by application code.

ScaiMind coordination (scaimind.v1)

Calls flow in two directions, from MindCoordinator to the gateway and from the gateway to cluster nodes:

  • CoordinatorService.SubmitJob — create a training job
  • CoordinatorService.StreamMetrics — stream training metrics in real time
  • ClusterService.RegisterNode — node registration
  • ClusterService.AllocateGPUs — GPU scheduling

These operations are also available over REST at /v1/modules/scaimind/*. The REST layer is a thin bridge; the gRPC API is the authoritative one.

ScaiCore runtime (scaicore.v1)

Core runtime internals:

  • CoreRuntimeService.InvokeCore, PassivateCore, RestoreCore — instance lifecycle
  • CoreRuntimeService.StreamEvents — real-time event stream from a running core

Exposed for observability tools that need richer-than-REST event feeds.

Authentication

gRPC auth uses a service token passed as metadata:

```text
authorization: Bearer <service-token>
```

Service tokens are issued by ScaiKey and scoped to specific services. They're not interchangeable with ScaiGrid API keys.
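The metadata entry above can be built once and reused on every call. The following is a minimal sketch; `auth_metadata` is a hypothetical helper name, not part of any published ScaiGrid package, and the whitespace check is an illustrative sanity check rather than a documented token rule. gRPC metadata keys must be lowercase, which is why the keys are written that way.

```python
from typing import Optional, Tuple

Metadata = Tuple[Tuple[str, str], ...]


def auth_metadata(service_token: str, request_id: Optional[str] = None) -> Metadata:
    """Build per-call gRPC metadata carrying a ScaiKey service token.

    Hypothetical helper for illustration only.
    """
    # Illustrative sanity check: reject empty tokens and tokens containing whitespace.
    if not service_token or any(ch.isspace() for ch in service_token):
        raise ValueError("service token must be non-empty and contain no whitespace")
    pairs = [("authorization", f"Bearer {service_token}")]
    if request_id is not None:
        # Optional correlation ID, using the same key as the client example on this page.
        pairs.append(("x-request-id", request_id))
    return tuple(pairs)


# Usage with a generated stub (stub and request come from the published stubs):
#   stub.StreamInference(request, metadata=auth_metadata(token, "req_abc"))
print(auth_metadata("example-token", "req_abc"))
```

Returning a tuple of `(key, value)` pairs matches the form that the Python gRPC library accepts for the `metadata=` argument.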

Inside a cluster, mTLS is the recommended additional layer. See Deployment for mTLS setup.

Proto files

The proto definitions live in the ScaiGrid source tree at proto/. For external integrations, we publish generated stubs for common languages at a dedicated package registry (ask your ScaiGrid support contact for access).

Service packages are versioned: scaiinfer.v1, scaimind.v1, scaicore.v1. Backwards-incompatible changes introduce a new version (v2); v1 stays supported until all clients migrate.
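Because the version is part of the package name, it also appears in the fully qualified method path that gRPC puts on the wire (`/package.Service/Method`). The sketch below parses such a path so that logging or routing code can tell which version a client is calling. It assumes the proto package names are exactly the versioned names listed above; `parse_method` and `MethodName` are illustrative names, not ScaiGrid APIs.

```python
import re
from typing import NamedTuple


class MethodName(NamedTuple):
    package: str  # e.g. "scaiinfer"
    version: int  # e.g. 1
    service: str  # e.g. "InferenceService"
    method: str   # e.g. "StreamInference"


# Standard gRPC method path layout: /<package>.<Service>/<Method>,
# with the package assumed to end in a ".v<N>" version segment.
_PATTERN = re.compile(
    r"^/(?P<package>[a-z_][a-z0-9_]*)\.v(?P<version>\d+)"
    r"\.(?P<service>\w+)/(?P<method>\w+)$"
)


def parse_method(path: str) -> MethodName:
    """Split a versioned gRPC method path into its components."""
    match = _PATTERN.match(path)
    if match is None:
        raise ValueError(f"not a versioned gRPC method path: {path!r}")
    return MethodName(
        match["package"], int(match["version"]), match["service"], match["method"]
    )


name = parse_method("/scaiinfer.v1.InferenceService/StreamInference")
print(name.package, name.version, name.service, name.method)
```

A client-side interceptor or a log processor could use the parsed version to flag callers still on v1 once a v2 exists.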

Example: a minimal client

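```python
import asyncio
import os

import grpc
from scaiinfer.v1 import scaiinfer_pb2 as pb
from scaiinfer.v1.scaiinfer_pb2_grpc import InferenceServiceStub

# Service token issued by ScaiKey; the environment variable name is illustrative.
SERVICE_TOKEN = os.environ["SCAIGRID_SERVICE_TOKEN"]


async def main() -> None:
    # Insecure channel for local testing only; see the note on TLS below.
    async with grpc.aio.insecure_channel("scaigrid.scailabs.ai:50051") as channel:
        stub = InferenceServiceStub(channel)

        metadata = (
            ("authorization", f"Bearer {SERVICE_TOKEN}"),
            ("x-request-id", "req_abc"),
        )

        # The remaining request fields are omitted here.
        req = pb.InferenceRequest(model="scailabs/poolnoodle-omni")
        async for chunk in stub.StreamInference(req, metadata=metadata):
            print(chunk.delta.content, end="")


asyncio.run(main())
```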

For production, use TLS (grpc.aio.secure_channel with grpc.ssl_channel_credentials) and handle RPC failures, which the async client raises as grpc.aio.AioRpcError.
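As a hedged sketch of what that production variant might look like: the function below opens a TLS channel, optionally trusting a private CA, sets a per-call deadline, and surfaces the gRPC status on failure. The function names, the `ca_path` parameter, and the 60-second timeout are assumptions for illustration. The request object and stub come from the published stubs, as in the minimal client.

```python
from typing import Optional


def read_root_ca(path: Optional[str]) -> Optional[bytes]:
    """Return PEM bytes for a private CA, or None to use gRPC's default roots."""
    if path is None:
        return None
    with open(path, "rb") as fh:
        return fh.read()


async def stream_over_tls(
    target: str, token: str, request, ca_path: Optional[str] = None
) -> None:
    # Imported here so read_root_ca stays usable without grpcio installed.
    import grpc
    from scaiinfer.v1.scaiinfer_pb2_grpc import InferenceServiceStub

    creds = grpc.ssl_channel_credentials(root_certificates=read_root_ca(ca_path))
    async with grpc.aio.secure_channel(target, creds) as channel:
        stub = InferenceServiceStub(channel)
        metadata = (("authorization", f"Bearer {token}"),)
        try:
            # timeout is an illustrative per-call deadline, in seconds.
            async for chunk in stub.StreamInference(
                request, metadata=metadata, timeout=60
            ):
                print(chunk.delta.content, end="")
        except grpc.aio.AioRpcError as err:
            # code() is a grpc.StatusCode; details() is the server's status message.
            print(f"\nRPC failed: {err.code().name}: {err.details()}")
            raise
```

It would be driven with `asyncio.run(stream_over_tls("scaigrid.scailabs.ai:50051", token, req))`. Re-raising after logging leaves the retry decision to the caller.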

Standard gRPC errors

ScaiGrid maps its error codes to standard gRPC status codes:

| gRPC status | ScaiGrid codes |
| --- | --- |
| UNAUTHENTICATED | Auth errors |
| PERMISSION_DENIED | Permission/budget errors |
| NOT_FOUND | 404 codes |
| INVALID_ARGUMENT | Validation errors |
| RESOURCE_EXHAUSTED | Rate-limited / quota-exhausted |
| UNAVAILABLE | Backend unavailable, service draining |
| DEADLINE_EXCEEDED | Timeout |
| INTERNAL | Unexpected server error |

The original ScaiGrid error code is attached as metadata (scaigrid-error-code), together with the error message, for programmatic handling.
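A client can combine the gRPC status with the `scaigrid-error-code` entry to decide what to do next. The sketch below is one possible policy, not ScaiGrid's recommendation: the split into retryable and non-retryable statuses, the backoff parameters, and all function names are assumptions. The source does not say whether the entry arrives in initial or trailing metadata, so the lookup simply takes any iterable of `(key, value)` pairs.

```python
from typing import Iterable, List, Optional, Tuple

# Status names from the mapping above. Treating these three as retryable
# is an assumption, not documented ScaiGrid behaviour.
RETRYABLE = {"UNAVAILABLE", "DEADLINE_EXCEEDED", "RESOURCE_EXHAUSTED"}


def scaigrid_error_code(metadata: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the value of the scaigrid-error-code metadata entry, if present."""
    for key, value in metadata:
        if key.lower() == "scaigrid-error-code":
            return value
    return None


def should_retry(status_name: str) -> bool:
    """True when the gRPC status suggests a transient condition."""
    return status_name in RETRYABLE


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0) -> List[float]:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


# With the Python gRPC library, the inputs would come from a caught error, e.g.
#   except grpc.aio.AioRpcError as err:
#       status = err.code().name
#       code = scaigrid_error_code(err.trailing_metadata())
# "example_code" below is a placeholder, not a real ScaiGrid error code.
print(should_retry("UNAVAILABLE"), scaigrid_error_code([("scaigrid-error-code", "example_code")]))
print(backoff_delays(5))
```

Keying retries on the gRPC status and using the ScaiGrid code only for logging or finer-grained branching keeps the client working even if new ScaiGrid codes are introduced.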

Not everything is on gRPC

Not every REST endpoint has a gRPC equivalent. gRPC coverage is focused on:

  • Inference (chat, embeddings, streaming)
  • ScaiInfer / ScaiMind / ScaiCore internal coordination
  • Event streaming

Admin operations, accounting queries, and user management stay on HTTP.

Related

  • MCP Server: another binary-transport protocol for agent integration
  • Inference Reference: HTTP equivalents
  • Internal proto specs in the ScaiGrid source tree
Updated 2026-05-18 15:01:28, rev 17