---
title: gRPC API
path: advanced/grpc-api
status: published
---

# gRPC API

ScaiGrid exposes a gRPC API alongside its HTTP surface. It serves internal integrations (ScaiInfer nodes reporting heartbeats, ScaiMind coordinating training jobs, high-throughput streaming) and is available to operators who need tighter integration than HTTP provides.

**Default port:** `50051`

**Protocol:** gRPC over HTTP/2, protobuf serialization

## When to use gRPC

- **High-throughput streaming.** Multi-GB inference outputs, long-lived connections, low per-call overhead.
- **Internal ScaiLabs components.** ScaiInfer nodes, ScaiMind cluster nodes, worker agents — all connect via gRPC.
- **Low-latency RPC.** Binary protobuf encoding and persistent HTTP/2 connections keep per-call overhead below that of the HTTP API on hot paths.

For typical application integration, **use the HTTP API**. gRPC adds setup overhead that isn't justified for most use cases.

## Services exposed

### ScaiInfer bridge (`scaiinfer.v1`)

Inference nodes register and stream to the gateway via gRPC:

- `InferenceService.StreamInference` — low-latency streaming LLM calls
- `InferenceService.ListModels` — query a node for its loaded models
- `HeartbeatService.Heartbeat` — node health and capacity reports

Used by the `scaiinfer` backend dispatcher. Not typically called directly by application code.

### ScaiMind coordination (`scaimind.v1`)

Carries traffic from MindCoordinator to the gateway, and from the gateway to cluster nodes:

- `CoordinatorService.SubmitJob` — create a training job
- `CoordinatorService.StreamMetrics` — stream training metrics in real time
- `ClusterService.RegisterNode` — node registration
- `ClusterService.AllocateGPUs` — GPU scheduling

The same operations are available over REST at `/v1/modules/scaimind/*`. The REST layer is a thin bridge; the gRPC API is the authoritative one.

### ScaiCore runtime (`scaicore.v1`)

Core runtime internals:

- `CoreRuntimeService.InvokeCore`, `PassivateCore`, `RestoreCore` — instance lifecycle
- `CoreRuntimeService.StreamEvents` — real-time event stream from a running core

Exposed for observability tools that need richer event feeds than the REST API provides.

## Authentication

gRPC auth uses a service token passed as metadata:

```
authorization: Bearer <service-token>
```

Service tokens are issued by ScaiKey and scoped to specific services. They're not interchangeable with ScaiGrid API keys.
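
As a minimal sketch of how a client might assemble that metadata, the helper below builds the list of key/value pairs that grpc-python accepts through its `metadata=` argument. The function name `build_call_metadata` and the generated request-id format are assumptions for illustration, not part of the ScaiGrid API; the `x-request-id` key mirrors the client example further down this page.

```python
import uuid


def build_call_metadata(service_token, request_id=None):
    """Build per-call gRPC metadata carrying a ScaiKey service token.

    Hypothetical helper: the name and the request-id format are illustrative.
    """
    if not service_token:
        raise ValueError("service token is empty")
    # Header values must be a single line; reject tokens with embedded newlines.
    if "\r" in service_token or "\n" in service_token:
        raise ValueError("service token contains a line break")
    return [
        # gRPC metadata keys must be lowercase.
        ("authorization", f"Bearer {service_token}"),
        # Generate a request id when the caller does not supply one.
        ("x-request-id", request_id or f"req_{uuid.uuid4().hex}"),
    ]
```

The returned list can be passed unchanged as `metadata=` on any stub call, so the token handling lives in one place rather than being repeated per call site.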

Inside a cluster, mTLS is the recommended additional layer. See [Deployment](../08-operations/01-deployment.md) for mTLS setup.

## Proto files

The proto definitions live in the ScaiGrid source tree at `proto/`. For external integrations, we publish generated stubs for common languages in a dedicated package registry (ask your ScaiGrid support contact for access).

Service packages are versioned (`scaiinfer.v1`, `scaimind.v1`, `scaicore.v1`). A backwards-incompatible change introduces a new version (`v2`); `v1` stays supported until all clients migrate.
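
Because the version is part of the package name, it also appears in the method path that gRPC sends on the wire (`/package.Service/Method`). The sketch below splits such a path into its parts, which can be handy when logging or checking which version a client is calling. It assumes the proto packages are named exactly `scaiinfer.v1`, `scaimind.v1`, and so on; `parse_method_path` is a hypothetical helper, not a ScaiGrid function.

```python
import re

# Matches paths of the form /<package>.v<N>.<Service>/<Method>.
# Assumes package names such as "scaiinfer.v1" (an assumption about the protos).
_METHOD_PATH = re.compile(
    r"^/(?P<package>[a-z0-9_]+)\.v(?P<version>\d+)\.(?P<service>\w+)/(?P<method>\w+)$"
)


def parse_method_path(path):
    """Split a gRPC method path into (package, version, service, method)."""
    match = _METHOD_PATH.match(path)
    if match is None:
        raise ValueError(f"not a versioned gRPC method path: {path!r}")
    return (
        match["package"],
        int(match["version"]),
        match["service"],
        match["method"],
    )
```

For example, `parse_method_path("/scaiinfer.v1.InferenceService/StreamInference")` yields the package `scaiinfer`, version `1`, service `InferenceService`, and method `StreamInference`.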

## Example: a minimal client

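```python
import asyncio
import os

import grpc
from scaiinfer.v1 import scaiinfer_pb2 as pb
from scaiinfer.v1.scaiinfer_pb2_grpc import InferenceServiceStub

# Token issued by ScaiKey; the environment variable name is illustrative.
SERVICE_TOKEN = os.environ["SCAIGRID_SERVICE_TOKEN"]


async def main() -> None:
    # Plaintext channel, closed automatically when the block exits.
    async with grpc.aio.insecure_channel("scaigrid.scailabs.ai:50051") as channel:
        stub = InferenceServiceStub(channel)
        metadata = [
            ("authorization", f"Bearer {SERVICE_TOKEN}"),
            ("x-request-id", "req_abc"),
        ]
        req = pb.InferenceRequest(
            model="scailabs/poolnoodle-omni",
            # ... remaining request fields
        )
        # Server-streaming call: chunks are printed as they arrive.
        async for chunk in stub.StreamInference(req, metadata=metadata):
            print(chunk.delta.content, end="")


asyncio.run(main())
```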

For production, use TLS (`grpc.aio.secure_channel` with channel credentials) and add proper error handling, for example by catching `grpc.aio.AioRpcError` around the call.

## Standard gRPC errors

ScaiGrid maps its error codes to standard gRPC status codes:

| gRPC Status | ScaiGrid codes |
|-------------|----------------|
| `UNAUTHENTICATED` | Auth errors |
| `PERMISSION_DENIED` | Permission/budget errors |
| `NOT_FOUND` | 404 codes |
| `INVALID_ARGUMENT` | Validation errors |
| `RESOURCE_EXHAUSTED` | Rate-limited / quota-exhausted |
| `UNAVAILABLE` | Backend unavailable, service draining |
| `DEADLINE_EXCEEDED` | Timeout |
| `INTERNAL` | Unexpected server error |

The original ScaiGrid error code is attached as metadata under the `scaigrid-error-code` key, alongside the status message, so clients can handle it programmatically.
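
The sketch below shows one way a client might act on this mapping: it reads `scaigrid-error-code` out of a sequence of metadata pairs and decides whether a status is worth retrying. The retry set, attempt limit, and backoff parameters are assumptions chosen for illustration, not ScaiGrid recommendations, and the helper names are hypothetical.

```python
import random

# Statuses treated as transient here; this policy is an assumption.
RETRYABLE_STATUSES = {"UNAVAILABLE", "DEADLINE_EXCEEDED", "RESOURCE_EXHAUSTED"}


def scaigrid_error_code(metadata):
    """Return the `scaigrid-error-code` value from (key, value) pairs, or None."""
    for key, value in metadata or ():
        if key.lower() == "scaigrid-error-code":
            # Metadata values may arrive as bytes or str; normalize to str.
            return value.decode() if isinstance(value, bytes) else value
    return None


def should_retry(status_name, attempt, max_attempts=5):
    """Retry only transient statuses, and only while attempts remain."""
    return status_name in RETRYABLE_STATUSES and attempt < max_attempts


def backoff_seconds(attempt, base=0.5, cap=30.0):
    """Exponential backoff with full jitter, capped at `cap` seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

With grpc-python's asyncio API, a failed call raises `grpc.aio.AioRpcError`; its `code().name` gives the status name and `trailing_metadata()` yields key/value pairs that can be passed to `scaigrid_error_code`. Whether ScaiGrid sends the code in trailing rather than initial metadata is an assumption to verify against your deployment.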

## Not everything is on gRPC

Not every REST endpoint has a gRPC equivalent. gRPC coverage is focused on:

- Inference (chat, embeddings, streaming)
- ScaiInfer / ScaiMind / ScaiCore internal coordination
- Event streaming

Admin operations, accounting queries, user management — these stay on HTTP.

## Related

- [MCP Server](./02-mcp-server.md): another non-REST protocol, aimed at agent integration
- [Inference Reference](../06-reference/05-inference.md): HTTP equivalents
- Internal proto definitions in the ScaiGrid source tree under `proto/`