---
summary: "List voices, render audio in your shell, save it to a file \u2014 five minutes\
  \ end-to-end."
title: Quickstart
path: quickstart
status: published
---

In five minutes you'll list available voices, render audio in a global voice, and save the result to disk.

You need:

- A ScaiGrid API key with `scaispeak:voice.read` and `scaispeak:synthesize` (any tenant admin has both).
- An audio player on your machine (`afplay`, `play`, `ffplay`, anything that plays MP3).

```bash
export SCAIGRID_HOST="https://scaigrid.scailabs.ai"
export SCAIGRID_API_KEY="sgk_..."
```

## 1. List voices

Every tenant sees the platform's global voices automatically. Pick one with `embedding_status: ready`.

```bash
curl "$SCAIGRID_HOST/v1/modules/scaispeak/voices?language=en&embedding_status=ready&limit=5" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY"
```

```python
import httpx, os
voices = httpx.get(
    f"{os.environ['SCAIGRID_HOST']}/v1/modules/scaispeak/voices",
    headers={"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"},
    params={"language": "en", "embedding_status": "ready", "limit": 5},
).json()["data"]["items"]
for v in voices:
    print(v["voice_id"], v["display_name"], v["scope"])
```

```javascript
const res = await fetch(
  `${process.env.SCAIGRID_HOST}/v1/modules/scaispeak/voices?language=en&embedding_status=ready&limit=5`,
  { headers: { "Authorization": `Bearer ${process.env.SCAIGRID_API_KEY}` } },
);
const { data } = await res.json();
data.items.forEach(v => console.log(v.voice_id, v.display_name, v.scope));
```

Save one `voice_id` and export it as `VOICE_ID` (for example `export VOICE_ID="<voice_id>"`); the calls below read it from the environment.
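
If you would rather choose the voice in code than by eye, a small filter over the `items` array is enough. The sketch below is a hypothetical helper, not part of any SDK. It assumes global voices carry `scope == "global"` (the listing prints `scope`, but its exact values are not shown on this page) and falls back to the first ready voice otherwise.

```python
def pick_voice(items, preferred_scope="global"):
    """Return the voice_id of the first ready voice, preferring one scope.

    `items` is the list found under data.items in the /voices response.
    The scope value "global" is an assumption; adjust it to whatever
    your listing actually returns.
    """
    ready = [v for v in items if v.get("embedding_status") == "ready"]
    if not ready:
        raise LookupError("no voice with embedding_status == 'ready'")
    for voice in ready:
        if voice.get("scope") == preferred_scope:
            return voice["voice_id"]
    # No ready voice in the preferred scope: take any ready one.
    return ready[0]["voice_id"]


# Illustrative records only; the IDs and the "pending" status are made up.
sample = [
    {"voice_id": "v_pending", "scope": "global", "embedding_status": "pending"},
    {"voice_id": "v_tenant", "scope": "tenant", "embedding_status": "ready"},
    {"voice_id": "v_global", "scope": "global", "embedding_status": "ready"},
]
print(pick_voice(sample))  # prints v_global
```

Pass the `voices` list from the Python listing above to `pick_voice` and export the result as `VOICE_ID`.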

## 2. Render a preview

Every voice has a built-in preview endpoint that renders a short sample. It is cheap, capped at 300 characters, and useful for picking a voice from the library.

```bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaispeak/voices/$VOICE_ID/preview" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -F "text=Hello from ScaiSpeak. This is the preview." \
  -F "response_format=mp3" \
  | python -c "import sys,json,base64;\
b=json.load(sys.stdin)['data']['audio_base64'];\
open('preview.mp3','wb').write(base64.b64decode(b))"
```

Play `preview.mp3`. If it sounds wrong, pick a different `voice_id` and repeat.
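
The inline `python -c` decoder in the curl pipeline can also live in a small reusable module. The sketch below uses only the standard library and relies on the `data.audio_base64` field the one-liner already reads. The helper names are hypothetical, and the client-side length check is an assumption: this page does not say how the server reacts to text over the 300-character cap.

```python
import base64

PREVIEW_CHAR_LIMIT = 300  # the preview cap stated above


def check_preview_text(text, limit=PREVIEW_CHAR_LIMIT):
    """Raise before sending if the text would exceed the preview cap."""
    if len(text) > limit:
        raise ValueError(f"preview text is {len(text)} chars; the cap is {limit}")
    return text


def decode_audio(envelope):
    """Extract raw audio bytes from a parsed JSON response envelope."""
    return base64.b64decode(envelope["data"]["audio_base64"])


def save_audio(envelope, path):
    """Write the decoded audio to `path` and return the number of bytes written."""
    audio = decode_audio(envelope)
    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)


# Offline demonstration with a stand-in payload (not real audio).
demo = {"data": {"audio_base64": base64.b64encode(b"not real audio").decode()}}
print(decode_audio(demo))
```

In the curl pipeline, `save_audio(json.load(sys.stdin), "preview.mp3")` would do the same work as the one-liner.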

## 3. Synthesise full text

`POST /speak` is the production endpoint. Short text returns audio inline; text longer than the threshold (500 characters by default) is handed off to an async job.

```bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaispeak/speak" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "voice_id": "'$VOICE_ID'",
    "text": "Speech synthesis in ScaiSpeak is metered, routed, and recorded the same way any other inference call is.",
    "response_format": "mp3"
  }'
```

```python
import base64
import os

import httpx

resp = httpx.post(
    f"{os.environ['SCAIGRID_HOST']}/v1/modules/scaispeak/speak",
    headers={"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"},
    json={
        "voice_id": os.environ["VOICE_ID"],
        "text": "Speech synthesis in ScaiSpeak is metered, routed, and recorded.",
        "response_format": "mp3",
    },
)
resp.raise_for_status()  # surface 4xx/5xx here instead of as a KeyError below
data = resp.json()["data"]
audio = base64.b64decode(data["audio_base64"])
with open("synth.mp3", "wb") as f:  # the context manager closes the file handle
    f.write(audio)
print(data["backend_used"], data["char_count"], "chars")
```

```javascript
// ES module syntax (.mjs or "type": "module"): top-level await needs it, and require() is unavailable there.
import { writeFileSync } from "node:fs";

const out = await fetch(`${process.env.SCAIGRID_HOST}/v1/modules/scaispeak/speak`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.SCAIGRID_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ voice_id: process.env.VOICE_ID, text: "...", response_format: "mp3" }),
});
const { data } = await out.json();
writeFileSync("synth.mp3", Buffer.from(data.audio_base64, "base64"));
console.log(data.backend_used, data.char_count);
```

You should see `backend_used: "A"` if your tenant has a self-hosted TTS node online, or `backend_used: "B"` if you're routed to the managed TTS relay.

## 4. Long-form (async path)

A request whose text is longer than the threshold (or that sets `force_async: true`) returns `202 Accepted` with a `job_id`. Poll `GET /speak/jobs/{job_id}` until `status` is `completed`; the completed response carries `audio_base64` inline (small outputs) or an S3 URI (larger ones).
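
A polling loop for the async path might look like the sketch below. It uses only the standard library and rests on several assumptions not confirmed on this page: that the job endpoint lives under the same `/v1/modules/scaispeak` prefix as the other calls, that the job record is wrapped in the usual `data` envelope, and that a failed job reports `status: "failed"`. `get_job` and `wait_for_job` are hypothetical names.

```python
import json
import os
import time
import urllib.request

BASE = "/v1/modules/scaispeak"  # assumed prefix, same as the other endpoints


def get_job(job_id):
    """Fetch one job record; the path and `data` envelope are assumptions."""
    req = urllib.request.Request(
        f"{os.environ['SCAIGRID_HOST']}{BASE}/speak/jobs/{job_id}",
        headers={"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["data"]


def wait_for_job(job_id, fetch=get_job, interval=2.0, timeout=300.0,
                 failed_statuses=("failed",), sleep=time.sleep):
    """Poll until status == "completed"; raise on failure or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status in failed_statuses:  # assumed terminal state
            raise RuntimeError(f"job {job_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        sleep(interval)


# Offline demonstration with a stubbed fetch; the intermediate status
# names are placeholders, not documented values.
_states = iter([
    {"status": "queued"},
    {"status": "running"},
    {"status": "completed", "audio_base64": "aGVsbG8="},
])
job = wait_for_job("job_demo", fetch=lambda _id: next(_states), interval=0)
print(job["status"])  # prints completed
```

Against the live service, call `wait_for_job(job_id)` with the `job_id` from the `202` response. Check for `audio_base64` with `job.get(...)` before decoding, since larger outputs carry an S3 URI instead.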

## 5. Save directly to ScaiDrive

If you're authenticated with a JWT (not an `sgk_` API key), the synth output can land straight in a ScaiDrive share with `save_to`:

```bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaispeak/speak" \
  -H "Authorization: Bearer $JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "voice_id": "'$VOICE_ID'",
    "text": "Audio that lands in your share with no second round-trip.",
    "save_to": { "share_id": "shr_xyz", "filename": "chapter-01.mp3" },
    "inline_response": false
  }'
```

The response carries the new `file_id`, `name`, and `version_id`.
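
The same request can be sketched in Python with the standard library. `build_save_payload` and `speak_to_drive` are hypothetical helpers that mirror the curl body above. Two assumptions are marked in the comments: that `file_id`, `name`, and `version_id` sit under the same `data` envelope as the other responses, and that an `sgk_` key should be refused on the client before the call is made.

```python
import json
import os
import urllib.request


def build_save_payload(voice_id, text, share_id, filename):
    """Build the JSON body used in the curl example above."""
    return {
        "voice_id": voice_id,
        "text": text,
        "save_to": {"share_id": share_id, "filename": filename},
        "inline_response": False,  # as in the curl example
    }


def speak_to_drive(jwt, payload):
    """POST the payload to /speak and return the new file's identifiers."""
    if jwt.startswith("sgk_"):
        # Client-side guard (assumption): save_to needs a JWT, not an API key.
        raise ValueError("save_to requires a JWT, not an sgk_ API key")
    req = urllib.request.Request(
        f"{os.environ['SCAIGRID_HOST']}/v1/modules/scaispeak/speak",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {jwt}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    data = body["data"]  # assumed envelope, as in the other responses
    return {key: data[key] for key in ("file_id", "name", "version_id")}


# Build and inspect the body without touching the network.
payload = build_save_payload("voice_demo", "Chapter one.", "shr_xyz", "chapter-01.mp3")
print(json.dumps(payload, indent=2))
```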

## What just happened

- `/voices` returned the visible voice library — global plus your tenant's plus your user's.
- `/voices/{id}/preview` rendered a short clip through the same dispatcher the production endpoint uses.
- `/speak` picked a backend (self-hosted A or relay B) per tenant policy, dispatched the synth, and either returned the audio inline or queued a job.
- Every call was metered by ScaiGrid's accounting pipeline against your tenant's budget.

## Next

- Clone a voice from your own recording — see [clone and synthesise](./tutorials/clone-and-synthesise).
- Wire low-latency streaming for an interactive product — see [stream with WebSocket](./tutorials/stream-with-websocket).
- Configure your tenant's backend policy — see [Architecture](./concepts/architecture).
