Quickstart

In five minutes you'll have a published persona answering questions through ScaiGrid's standard /v1/inference/chat endpoint, with RAG retrieval from a ScaiMatrix collection.

You need:

A ScaiGrid API key with scaipersona:manage (any tenant admin has this).
A ScaiMatrix collection id you can read from (or skip step 2 if you just want a no-RAG persona).
The slug of an existing frontend model to wrap (for example scailabs/poolnoodle-omni).

bash
export SCAIGRID_HOST="https://scaigrid.scailabs.ai"
export SCAIGRID_API_KEY="sgk_..."
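
The examples in this guide read these two variables repeatedly. As a minimal sketch, a small helper can load them once and fail early if either is missing. The names `load_config` and `auth_headers` are hypothetical and not part of any ScaiGrid SDK.

```python
import os


def load_config(env=os.environ):
    """Read the two quickstart variables and fail early if either is missing."""
    missing = [k for k in ("SCAIGRID_HOST", "SCAIGRID_API_KEY") if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    # Strip a trailing slash so f"{host}/v1/..." never produces a double slash.
    return {
        "host": env["SCAIGRID_HOST"].rstrip("/"),
        "api_key": env["SCAIGRID_API_KEY"],
    }


def auth_headers(config):
    """Bearer-token headers, matching the curl examples in this guide."""
    return {
        "Authorization": f"Bearer {config['api_key']}",
        "Content-Type": "application/json",
    }
```

Calling `load_config()` with no arguments reads from the real environment; passing a plain dict instead makes the helper easy to exercise in tests.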

1. Create the persona

bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Quickstart Persona",
    "slug": "quickstart",
    "model_slug": "scailabs/poolnoodle-omni",
    "system_prompt": "You are a concise product assistant. Answer only from the provided context.",
    "rag_enabled": true,
    "rag_strategy": "single_step",
    "rag_top_k": 5
  }'

python
import os

import httpx
persona = httpx.post(
    f"{os.environ['SCAIGRID_HOST']}/v1/modules/scaipersona/personas",
    headers={"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"},
    json={
        "name": "Quickstart Persona",
        "slug": "quickstart",
        "model_slug": "scailabs/poolnoodle-omni",
        "system_prompt": "You are a concise product assistant. Answer only from the provided context.",
        "rag_enabled": True,
        "rag_strategy": "single_step",
        "rag_top_k": 5,
    },
).json()["data"]
print(persona["id"])

javascript
const res = await fetch(`${process.env.SCAIGRID_HOST}/v1/modules/scaipersona/personas`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.SCAIGRID_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "Quickstart Persona",
    slug: "quickstart",
    model_slug: "scailabs/poolnoodle-omni",
    system_prompt: "You are a concise product assistant. Answer only from the provided context.",
    rag_enabled: true,
    rag_strategy: "single_step",
    rag_top_k: 5,
  }),
});
const { data: persona } = await res.json();
console.log(persona.id);

Save the returned persona.id; the curl examples in the following steps expect it in the PERSONA_ID environment variable. The response also returns the persona's slug. If you don't pass one, it defaults to a lowercased, hyphenated version of the name.
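
To illustrate the default slug described above, here is a rough sketch of a lowercase-and-hyphenate rule. The function `default_slug` is hypothetical, and the exact server-side rules (for example, how punctuation or non-ASCII characters are handled) are an assumption. Treat the slug in the API response as authoritative.

```python
import re


def default_slug(name: str) -> str:
    """Approximate a 'lowercased, hyphenated' slug (assumed rules, not the server's)."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower())
    # Drop hyphens left over at either end.
    return slug.strip("-")


print(default_slug("Quickstart Persona"))  # quickstart-persona
```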

2. Attach a knowledge source

bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID/sources" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_type": "collection",
    "source_id": "col_handbook",
    "source_name": "Product Handbook",
    "weight": 1.0
  }'

python
src = httpx.post(
    f"{os.environ['SCAIGRID_HOST']}/v1/modules/scaipersona/personas/{persona['id']}/sources",
    headers={"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"},
    json={
        "source_type": "collection",
        "source_id": "col_handbook",
        "source_name": "Product Handbook",
        "weight": 1.0,
    },
).json()["data"]

javascript
const src = await fetch(
  `${process.env.SCAIGRID_HOST}/v1/modules/scaipersona/personas/${persona.id}/sources`,
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.SCAIGRID_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      source_type: "collection",
      source_id: "col_handbook",
      source_name: "Product Handbook",
      weight: 1.0,
    }),
  },
).then((r) => r.json());

For a ScaiDrive share, use "source_type": "scaidrive" and pass the share id as source_id.
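
The two source types mentioned here differ only in `source_type` and the id you pass. Below is a small sketch that builds either request body. `source_payload` is a hypothetical helper, `share_123` is a placeholder id, and the sketch assumes the remaining fields (`source_name`, `weight`) are the same for both types.

```python
# The two source types this guide mentions; the API may accept others.
KNOWN_SOURCE_TYPES = {"collection", "scaidrive"}


def source_payload(source_type, source_id, source_name, weight=1.0):
    """Build the JSON body for POST .../personas/{id}/sources."""
    if source_type not in KNOWN_SOURCE_TYPES:
        raise ValueError(f"unsupported source_type: {source_type!r}")
    return {
        "source_type": source_type,
        "source_id": source_id,
        "source_name": source_name,
        "weight": weight,
    }


# A ScaiMatrix collection, as in the examples above.
handbook = source_payload("collection", "col_handbook", "Product Handbook")
# A ScaiDrive share; "share_123" is a placeholder, not a real share id.
drive = source_payload("scaidrive", "share_123", "Shared Docs")
```

Either dict can be passed as the `json=` argument of the `httpx.post` call shown in this step.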

3. Publish the persona

bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID/publish" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{}'

The response carries published: true, published_model_id, and the persona's tenant-qualified slug. The published frontend model's slug is tenant/{tenant_slug}/{persona_slug}. Note it down: callers use it to target the persona.
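
This step has only a curl example, so here is a hedged Python sketch of the same call, together with a helper that composes the published slug from the pattern above. Both function names are hypothetical. The sketch uses only the standard library so it runs without extra dependencies; the httpx style used elsewhere in this guide works equally well.

```python
import json
import urllib.request


def published_model_slug(tenant_slug: str, persona_slug: str) -> str:
    """Compose the slug callers pass as "model" after publishing."""
    return f"tenant/{tenant_slug}/{persona_slug}"


def build_publish_request(host: str, api_key: str, persona_id: str) -> urllib.request.Request:
    """Prepare (but do not send) the publish call shown in the curl example."""
    return urllib.request.Request(
        url=f"{host}/v1/modules/scaipersona/personas/{persona_id}/publish",
        data=json.dumps({}).encode("utf-8"),  # empty JSON body, as in the curl call
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


print(published_model_slug("acme", "quickstart"))  # tenant/acme/quickstart
```

To actually publish, pass the prepared request to `urllib.request.urlopen`; it is left out here so the sketch has no side effects.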

4. Invoke the persona

A published persona is just a frontend model. Call it through the standard inference endpoint:

bash
curl -X POST "$SCAIGRID_HOST/v1/inference/chat" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tenant/acme/quickstart",
    "messages": [
      {"role": "user", "content": "What are our refund terms?"}
    ]
  }'

python
resp = httpx.post(
    f"{os.environ['SCAIGRID_HOST']}/v1/inference/chat",
    headers={"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"},
    json={
        "model": "tenant/acme/quickstart",
        "messages": [{"role": "user", "content": "What are our refund terms?"}],
    },
).json()["data"]
print(resp["content"])

javascript
const reply = await fetch(`${process.env.SCAIGRID_HOST}/v1/inference/chat`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.SCAIGRID_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "tenant/acme/quickstart",
    messages: [{ role: "user", content: "What are our refund terms?" }],
  }),
}).then((r) => r.json());
console.log(reply.data.content);

Replace acme with your tenant slug. The response is grounded in the collection you attached; the persona's system prompt sets the voice.

5. Iterate

Update the persona's prompt or RAG settings at any time:

bash
curl -X PUT "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "system_prompt": "You are a friendly product assistant. Cite the section name when you quote." }'

If the persona is already published, the published frontend model is synced automatically, so the very next call uses the new prompt.
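
The update call can be sketched in Python as well. `build_update_request` is a hypothetical helper; it assumes, as the curl example suggests, that PUT accepts a partial body containing only the fields you change. It uses the standard library only and does not send anything.

```python
import json
import urllib.request


def build_update_request(host, api_key, persona_id, **changes):
    """Prepare a PUT carrying only the fields being changed."""
    if not changes:
        raise ValueError("pass at least one field to update")
    return urllib.request.Request(
        url=f"{host}/v1/modules/scaipersona/personas/{persona_id}",
        data=json.dumps(changes).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )


req = build_update_request(
    "https://scaigrid.scailabs.ai",
    "sgk_...",
    "PERSONA_ID",  # placeholder; use the id saved in step 1
    system_prompt="You are a friendly product assistant. Cite the section name when you quote.",
)
# urllib.request.urlopen(req) would send it.
```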

What just happened

The persona is a row in ScaiGrid's database with config, system prompt, and RAG settings.
Publishing created a FrontendModel row that mirrors the underlying model's capabilities and backend links, with the persona's system prompt baked in.
The inference call hit the standard pipeline. ScaiPersona's request enricher noticed the persona id on the frontend model's metadata, retrieved chunks from the attached collection, and injected them into the system prompt before dispatch.
The whole turn was metered against your tenant budget the same as any other model call.
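
The enrichment step described above can be pictured roughly as follows. This is an illustrative sketch only, not ScaiPersona's actual code: the function `enrich_system_prompt`, the chunk format, and the prompt layout are all assumptions, and the sample chunks are made up.

```python
def enrich_system_prompt(system_prompt: str, chunks: list[str], top_k: int = 5) -> str:
    """Append up to top_k retrieved chunks to the persona's system prompt."""
    selected = chunks[:top_k]  # assumes chunks arrive ranked best-first
    if not selected:
        return system_prompt  # nothing retrieved: leave the prompt untouched
    # Number each chunk so the model can refer back to it.
    context = "\n".join(f"[{i}] {text}" for i, text in enumerate(selected, start=1))
    return f"{system_prompt}\n\nContext:\n{context}"


prompt = enrich_system_prompt(
    "You are a concise product assistant. Answer only from the provided context.",
    # Made-up sample chunks standing in for retrieval results.
    ["Refunds are accepted within 30 days.", "Contact support to start a refund."],
)
print(prompt)
```

The `top_k` default mirrors the `rag_top_k: 5` setting from step 1; how the real enricher ranks, truncates, or formats chunks is not specified in this guide.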

Next

Read Architecture for the request flow in detail.
See RAG strategies to choose between single_step, multi_step, and agentic.
Walk through Build a knowledge-grounded persona for the full multi-source setup.