Build a knowledge-grounded persona

You're going from zero to a deployed persona that:

answers from two ScaiMatrix collections with different weights,
has a branded avatar,
is published into a model group your downstream apps can target,
can be unpublished and edited without losing state.

Roughly 20 minutes if you have the collections ready.

1. Decide the persona's shape

Before any API calls, settle these:

Identity. Name, slug, voice. Will it speak as "Acme Support" or as a generic helper?
Underlying model. Which existing frontend model slug do you wrap? The persona inherits its modality, capabilities, context window, and pricing.
Sources. Which ScaiMatrix collections (or ScaiDrive shares) does it read from? Get their ids ready.
RAG strategy. Start with single_step. Move to multi_step or agentic only after you've measured retrieval quality on representative questions.
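
One way to keep these decisions in one place before calling the API is a small plan object that produces the create payload. This is a minimal sketch, not part of ScaiPersona: `PersonaPlan` and `to_create_payload` are hypothetical names, and deriving `rag_enabled` from whether any sources are listed is an assumption made for illustration. The field names and the three strategy values are the ones used in this guide.

```python
from dataclasses import dataclass, field

# The three strategy names mentioned in this guide.
RAG_STRATEGIES = ("single_step", "multi_step", "agentic")


@dataclass
class PersonaPlan:
    """Hypothetical helper: holds the step-1 decisions in one place."""

    name: str
    slug: str
    model_slug: str
    system_prompt: str
    source_ids: list = field(default_factory=list)
    rag_strategy: str = "single_step"

    def to_create_payload(self) -> dict:
        """Build a JSON-ready body for the create call in step 2."""
        if self.rag_strategy not in RAG_STRATEGIES:
            raise ValueError(f"unknown rag_strategy: {self.rag_strategy!r}")
        return {
            "name": self.name,
            "slug": self.slug,
            "model_slug": self.model_slug,
            "system_prompt": self.system_prompt,
            # Assumption: only enable RAG when at least one source is planned.
            "rag_enabled": bool(self.source_ids),
            "rag_strategy": self.rag_strategy,
            "status": "draft",
        }


plan = PersonaPlan(
    name="Acme Support Specialist",
    slug="acme-support",
    model_slug="scailabs/poolnoodle-omni",
    system_prompt="You are Avery, a friendly Acme support specialist.",
    source_ids=["col_handbook", "col_faq"],
)
payload = plan.to_create_payload()
```

The resulting `payload` can be passed as the JSON body of the create request; tuning fields such as `rag_top_k` are left out here and can be added once you have measured retrieval quality.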

2. Create the persona

bash
# Capture the new persona's id for the later steps (requires jq).
PERSONA_ID=$(curl -s -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Acme Support Specialist",
    "slug": "acme-support",
    "model_slug": "scailabs/poolnoodle-omni",
    "system_prompt": "You are Avery, a friendly Acme support specialist. Answer concisely from the provided context. If the context does not cover the question, say so and offer to find someone who can help.",
    "rag_enabled": true,
    "rag_strategy": "single_step",
    "rag_top_k": 6,
    "rag_min_score": 0.4,
    "default_params": { "temperature": 0.2 },
    "status": "draft"
  }' | jq -r '.data.id')

python
import os

import httpx

H = {"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"}
HOST = os.environ["SCAIGRID_HOST"]

persona = httpx.post(
    f"{HOST}/v1/modules/scaipersona/personas",
    headers=H,
    json={
        "name": "Acme Support Specialist",
        "slug": "acme-support",
        "model_slug": "scailabs/poolnoodle-omni",
        "system_prompt": "You are Avery, a friendly Acme support specialist. Answer concisely from the provided context. If the context does not cover the question, say so and offer to find someone who can help.",
        "rag_enabled": True,
        "rag_strategy": "single_step",
        "rag_top_k": 6,
        "rag_min_score": 0.4,
        "default_params": {"temperature": 0.2},
        "status": "draft",
    },
).json()["data"]
PERSONA_ID = persona["id"]

javascript
const persona = await fetch(`${process.env.SCAIGRID_HOST}/v1/modules/scaipersona/personas`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.SCAIGRID_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "Acme Support Specialist",
    slug: "acme-support",
    model_slug: "scailabs/poolnoodle-omni",
    system_prompt: "You are Avery, a friendly Acme support specialist. Answer concisely from the provided context. If the context does not cover the question, say so and offer to find someone who can help.",
    rag_enabled: true,
    rag_strategy: "single_step",
    rag_top_k: 6,
    rag_min_score: 0.4,
    default_params: { temperature: 0.2 },
    status: "draft",
  }),
}).then((r) => r.json()).then((j) => j.data);

3. Attach two weighted sources

The handbook is authoritative; the FAQ collection is supplementary. Weight the handbook higher.

bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID/sources" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_type": "collection",
    "source_id": "col_handbook",
    "source_name": "Acme Handbook",
    "weight": 2.0
  }'

curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID/sources" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_type": "collection",
    "source_id": "col_faq",
    "source_name": "Customer FAQ",
    "weight": 1.0
  }'

python
for body in [
    {"source_type": "collection", "source_id": "col_handbook",
     "source_name": "Acme Handbook", "weight": 2.0},
    {"source_type": "collection", "source_id": "col_faq",
     "source_name": "Customer FAQ", "weight": 1.0},
]:
    httpx.post(
        f"{HOST}/v1/modules/scaipersona/personas/{PERSONA_ID}/sources",
        headers=H, json=body,
    ).raise_for_status()

javascript
for (const body of [
  { source_type: "collection", source_id: "col_handbook", source_name: "Acme Handbook", weight: 2.0 },
  { source_type: "collection", source_id: "col_faq", source_name: "Customer FAQ", weight: 1.0 },
]) {
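  const res = await fetch(
    `${process.env.SCAIGRID_HOST}/v1/modules/scaipersona/personas/${persona.id}/sources`,
    {
      method: "POST",
      headers: { "Authorization": `Bearer ${process.env.SCAIGRID_API_KEY}`, "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  );
  // Fail loudly, mirroring raise_for_status() in the Python version.
  if (!res.ok) throw new Error(`Attaching ${body.source_id} failed: HTTP ${res.status}`);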
}
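
To build intuition for what a 2.0 versus 1.0 weight can do, here is a toy ranking sketch. It assumes a hit's final score is its raw similarity multiplied by the weight of its source; this guide does not say how ScaiPersona actually combines weights, so treat the formula, the `rank_hits` helper, and the sample hits as illustrative only.

```python
# Assumption (not confirmed by this guide): final score = raw score * weight.
SOURCE_WEIGHTS = {"col_handbook": 2.0, "col_faq": 1.0}


def rank_hits(hits, weights):
    """Return hits sorted by weighted score, highest first.

    `hits` is a list of (source_id, chunk_text, raw_score) tuples.
    Sources without a configured weight default to 1.0.
    """
    scored = [
        (source_id, text, raw * weights.get(source_id, 1.0))
        for source_id, text, raw in hits
    ]
    return sorted(scored, key=lambda hit: hit[2], reverse=True)


# Made-up retrieval results, for illustration only.
hits = [
    ("col_faq", "FAQ: cancel from the billing page.", 0.70),
    ("col_handbook", "Handbook: mid-term cancellation policy.", 0.50),
]
ranked = rank_hits(hits, SOURCE_WEIGHTS)
```

Under that assumption the handbook chunk outranks the FAQ chunk even though its raw similarity is lower, which is the behaviour you want when the handbook is the authoritative source.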

4. Upload an avatar

bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID/avatar" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -F "file=@avery.png"

Limits: 2 MB max; PNG / JPEG / GIF / WebP / SVG. The avatar URL ends up on the published frontend model, so any model-picker UI shows the persona's face.
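
A quick local pre-flight check can catch an oversized or unsupported file before the upload round-trip. The sketch below is an assumption-laden helper, not part of the API: `avatar_problems` is a hypothetical name, "2 MB" is read as 2 * 1024 * 1024 bytes, and the format is judged by file extension alone, so the server may still reject a file that passes here.

```python
from pathlib import Path

# Assumptions: 2 MB means 2 * 1024 * 1024 bytes; format is checked by extension.
MAX_AVATAR_BYTES = 2 * 1024 * 1024
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".svg"}


def avatar_problems(filename, size_bytes):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_AVATAR_BYTES:
        problems.append(
            f"file is {size_bytes} bytes; limit is {MAX_AVATAR_BYTES}"
        )
    return problems
```

For a real file you would call it as `avatar_problems("avery.png", Path("avery.png").stat().st_size)` and only upload when the returned list is empty.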

5. Publish

bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID/publish" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"group_ids": ["mg_support_bots"]}'

The optional group_ids field adds the new frontend model to existing model groups. The caller's role limits which groups they can target: global and cross-tenant groups are ignored unless the caller is a super-admin. Invalid group ids are silently skipped rather than rejected, so check the model group's membership afterwards to confirm the model actually joined.
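
Because skipped ids produce no error, the confirmation step comes down to comparing what you asked for with what you got. The helper below is a minimal sketch of that comparison; `skipped_groups` is a hypothetical name, `mg_typo` is a made-up id, and how you obtain the list of groups that actually contain the model depends on your deployment's model-group API, which this guide does not cover.

```python
def skipped_groups(requested_ids, actual_ids):
    """Return requested group ids the model did not join, in request order."""
    actual = set(actual_ids)
    return [gid for gid in requested_ids if gid not in actual]


# `requested` is what was sent in group_ids at publish time.
requested = ["mg_support_bots", "mg_typo"]
# `actual` stands in for the groups found to contain the model afterwards.
actual = ["mg_support_bots"]

missing = skipped_groups(requested, actual)
```

A non-empty `missing` list usually means a mistyped id or a group the caller's role is not allowed to target.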

6. Invoke through the standard inference endpoint

The persona is now a frontend model. Callers never touch a ScaiPersona route to use it.

bash
curl -X POST "$SCAIGRID_HOST/v1/inference/chat" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tenant/acme/acme-support",
    "messages": [
      {"role": "user", "content": "How do I cancel my subscription mid-term?"}
    ]
  }'

python
resp = httpx.post(
    f"{HOST}/v1/inference/chat",
    headers=H,
    json={
        "model": "tenant/acme/acme-support",
        "messages": [{"role": "user", "content": "How do I cancel my subscription mid-term?"}],
    },
).json()["data"]
print(resp["content"])

javascript
const reply = await fetch(`${process.env.SCAIGRID_HOST}/v1/inference/chat`, {
  method: "POST",
  headers: { "Authorization": `Bearer ${process.env.SCAIGRID_API_KEY}`, "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "tenant/acme/acme-support",
    messages: [{ role: "user", content: "How do I cancel my subscription mid-term?" }],
  }),
}).then((r) => r.json()).then((j) => j.data);
console.log(reply.content);

7. Iterate without unpublishing

Edits sync the published frontend model on save. Tighten the prompt, drop the top-k, change a source weight — none of it requires an unpublish.

bash
curl -X PUT "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "rag_top_k": 4, "rag_min_score": 0.5 }'
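
To see why lowering `rag_top_k` and raising `rag_min_score` tightens the context, the toy selection below applies both settings to a made-up list of retrieval scores. It assumes hits scoring below the minimum are dropped first and the best `top_k` of the rest are kept; the guide does not spell out the exact order or whether the threshold is inclusive, so `select_context` is an illustration rather than the service's logic.

```python
def select_context(scores, top_k, min_score):
    """Keep scores >= min_score, then return the top_k highest of those."""
    kept = sorted((s for s in scores if s >= min_score), reverse=True)
    return kept[:top_k]


# Made-up similarity scores for one question.
scores = [0.82, 0.61, 0.55, 0.47, 0.44, 0.41, 0.30]

before = select_context(scores, top_k=6, min_score=0.4)  # step 2 settings
after = select_context(scores, top_k=4, min_score=0.5)   # settings from the PUT
```

With these sample scores, the original settings pass six chunks to the model while the tightened settings pass three, since only three scores clear the 0.5 threshold.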

If you want to take the persona off the air for a moment:

bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID/unpublish" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY"

Unpublishing removes the frontend model entirely — its slug stops resolving, and it falls out of any model groups it had joined. The persona row stays put; re-publish later and the same slug comes back.

Done

You have a tenant-scoped, RAG-grounded, branded persona that any downstream client can target by model slug. Iterate freely — every parameter is editable in place, publish/unpublish is reversible, and accounting flows through the standard ScaiGrid pipeline.