Build a knowledge-grounded persona
You're going from zero to a deployed persona that:
- answers from two ScaiMatrix collections with different weights,
- has a branded avatar,
- is published into a model group your downstream apps can target,
- can be unpublished and edited without losing state.
Roughly 20 minutes if you have the collections ready.
1. Decide the persona's shape
Before any API calls, settle these:
- Identity. Name, slug, voice. Will it speak as "Acme Support" or as a generic helper?
- Underlying model. Which existing frontend model slug do you wrap? The persona inherits its modality, capabilities, context window, and pricing.
- Sources. Which ScaiMatrix collections (or ScaiDrive shares) does it read from? Get their ids ready.
- RAG strategy. Start with single_step. Move to multi_step or agentic only after you've measured retrieval quality on representative questions.
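One way to measure retrieval quality on representative questions is a small hit-rate check. The sketch below is only an illustration and not part of ScaiGrid: `retrieve` is a hypothetical callable you supply (however you query your collections) that returns the ids of the retrieved sources, best match first, and the questions and lookup table are made up.

```python
from typing import Callable, Iterable

# Hypothetical signature: takes a question, returns ids of retrieved sources,
# best match first. It is a stand-in you implement, not a ScaiGrid API.
Retriever = Callable[[str], list[str]]


def hit_rate(cases: Iterable[tuple[str, str]], retrieve: Retriever, top_k: int = 6) -> float:
    """Fraction of questions whose expected source appears in the top_k results."""
    cases = list(cases)
    if not cases:
        return 0.0
    hits = sum(1 for question, expected in cases if expected in retrieve(question)[:top_k])
    return hits / len(cases)


# Illustrative stand-in retriever backed by a fixed lookup table.
_FAKE_INDEX = {
    "How do I cancel?": ["col_handbook", "col_faq"],
    "What is the refund window?": ["col_faq"],
    "Who approves travel?": [],
}


def fake_retrieve(question: str) -> list[str]:
    return _FAKE_INDEX.get(question, [])


cases = [
    ("How do I cancel?", "col_handbook"),
    ("What is the refund window?", "col_handbook"),
    ("Who approves travel?", "col_handbook"),
]
score = hit_rate(cases, fake_retrieve)
print(f"hit rate: {score:.2f}")
```

Running the same question set before and after a strategy change gives a like-for-like number to compare, which is the point of holding off on multi_step or agentic until you have a baseline.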
2. Create the persona
curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Acme Support Specialist",
    "slug": "acme-support",
    "model_slug": "scailabs/poolnoodle-omni",
    "system_prompt": "You are Avery, a friendly Acme support specialist. Answer concisely from the provided context. If the context does not cover the question, say so and offer to find someone who can help.",
    "rag_enabled": true,
    "rag_strategy": "single_step",
    "rag_top_k": 6,
    "rag_min_score": 0.4,
    "default_params": { "temperature": 0.2 },
    "status": "draft"
  }'
# Python (httpx): the same request; keeps the new persona's id for later steps
import os

import httpx

H = {"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"}
HOST = os.environ["SCAIGRID_HOST"]

persona = httpx.post(
    f"{HOST}/v1/modules/scaipersona/personas",
    headers=H,
    json={
        "name": "Acme Support Specialist",
        "slug": "acme-support",
        "model_slug": "scailabs/poolnoodle-omni",
        "system_prompt": "You are Avery, a friendly Acme support specialist. Answer concisely from the provided context. If the context does not cover the question, say so and offer to find someone who can help.",
        "rag_enabled": True,
        "rag_strategy": "single_step",
        "rag_top_k": 6,
        "rag_min_score": 0.4,
        "default_params": {"temperature": 0.2},
        "status": "draft",
    },
).json()["data"]
PERSONA_ID = persona["id"]
// JavaScript (fetch): the same request; persona.id is used in later steps
const persona = await fetch(`${process.env.SCAIGRID_HOST}/v1/modules/scaipersona/personas`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.SCAIGRID_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "Acme Support Specialist",
    slug: "acme-support",
    model_slug: "scailabs/poolnoodle-omni",
    system_prompt: "You are Avery, a friendly Acme support specialist. Answer concisely from the provided context. If the context does not cover the question, say so and offer to find someone who can help.",
    rag_enabled: true,
    rag_strategy: "single_step",
    rag_top_k: 6,
    rag_min_score: 0.4,
    default_params: { temperature: 0.2 },
    status: "draft",
  }),
}).then((r) => r.json()).then((j) => j.data);
3. Attach two weighted sources
The handbook is authoritative; the FAQ collection is supplementary. Weight the handbook higher.
# $PERSONA_ID holds the id returned under data by the create call in step 2.
curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID/sources" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_type": "collection",
    "source_id": "col_handbook",
    "source_name": "Acme Handbook",
    "weight": 2.0
  }'

curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID/sources" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_type": "collection",
    "source_id": "col_faq",
    "source_name": "Customer FAQ",
    "weight": 1.0
  }'
# Python (httpx): attach both sources, failing fast on any HTTP error
for body in [
    {"source_type": "collection", "source_id": "col_handbook",
     "source_name": "Acme Handbook", "weight": 2.0},
    {"source_type": "collection", "source_id": "col_faq",
     "source_name": "Customer FAQ", "weight": 1.0},
]:
    httpx.post(
        f"{HOST}/v1/modules/scaipersona/personas/{PERSONA_ID}/sources",
        headers=H, json=body,
    ).raise_for_status()
// JavaScript (fetch): attach both sources
for (const body of [
  { source_type: "collection", source_id: "col_handbook", source_name: "Acme Handbook", weight: 2.0 },
  { source_type: "collection", source_id: "col_faq", source_name: "Customer FAQ", weight: 1.0 },
]) {
  await fetch(
    `${process.env.SCAIGRID_HOST}/v1/modules/scaipersona/personas/${persona.id}/sources`,
    {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${process.env.SCAIGRID_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  );
}
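To build intuition for what a higher weight does, here is a toy ranking sketch. This page does not say how ScaiPersona combines weights with retrieval scores, so the scheme below (drop chunks under the minimum score, then order by score times source weight) is purely an assumption for illustration; the `Chunk` type, `rank` function, and sample scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source_id: str
    text: str
    score: float  # raw similarity from retrieval, assumed to be in [0, 1]


def rank(chunks, weights, top_k=6, min_score=0.4):
    """Assumed scheme: drop chunks below min_score, then order by score * weight."""
    kept = [c for c in chunks if c.score >= min_score]
    kept.sort(key=lambda c: c.score * weights.get(c.source_id, 1.0), reverse=True)
    return kept[:top_k]


weights = {"col_handbook": 2.0, "col_faq": 1.0}
chunks = [
    Chunk("col_faq", "FAQ: cancel from the billing page", 0.80),
    Chunk("col_handbook", "Handbook: cancellation policy", 0.55),
    Chunk("col_faq", "FAQ: unrelated shipping note", 0.30),
]
ranked = rank(chunks, weights)
print([c.source_id for c in ranked])
```

Under this assumed scheme the handbook chunk outranks the FAQ chunk despite a lower raw score (0.55 × 2.0 beats 0.80 × 1.0), and the 0.30 chunk is filtered out. Treat it as a mental model only and confirm the real behavior by testing against your own collections.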
4. Upload an avatar
curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID/avatar" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -F "file=@avery.png"
Limits: 2 MB max; PNG / JPEG / GIF / WebP / SVG. The avatar URL ends up on the published frontend model, so any model-picker UI shows the persona's face.
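A local pre-check can catch an oversized or unsupported file before the upload round-trip. This is a minimal sketch under stated assumptions: `check_avatar` is a hypothetical helper, "2 MB" is read as 2 × 1024 × 1024 bytes, and the format is judged by file extension only, which may not match how the server validates it.

```python
from pathlib import Path

MAX_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" read as 2 MiB
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".svg"}


def check_avatar(path):
    """Return a list of problems; an empty list means the file passes these local checks."""
    p = Path(path)
    problems = []
    if p.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported extension: {p.suffix or '(none)'}")
    if not p.is_file():
        problems.append("file not found")
    elif p.stat().st_size > MAX_BYTES:
        problems.append(f"file is {p.stat().st_size} bytes; limit is {MAX_BYTES}")
    return problems
```

Call `check_avatar("avery.png")` before uploading and only proceed when it returns an empty list; the server remains the final authority on what it accepts.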
5. Publish
curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID/publish" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"group_ids": ["mg_support_bots"]}'
The optional group_ids field adds the new frontend model to existing model groups. The caller's role limits which groups they can target: global and cross-tenant groups are ignored unless the caller is a super-admin. Invalid group ids are silently skipped, so check the model group's membership afterwards to confirm.
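Because skipped group ids produce no error, the confirmation step comes down to comparing what you asked for with what you got. The sketch below assumes you have already fetched the list of groups the model belongs to by whatever means your deployment offers (that lookup is not covered here); `skipped_groups` and the sample ids are hypothetical.

```python
def skipped_groups(requested, actual):
    """Return requested group ids that are absent from the model's actual groups, in request order."""
    actual_set = set(actual)
    return [g for g in requested if g not in actual_set]


requested = ["mg_support_bots", "mg_typo"]
actual = ["mg_support_bots"]  # illustrative: whatever your membership lookup returned
missing = skipped_groups(requested, actual)
if missing:
    print("not added to:", missing)
```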
6. Invoke through the standard inference endpoint
The persona is now a frontend model. Callers never touch a ScaiPersona route to use it.
curl -X POST "$SCAIGRID_HOST/v1/inference/chat" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tenant/acme/acme-support",
    "messages": [
      {"role": "user", "content": "How do I cancel my subscription mid-term?"}
    ]
  }'
# Python (httpx): the same chat request, printing the reply text
resp = httpx.post(
    f"{HOST}/v1/inference/chat",
    headers=H,
    json={
        "model": "tenant/acme/acme-support",
        "messages": [{"role": "user", "content": "How do I cancel my subscription mid-term?"}],
    },
).json()["data"]
print(resp["content"])
// JavaScript (fetch): the same chat request
// reply is the full JSON envelope; the Python example reads the text from data.content.
const reply = await fetch(`${process.env.SCAIGRID_HOST}/v1/inference/chat`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.SCAIGRID_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "tenant/acme/acme-support",
    messages: [{ role: "user", content: "How do I cancel my subscription mid-term?" }],
  }),
}).then((r) => r.json());
7. Iterate without unpublishing
Edits sync the published frontend model on save. Tighten the prompt, drop the top-k, change a source weight — none of it requires an unpublish.
curl -X PUT "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "rag_top_k": 4, "rag_min_score": 0.5 }'
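The PUT example above sends only the fields being changed. If you keep the persona's settings in code, a small diff helper can produce that minimal body for you. This is a sketch, not a ScaiGrid utility: `changed_fields` is a hypothetical name and the `current` dict stands in for whatever you last saved or fetched.

```python
def changed_fields(current, desired):
    """Return only the keys in desired whose values differ from current."""
    return {k: v for k, v in desired.items() if current.get(k) != v}


current = {"rag_top_k": 6, "rag_min_score": 0.4, "rag_strategy": "single_step"}
desired = {"rag_top_k": 4, "rag_min_score": 0.5, "rag_strategy": "single_step"}
patch = changed_fields(current, desired)
print(patch)
```

Here `patch` contains just `rag_top_k` and `rag_min_score`, matching the body of the curl call; an empty result means there is nothing to send.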
If you want to take the persona off the air for a moment:
curl -X POST "$SCAIGRID_HOST/v1/modules/scaipersona/personas/$PERSONA_ID/unpublish" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY"
Unpublishing removes the frontend model entirely — its slug stops resolving, and it falls out of any model groups it had joined. The persona row stays put; re-publish later and the same slug comes back.
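Since an unpublished model falls out of its groups, it is worth keeping the group ids around so the later publish call can pass them again. The sketch below only builds the two requests with the standard library and does not send them; `persona_action`, the host, the key, and the persona id are placeholders, and whether groups are restored automatically on re-publish is not stated here, so passing group_ids again is the cautious assumption.

```python
import json
import urllib.request


def persona_action(host, api_key, persona_id, action, body=None):
    """Build (but do not send) a POST request for a persona action such as publish or unpublish."""
    data = json.dumps(body).encode() if body is not None else None
    headers = {"Authorization": f"Bearer {api_key}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{host}/v1/modules/scaipersona/personas/{persona_id}/{action}",
        data=data,
        headers=headers,
        method="POST",
    )


GROUPS = ["mg_support_bots"]  # remembered from the original publish call
off = persona_action("https://scaigrid.example", "KEY", "per_123", "unpublish")
on = persona_action("https://scaigrid.example", "KEY", "per_123", "publish", {"group_ids": GROUPS})
# Send each with urllib.request.urlopen(req) when you are ready.
```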
Done
You have a tenant-scoped, RAG-grounded, branded persona that any downstream client can target by model slug. Iterate freely — every parameter is editable in place, publish/unpublish is reversible, and accounting flows through the standard ScaiGrid pipeline.