Build a searchable knowledge base
You're going from zero to a tenant-shared knowledge base with:
- a hundred PDFs ingested,
- graph extraction on,
- one document quarantined from a single user,
- hybrid search wired up.
About 30 minutes if your documents are already on disk.
1. Pick your shape
Settle the moving parts before any API calls:
- Embedding model. Default to your tenant's standard. Once a collection is indexed, you can't change the model without forking (see step 8).
- Chunking strategy. paragraph is a fine default; markdown suits docs sites; code suits source code; semantic gives slightly better retrieval if you can afford the extra cost.
- Graph extraction. On if you need multi-hop questions ("which products mention X?"). Adds wall-clock time to ingestion and uses your chat model.
- Default access. tenant for shared knowledge; restricted if only specific groups should see it.
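The choices above map onto the request body used in step 2. As a minimal sketch, a small helper can reject unknown values before anything is sent. The name `collection_payload` is hypothetical, and the allowed values are only the ones this guide mentions, so the real API may accept more.

```python
# Values named in this guide; the API may accept others (assumption).
CHUNKING_STRATEGIES = {"paragraph", "markdown", "code", "semantic"}
ACCESS_LEVELS = {"tenant", "restricted"}


def collection_payload(name, embedding_model, chunking_strategy="paragraph",
                       graph_enabled=False, graph_extraction_model=None,
                       default_access="tenant"):
    """Build the JSON body for the create-collection call (hypothetical helper)."""
    if chunking_strategy not in CHUNKING_STRATEGIES:
        raise ValueError(f"unknown chunking strategy: {chunking_strategy!r}")
    if default_access not in ACCESS_LEVELS:
        raise ValueError(f"unknown default access: {default_access!r}")
    payload = {
        "name": name,
        "embedding_model": embedding_model,
        "chunking_strategy": chunking_strategy,
        "graph_enabled": graph_enabled,
        "default_access": default_access,
    }
    # Only send the extraction model when one was chosen explicitly.
    if graph_enabled and graph_extraction_model:
        payload["graph_extraction_model"] = graph_extraction_model
    return payload
```

Catching a typo such as "paragraphs" locally is cheaper than discovering it after a hundred PDFs have been chunked the wrong way.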
2. Create the collection
| curl -X POST "$SCAIGRID_HOST/v1/modules/scaimatrix/collections" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Acme Handbook",
    "description": "Acme employee handbook and policies",
    "embedding_model": "openai/text-embedding-3-small",
    "chunking_strategy": "paragraph",
    "chunk_size": 512,
    "chunk_overlap": 50,
    "graph_enabled": true,
    "graph_extraction_model": "scailabs/poolnoodle-omni",
    "default_access": "tenant"
  }'
|
| import os
import httpx
# Shared setup, reused by the later Python snippets.
HOST = os.environ["SCAIGRID_HOST"]
H = {"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"}
coll = httpx.post(
    f"{HOST}/v1/modules/scaimatrix/collections",
    headers=H,
    json={
        "name": "Acme Handbook",
        "description": "Acme employee handbook and policies",
        "embedding_model": "openai/text-embedding-3-small",
        "chunking_strategy": "paragraph",
        "chunk_size": 512,
        "chunk_overlap": 50,
        "graph_enabled": True,
        "graph_extraction_model": "scailabs/poolnoodle-omni",
        "default_access": "tenant",
    },
).json()["data"]
|
| // HOST is the ScaiGrid base URL; H is assumed to carry the Authorization
// header plus "Content-Type": "application/json".
const res = await fetch(`${HOST}/v1/modules/scaimatrix/collections`, {
  method: "POST",
  headers: H,
  body: JSON.stringify({
    name: "Acme Handbook",
    description: "Acme employee handbook and policies",
    embedding_model: "openai/text-embedding-3-small",
    chunking_strategy: "paragraph",
    chunk_size: 512,
    chunk_overlap: 50,
    graph_enabled: true,
    graph_extraction_model: "scailabs/poolnoodle-omni",
    default_access: "tenant",
  }),
});
const { data: coll } = await res.json();
|
3. Bulk-upload documents
The bulk endpoint accepts a limited number of files per request, which is handy when you can stream a directory; split larger sets into several requests.
| import httpx, os, time, glob
H = {"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"}
HOST = os.environ["SCAIGRID_HOST"]
COLLECTION = coll["id"]  # collection created in step 2
# Open every PDF; the handles are closed once the upload has finished.
handles = [open(path, "rb") for path in sorted(glob.glob("./docs/*.pdf"))]
try:
    files = [
        ("files", (os.path.basename(fh.name), fh, "application/pdf"))
        for fh in handles
    ]
    resp = httpx.post(
        f"{HOST}/v1/modules/scaimatrix/collections/{COLLECTION}/documents/bulk",
        headers=H,
        files=files,
        timeout=120,
    ).json()["data"]
finally:
    for fh in handles:
        fh.close()
doc_ids = [d["id"] for d in resp["documents"]]
|
| # Single-file equivalent; loop on the shell side for many files.
curl -X POST "$SCAIGRID_HOST/v1/modules/scaimatrix/collections/$COLLECTION/documents" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -F "file=@./docs/onboarding.pdf"
|
| // `files` is assumed to be an array of File objects, e.g. from a file input.
const fd = new FormData();
for (const f of files) fd.append("files", f, f.name);
// Send only the Authorization header; fetch sets the multipart Content-Type itself.
const r = await fetch(
  `${HOST}/v1/modules/scaimatrix/collections/${coll.id}/documents/bulk`,
  { method: "POST", headers: { "Authorization": H.Authorization }, body: fd },
);
const { data } = await r.json();
const docIds = data.documents.map((d) => d.id);
|
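Because the bulk endpoint takes only a limited number of files per request, a hundred PDFs have to be split into batches. The sketch below shows one way to do the splitting; the helper name `batched` and the batch size of 10 in the usage comment are assumptions, so substitute the limit your deployment documents.

```python
from itertools import islice


def batched(paths, size):
    """Yield successive lists of at most `size` items from `paths`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(paths)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


# Usage idea (batch size is an assumption, not a documented limit):
# for batch in batched(sorted(glob.glob("./docs/*.pdf")), 10):
#     ...POST `batch` to the documents/bulk endpoint and extend doc_ids...
```

Each batch goes through the same documents/bulk request shown in the Python sample, with the returned IDs appended to one shared `doc_ids` list so that step 4 can poll all of them.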
4. Wait for indexing
Poll each document until it's indexed. With graph extraction on, large PDFs take longer because the extraction model has to run per document.
| while True:
    # Fetch the current status of every uploaded document.
    statuses = []
    for doc_id in doc_ids:
        d = httpx.get(
            f"{HOST}/v1/modules/scaimatrix/collections/{COLLECTION}/documents/{doc_id}",
            headers=H,
        ).json()["data"]
        statuses.append(d["status"])
    if all(s == "indexed" for s in statuses):
        break
    if any(s == "failed" for s in statuses):
        failed = [doc_id for doc_id, s in zip(doc_ids, statuses) if s == "failed"]
        print(f"{len(failed)} docs failed ({failed}); check error_message")
        break
    print(f"indexed {statuses.count('indexed')}/{len(statuses)}")
    time.sleep(5)
|
A quicker alternative: open the Ingestion Monitor admin page (/scaimatrix/ingestion) and watch progress visually.
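The polling loop above has no upper bound on how long it waits, and it stops at the first failure. A variant with a deadline that keeps polling only the documents still pending might look like the sketch below. The function name, the `fetch_status` callback, and the 30-minute default timeout are all assumptions; `fetch_status(doc_id)` is expected to wrap the GET request from the loop above and return the document's status string.

```python
import time


def wait_for_indexing(doc_ids, fetch_status, interval=5.0, timeout=1800.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until every document is indexed or failed; return the failed IDs.

    Any status other than "indexed" or "failed" is treated as still in progress.
    Raises TimeoutError if documents are still pending after `timeout` seconds.
    """
    pending = set(doc_ids)
    failed = []
    deadline = clock() + timeout
    while pending:
        for doc_id in sorted(pending):  # sorted() copies, so discarding is safe
            status = fetch_status(doc_id)
            if status == "indexed":
                pending.discard(doc_id)
            elif status == "failed":
                pending.discard(doc_id)
                failed.append(doc_id)
        if pending:
            if clock() >= deadline:
                raise TimeoutError(f"{len(pending)} documents still pending")
            sleep(interval)
    return failed
```

Injecting `sleep` and `clock` keeps the function easy to exercise in a test without real waiting.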
5. Quarantine one document from one user
You have a single termination memo in this collection. HR can read everything; one named user must not see this document.
| # Break inheritance on this document; collection grants no longer flow.
curl -X PATCH "$SCAIGRID_HOST/v1/modules/scaimatrix/permissions/document/$DOC_ID/acl" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"inherit_from_parent": false}'
# Add an explicit allow VIEWER for HR.
curl -X POST "$SCAIGRID_HOST/v1/modules/scaimatrix/permissions/document/$DOC_ID/acl/entries" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "principal_type": "group",
    "principal_id": "grp_hr",
    "ace_type": "allow",
    "permissions": 49,
    "inherit_to_children": true
  }'
|
Alternatively, keep inheritance on and add only a deny READ for the single user — that ACE wins over every collection-level allow because deny has higher priority at the same level.
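For the deny-only alternative, the entry body would plausibly mirror the allow entry above. The sketch below only builds that body and rests on three assumptions that this guide does not confirm: that "user" is a valid principal_type, that "deny" is the counterpart of "allow" for ace_type, and that you know the bitmask for READ (this guide only shows 49 for the viewer set, so the mask is a required argument rather than a guessed default).

```python
def deny_read_entry(user_id, read_mask):
    """Build an ACL entry body denying READ to one user (hypothetical helper)."""
    if not isinstance(read_mask, int) or read_mask <= 0:
        raise ValueError("read_mask must be a positive permission bitmask")
    return {
        "principal_type": "user",  # assumption: mirrors the "group" value above
        "principal_id": user_id,
        "ace_type": "deny",        # assumption: counterpart of "allow"
        "permissions": read_mask,  # look up the READ bit for your deployment
    }
```

The resulting dictionary would be sent to the same permissions/document/$DOC_ID/acl/entries endpoint used for the HR entry, with inheritance left on.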
6. Hybrid search
| hits = httpx.post(
    f"{HOST}/v1/modules/scaimatrix/collections/{COLLECTION}/search",
    headers=H,
    json={
        "query": "What's our parental leave policy?",
        "top_k": 5,
        "search_type": "hybrid",
        "min_score": 0.2,
    },
).json()["data"]
for r in hits["results"]:
    print(f"[{r['score']:.2f}] {r['document_name']} :: {r['content'][:200]}")
|
The quarantined user — calling with their own API key — gets results that omit any chunk from the quarantined document. They never see the document name, the chunk content, or its existence.
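One way to confirm the quarantine is to run the same search with the quarantined user's API key and check that no hit comes from the restricted document. The helper below is a small sketch that relies only on the `document_name` field printed in the search example; the function name and the memo's file name in the usage comment are hypothetical.

```python
def leaked_chunks(results, quarantined_name):
    """Return every search hit that belongs to the quarantined document."""
    return [r for r in results if r.get("document_name") == quarantined_name]


# Usage idea, with `hits` fetched using the quarantined user's key:
# assert not leaked_chunks(hits["results"], "termination-memo.pdf")
```

An empty list is the expected outcome; anything else means the ACL change from step 5 did not take effect for that user.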
7. Combined search (vector + graph)
For collections with graph extraction on, combined_search blends vector hits with a graph traversal seeded by the same query.
| curl -X POST "$SCAIGRID_HOST/v1/modules/scaimatrix/collections/$COLLECTION/search/combined" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "products compatible with WidgetPro",
    "vector_top_k": 5,
    "graph_depth": 2,
    "include_content": true
  }'
|
You get back the vector hits plus the nodes / edges reachable within graph_depth of the most relevant nodes.
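To make "reachable within graph_depth" concrete, here is a toy breadth-first traversal over a hand-written edge list. It illustrates the idea only and is not how the service is implemented: the node names are invented, and treating edges as undirected is an assumption.

```python
from collections import deque


def within_depth(edges, seeds, depth):
    """Map each node reachable within `depth` hops of a seed to its hop count."""
    adjacency = {}
    for a, b in edges:  # assumption: edges are traversed in both directions
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if hops[node] == depth:
            continue  # do not expand beyond the depth limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


edges = [("WidgetPro", "Dock"), ("Dock", "Cable"), ("Cable", "Adapter")]
print(within_depth(edges, ["WidgetPro"], 2))
# prints {'WidgetPro': 0, 'Dock': 1, 'Cable': 2}
```

With a depth of 2, "Adapter" is three hops away and stays out of the result, which is why raising graph_depth widens the returned neighbourhood.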
8. When the embedding model needs to change — fork
You can't safely re-embed in place: dimensions and similarity geometry can shift, and concurrent traffic would see mixed results. Fork instead:
| curl -X POST "$SCAIGRID_HOST/v1/modules/scaimatrix/collections/$COLLECTION/fork" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Acme Handbook (v2 embeddings)",
    "embedding_model": "openai/text-embedding-3-large",
    "copy_acls": true,
    "copy_metadata": true
  }'
|
You get a new collection with the same ACLs and config but no documents. Re-ingest at your own pace; the original keeps serving traffic until you cut over.
Done
You have a tenant-shared knowledge base with graph extraction, fine-grained ACLs, and hybrid search. Iterate from here — every parameter is editable, and the ACL chokepoint keeps the data view consistent for every caller regardless of when permissions change.