Build a searchable knowledge base

You're going from zero to a tenant-shared knowledge base with:

a hundred PDFs ingested,
graph extraction on,
one document quarantined from a single user,
hybrid search wired up.

About 30 minutes if your documents are already on disk.

1. Pick your shape

Settle the moving parts before any API calls:

Embedding model. Default to your tenant's standard. Once a collection is indexed, you can't change the model without forking (see step 8).
Chunking strategy. paragraph is a fine default; markdown for docs sites; code for source code; semantic when you have the budget for slightly better retrieval.
Graph extraction. On if you need multi-hop questions ("which products mention X?"). Adds wall-clock time to ingestion and uses your chat model.
Default access. tenant for shared knowledge; restricted if only specific groups should see it.
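The four decisions above can be written down and sanity-checked before any request goes out. The sketch below is only an illustration: `CollectionShape` is a hypothetical helper, and it assumes the chunking strategies and access levels named in this step are the complete set, which this guide does not confirm.

```python
from dataclasses import asdict, dataclass

# Values named in this step; assumed (not confirmed) to be exhaustive.
CHUNKING_STRATEGIES = {"paragraph", "markdown", "code", "semantic"}
ACCESS_LEVELS = {"tenant", "restricted"}


@dataclass(frozen=True)
class CollectionShape:
    """Hypothetical container for the choices made in step 1."""

    name: str
    embedding_model: str
    chunking_strategy: str = "paragraph"
    graph_enabled: bool = False
    default_access: str = "tenant"

    def __post_init__(self) -> None:
        # Fail early on typos instead of waiting for the API to reject them.
        if self.chunking_strategy not in CHUNKING_STRATEGIES:
            raise ValueError(f"unknown chunking strategy: {self.chunking_strategy!r}")
        if self.default_access not in ACCESS_LEVELS:
            raise ValueError(f"unknown default access: {self.default_access!r}")

    def to_payload(self) -> dict:
        """Return a dict using the field names of the create-collection body."""
        return asdict(self)
```

Calling `CollectionShape(name="Acme Handbook", embedding_model="openai/text-embedding-3-small", graph_enabled=True).to_payload()` yields a dict that can be merged into the request body in step 2.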

2. Create the collection

bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaimatrix/collections" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Acme Handbook",
    "description": "Acme employee handbook and policies",
    "embedding_model": "openai/text-embedding-3-small",
    "chunking_strategy": "paragraph",
    "chunk_size": 512,
    "chunk_overlap": 50,
    "graph_enabled": true,
    "graph_extraction_model": "scailabs/poolnoodle-omni",
    "default_access": "tenant"
  }'

python
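import os

import httpx

H = {"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"}
HOST = os.environ["SCAIGRID_HOST"]

coll = httpx.post(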
    f"{HOST}/v1/modules/scaimatrix/collections",
    headers=H,
    json={
        "name": "Acme Handbook",
        "description": "Acme employee handbook and policies",
        "embedding_model": "openai/text-embedding-3-small",
        "chunking_strategy": "paragraph",
        "chunk_size": 512,
        "chunk_overlap": 50,
        "graph_enabled": True,
        "graph_extraction_model": "scailabs/poolnoodle-omni",
        "default_access": "tenant",
    },
).json()["data"]

javascript
const res = await fetch(`${HOST}/v1/modules/scaimatrix/collections`, {
  method: "POST",
  headers: H,
  body: JSON.stringify({
    name: "Acme Handbook",
    description: "Acme employee handbook and policies",
    embedding_model: "openai/text-embedding-3-small",
    chunking_strategy: "paragraph",
    chunk_size: 512,
    chunk_overlap: 50,
    graph_enabled: true,
    graph_extraction_model: "scailabs/poolnoodle-omni",
    default_access: "tenant",
  }),
});
const { data: coll } = await res.json();

3. Bulk-upload documents

The bulk endpoint accepts multiple files in one request, up to a per-request limit, which is handy when you can stream a directory.

python
import contextlib
import glob
import os
import time

import httpx

H = {"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"}
HOST = os.environ["SCAIGRID_HOST"]
COLLECTION = coll["id"]

# ExitStack closes every file handle once the upload has finished.
with contextlib.ExitStack() as stack:
    files = [
        ("files", (os.path.basename(path), stack.enter_context(open(path, "rb")), "application/pdf"))
        for path in glob.glob("./docs/*.pdf")
    ]
    resp = httpx.post(
        f"{HOST}/v1/modules/scaimatrix/collections/{COLLECTION}/documents/bulk",
        headers=H,
        files=files,
        timeout=120,
    ).json()["data"]

doc_ids = [d["id"] for d in resp["documents"]]

bash
# Single-file equivalent — loop on the shell side for many files.
curl -X POST "$SCAIGRID_HOST/v1/modules/scaimatrix/collections/$COLLECTION/documents" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -F "file=@./docs/onboarding.pdf"

javascript
// `files` is an iterable of File objects, for example from a file input.
const fd = new FormData();
for (const f of files) fd.append("files", f, f.name);
const r = await fetch(
  `${HOST}/v1/modules/scaimatrix/collections/${coll.id}/documents/bulk`,
  { method: "POST", headers: { "Authorization": H.Authorization }, body: fd },
);
const { data } = await r.json();
const docIds = data.documents.map((d) => d.id);
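Because the bulk endpoint caps how many files fit in one request, a hundred PDFs will likely need several requests. The following is a minimal sketch of splitting a file list into batches; `batched` is a hypothetical helper, and the batch size to pass is an assumption to replace with whatever limit your deployment enforces.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(paths: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` paths."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(paths)
    # islice pulls up to `size` items; an empty list means the input is exhausted.
    while batch := list(islice(it, size)):
        yield batch
```

Each batch can then go through the same upload call shown above, with the returned document IDs collected into one list for the polling step.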

4. Wait for indexing

Poll each document until it's indexed. With graph extraction on, large PDFs take longer because the extraction model has to run per document.

python
while True:
    statuses = []
    for doc_id in doc_ids:
        d = httpx.get(
            f"{HOST}/v1/modules/scaimatrix/collections/{COLLECTION}/documents/{doc_id}",
            headers=H,
        ).json()["data"]
        statuses.append(d["status"])
    if all(s == "indexed" for s in statuses):
        break
    if any(s == "failed" for s in statuses):
        # Report the failing document IDs, not just their list positions.
        failed = [doc_id for doc_id, s in zip(doc_ids, statuses) if s == "failed"]
        print(f"{len(failed)} docs failed; check error_message on: {failed}")
        break
    print(f"indexed {statuses.count('indexed')}/{len(statuses)}")
    time.sleep(5)

A quicker alternative: open the Ingestion Monitor admin page (/scaimatrix/ingestion) and watch progress visually.
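The polling loop above runs forever if a document never leaves an intermediate state, and it re-fetches documents that are already done. Here is one possible variant with a deadline that only polls what is still pending. `wait_for_indexing` and its `fetch_status` callback are hypothetical names; the only status values assumed are the "indexed" and "failed" values used above.

```python
import time
from typing import Callable, Dict, Iterable

TERMINAL = ("indexed", "failed")


def wait_for_indexing(
    doc_ids: Iterable[str],
    fetch_status: Callable[[str], str],
    timeout_s: float = 1800.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, str]:
    """Poll until every document is indexed or failed, or raise TimeoutError."""
    pending = set(doc_ids)
    final: Dict[str, str] = {}
    deadline = clock() + timeout_s
    while pending:
        for doc_id in sorted(pending):
            status = fetch_status(doc_id)
            if status in TERMINAL:
                final[doc_id] = status
        # Stop polling documents that reached a terminal state.
        pending.difference_update(final)
        if not pending:
            break
        if clock() >= deadline:
            raise TimeoutError(f"{len(pending)} documents still pending")
        sleep(interval_s)
    return final
```

In practice `fetch_status` would wrap the GET request from the loop above and return `d["status"]`; injecting it as a callback keeps the waiting logic testable without a network.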

5. Quarantine one document from one user

You have a single termination memo in this collection. HR can read everything; one named user must not see this document.

bash
# Break inheritance on this document — collection grants no longer flow.
curl -X PATCH "$SCAIGRID_HOST/v1/modules/scaimatrix/permissions/document/$DOC_ID/acl" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"inherit_from_parent": false}'

# Add an explicit allow VIEWER for HR.
curl -X POST "$SCAIGRID_HOST/v1/modules/scaimatrix/permissions/document/$DOC_ID/acl/entries" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "principal_type": "group",
    "principal_id": "grp_hr",
    "ace_type": "allow",
    "permissions": 49,
    "inherit_to_children": true
  }'

Alternatively, keep inheritance on and add only a deny READ for the single user — that ACE wins over every collection-level allow because deny has higher priority at the same level.
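For scripted setups, the two curl calls above can be described as data first and sent afterwards. This sketch only assembles the method, URL, and JSON body for each call, using the endpoints and fields shown above; `quarantine_requests` is a hypothetical helper, and the permission mask is passed through unchanged because this guide does not break down its bits.

```python
from typing import Any, Dict, List, Tuple

RequestSpec = Tuple[str, str, Dict[str, Any]]


def quarantine_requests(
    host: str, doc_id: str, group_id: str, permissions: int
) -> List[RequestSpec]:
    """Build the break-inheritance and allow-group requests for one document."""
    acl_url = f"{host}/v1/modules/scaimatrix/permissions/document/{doc_id}/acl"
    return [
        # 1. Stop collection-level grants from flowing to this document.
        ("PATCH", acl_url, {"inherit_from_parent": False}),
        # 2. Grant the group explicit access on the document itself.
        (
            "POST",
            f"{acl_url}/entries",
            {
                "principal_type": "group",
                "principal_id": group_id,
                "ace_type": "allow",
                "permissions": permissions,
                "inherit_to_children": True,
            },
        ),
    ]
```

Each tuple can be sent with `httpx.request(method, url, headers=H, json=body)`. Order matters: if the allow entry is added before inheritance is broken, the document stays visible through collection grants in the meantime.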

6. Hybrid search

python
hits = httpx.post(
    f"{HOST}/v1/modules/scaimatrix/collections/{COLLECTION}/search",
    headers=H,
    json={
        "query": "What's our parental leave policy?",
        "top_k": 5,
        "search_type": "hybrid",
        "min_score": 0.2,
    },
).json()["data"]

for r in hits["results"]:
    print(f"[{r['score']:.2f}] {r['document_name']} :: {r['content'][:200]}")

The quarantined user — calling with their own API key — gets results that omit any chunk from the quarantined document. They never see the document name, the chunk content, or its existence.
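It is worth confirming the quarantine rather than trusting it. A small check, assuming only the `document_name` field seen in the result loop above, can be run against the results returned for the quarantined user's key. The function name and the sample file names are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def leaked_chunks(
    results: Iterable[Dict[str, Any]], quarantined_name: str
) -> List[Dict[str, Any]]:
    """Return every search result that comes from the quarantined document."""
    return [r for r in results if r.get("document_name") == quarantined_name]


# Sample data standing in for hits["results"] fetched with the user's own key.
sample = [
    {"score": 0.81, "document_name": "leave-policy.pdf", "content": "..."},
    {"score": 0.44, "document_name": "onboarding.pdf", "content": "..."},
]
assert leaked_chunks(sample, "termination-memo.pdf") == []
```

An empty list means no chunk from that document reached the caller. Use a query that would certainly match the memo, otherwise an empty list proves nothing.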

7. Combined search (vector + graph)

For collections with graph extraction on, combined_search blends vector hits with a graph traversal seeded by the same query.

bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaimatrix/collections/$COLLECTION/search/combined" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "products compatible with WidgetPro",
    "vector_top_k": 5,
    "graph_depth": 2,
    "include_content": true
  }'

You get back the vector hits plus the nodes / edges reachable within graph_depth of the most relevant nodes.
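To build intuition for what graph_depth controls, the toy model below computes which nodes are reachable within a given number of hops from a set of seed nodes. It is an illustration of depth-limited breadth-first search, not the server's implementation; treating edges as undirected and the product names are both assumptions.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple


def reachable_within(
    edges: Iterable[Tuple[str, str]], seeds: Iterable[str], depth: int
) -> Dict[str, int]:
    """Map each node within `depth` hops of a seed to its hop distance."""
    adjacency: Dict[str, Set[str]] = {}
    for a, b in edges:
        # Assumption: edges can be walked in both directions.
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    distance = {seed: 0 for seed in seeds}
    queue = deque(distance)
    while queue:
        node = queue.popleft()
        if distance[node] == depth:
            continue  # Do not expand past the depth limit.
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance
```

With a chain WidgetPro, DockStation, PowerBrick, WallAdapter and a depth of 2, the first three nodes are returned and WallAdapter is left out. Raising the depth widens the neighbourhood quickly on dense graphs, so start small.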

8. When the embedding model needs to change: fork

You can't safely re-embed in place: dimensions and similarity geometry can shift, and concurrent traffic would see mixed results. Fork instead:

bash
curl -X POST "$SCAIGRID_HOST/v1/modules/scaimatrix/collections/$COLLECTION/fork" \
  -H "Authorization: Bearer $SCAIGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Acme Handbook (v2 embeddings)",
    "embedding_model": "openai/text-embedding-3-large",
    "copy_acls": true,
    "copy_metadata": true
  }'

You get a new collection with the same ACLs and config but no documents. Re-ingest at your own pace; the original keeps serving traffic until you cut over.
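The dimension problem is easy to see in miniature. The two OpenAI models named in this guide produce vectors of different default lengths (1536 for text-embedding-3-small, 3072 for text-embedding-3-large), so a query embedded with the new model cannot be scored against chunks embedded with the old one. The sketch below uses short toy vectors and a plain cosine similarity; it is not how the service computes scores.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; rejects mismatched dimensions."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


old_chunk = [0.1, 0.3, 0.5]            # toy stand-in for an old-model vector
new_query = [0.2, 0.1, 0.4, 0.7, 0.3]  # toy stand-in for a new-model vector
try:
    cosine_similarity(old_chunk, new_query)
except ValueError as exc:
    print(exc)
```

Even when two models happen to share a dimension, their vector spaces are unrelated, so scores across them are meaningless. That is why the fork starts empty and is re-ingested.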

Done

You have a tenant-shared knowledge base with graph extraction, fine-grained ACLs, and hybrid search. Iterate from here: most parameters are editable (the embedding model is the exception and requires a fork), and the ACL chokepoint keeps the data view consistent for every caller regardless of when permissions change.