Build a custom MCP client

This tutorial builds an MCP client from scratch that uses ScaiGrid as its tool source. The client connects, lists tools, lets an LLM pick one, calls it, and feeds the result back. By the end you'll have a working agent loop where tool discovery and invocation happen entirely at runtime, with no hardcoded endpoint maps.

You need:

Python 3.10+ or Node 18+.
The MCP SDK: pip install mcp or npm install @modelcontextprotocol/sdk.
A ScaiGrid API key (sgk_...) and the host URL.
An LLM you can call separately — for this tutorial we'll use ScaiGrid's own inference_chat tool as the agent's brain, but you can swap in any client.

bash
export SCAIGRID_HOST="https://scaigrid.scailabs.ai"
export SCAIGRID_API_KEY="sgk_..."
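
Before connecting, it can help to fail fast on missing configuration. The following is a minimal sketch that assumes only the two environment variables above; `load_config` is a hypothetical helper name, not part of ScaiGrid or the MCP SDK, and the `sgk_` check relies solely on the key format shown in the prerequisites.

```python
import os


def load_config(env=None):
    """Return (mcp_url, headers) built from SCAIGRID_HOST and SCAIGRID_API_KEY."""
    env = os.environ if env is None else env
    missing = [k for k in ("SCAIGRID_HOST", "SCAIGRID_API_KEY") if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    key = env["SCAIGRID_API_KEY"]
    # Assumption: every ScaiGrid key starts with "sgk_", as in the example above.
    if not key.startswith("sgk_"):
        raise RuntimeError("SCAIGRID_API_KEY should start with 'sgk_'")
    # Strip a trailing slash so the URL never contains "//mcp".
    url = env["SCAIGRID_HOST"].rstrip("/") + "/mcp"
    return url, {"Authorization": f"Bearer {key}"}
```

Passing a plain dict as `env` makes the helper easy to exercise without touching the real environment.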

1. Connect and list

The streamable-HTTP transport gives you one long-lived, bidirectional session over HTTP. List tools right after initialize; the result is filtered to what your token can call.

python
import asyncio, os, json
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

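def connect():
    # streamablehttp_client returns an async context manager, so a plain
    # function is enough here; nothing needs to be awaited yet.
    url = f"{os.environ['SCAIGRID_HOST']}/mcp"
    headers = {"Authorization": f"Bearer {os.environ['SCAIGRID_API_KEY']}"}
    return streamablehttp_client(url, headers=headers)

async def main():
    # The transport yields (read, write, get_session_id); the third is unused.
    async with connect() as (read, write, _):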
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = (await session.list_tools()).tools
            print(f"Got {len(tools)} tools")
            for t in tools[:5]:
                print(f" - {t.name}: {t.description}")

asyncio.run(main())

javascript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from
  "@modelcontextprotocol/sdk/client/streamableHttp.js";

const transport = new StreamableHTTPClientTransport(
  new URL(`${process.env.SCAIGRID_HOST}/mcp`),
  { requestInit: { headers: {
      Authorization: `Bearer ${process.env.SCAIGRID_API_KEY}` } } }
);
const client = new Client(
  { name: "demo", version: "0.1" }, { capabilities: {} });
await client.connect(transport);

const { tools } = await client.listTools();
console.log(`Got ${tools.length} tools`);
for (const t of tools.slice(0, 5)) console.log(` - ${t.name}: ${t.description}`);

The list mixes three sources transparently: core tools, module-contributed tools (anything from modules enabled for your tenant), and remote.* tools from cloud MCP servers your user or tenant has registered through ScaiLink. You don't have to distinguish them in code.

2. Pick the chat tool, run a completion

python
# Runs inside main(), after session.initialize()
result = await session.call_tool("inference_chat", {
    "model": "scailabs/poolnoodle-omni",
    "messages": [{"role": "user", "content": "What's 12 squared?"}],
    "max_tokens": 60,
})
data = json.loads(result.content[0].text)
print(data["content"])

javascript
const result = await client.callTool({
  name: "inference_chat",
  arguments: {
    model: "scailabs/poolnoodle-omni",
    messages: [{ role: "user", content: "What's 12 squared?" }],
    max_tokens: 60,
  },
});
const data = JSON.parse(result.content[0].text);
console.log(data.content);

Every tool result is wrapped in a single text content block holding a JSON-encoded payload. Decode it with json.loads / JSON.parse.
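
Since every call ends with the same decode step, it may be worth wrapping it once. The sketch below assumes the single-text-block convention described above and the `isError` flag that MCP tool results carry; `decode_tool_result` is a hypothetical helper name, and the `SimpleNamespace` object is only a stand-in so the example runs without a server.

```python
import json
from types import SimpleNamespace


def decode_tool_result(result):
    """Decode the JSON payload from the first text block of a tool result."""
    if not result.content:
        raise ValueError("tool result has no content blocks")
    text = getattr(result.content[0], "text", None)
    if text is None:
        raise ValueError("first content block is not a text block")
    # Check the error flag first: an error body is not guaranteed to be JSON.
    if getattr(result, "isError", False):
        raise RuntimeError(f"tool call failed: {text}")
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"tool result is not valid JSON: {exc}") from exc


# Stand-in for a real CallToolResult, for a quick local check.
fake = SimpleNamespace(
    content=[SimpleNamespace(text='{"content": "144"}')], isError=False
)
print(decode_tool_result(fake)["content"])  # prints 144
```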

3. Filter the catalog for an agent loop

A real agent doesn't expose all 80+ tools to its LLM — it picks a relevant subset based on the task. Filter the list yourself:

python
all_tools = (await session.list_tools()).tools

# Keep tools relevant to a "knowledge research" agent
allowed = {"inference_chat", "inference_embed", "models_list"}
allowed |= {t.name for t in all_tools if t.name.startswith("matrix_")}
allowed |= {t.name for t in all_tools if t.name.startswith("remote.")}

scoped = [t for t in all_tools if t.name in allowed]

Pass scoped to your LLM as the available tools. The LLM sees a sensible task-shaped surface and won't hallucinate a call to tenants_create halfway through your research workflow.
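
If you maintain several agents, the same filter can be expressed as a small reusable function. This is a sketch only: `scope_tools` is a hypothetical helper, `matrix_search` is an invented tool name used for illustration, and the stand-in objects carry just the `name` attribute the filter reads.

```python
from types import SimpleNamespace


def scope_tools(tools, exact=(), prefixes=()):
    """Keep tools whose name is in `exact` or starts with one of `prefixes`."""
    exact = set(exact)
    prefixes = tuple(prefixes)  # str.startswith accepts a tuple of prefixes
    return [
        t for t in tools
        if t.name in exact or t.name.startswith(prefixes)
    ]


# Stand-in catalog; a real one comes from session.list_tools().
catalog = [SimpleNamespace(name=n) for n in (
    "inference_chat",
    "matrix_search",  # hypothetical module tool
    "tenants_create",
    "remote.tenant.slack-acme.post_message",
)]
scoped = scope_tools(
    catalog, exact={"inference_chat"}, prefixes=("matrix_", "remote.")
)
print([t.name for t in scoped])
# ['inference_chat', 'matrix_search', 'remote.tenant.slack-acme.post_message']
```

Keeping the allow-list as data (exact names plus prefixes) makes it straightforward to define one scope per agent role.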

4. End-to-end agent loop

The classic shape: feed user goal → LLM picks tool → call MCP → feed result back → repeat.

python
async def agent_loop(session, user_goal: str):
    all_tools = (await session.list_tools()).tools
    tool_specs = [
        {"type": "function", "function": {
            "name": t.name,
            "description": t.description,
            "parameters": t.inputSchema,
        }}
        for t in all_tools
        if t.name in {"inference_chat", "models_list", "accounting_usage_summary"}
    ]

    messages = [{"role": "user", "content": user_goal}]

    for _ in range(5):  # cap iterations
        # The "brain" call — uses inference_chat itself
        brain = await session.call_tool("inference_chat", {
            "model": "scailabs/poolnoodle-omni",
            "messages": messages,
            "tools": tool_specs,
        })
        data = json.loads(brain.content[0].text)

        if not data.get("tool_calls"):
            return data["content"]  # natural-language answer, done

        # Record the assistant turn once, with every tool call it requested
        messages.append({
            "role": "assistant",
            "content": data.get("content"),
            "tool_calls": data["tool_calls"],
        })
        for call in data["tool_calls"]:
            name = call["function"]["name"]
            args = json.loads(call["function"]["arguments"])
            tool_result = await session.call_tool(name, args)
            # Each result is tied back to its request by tool_call_id
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": tool_result.content[0].text,
            })

    return "loop cap reached"

The TypeScript shape is the same: listTools() → filter to a subset → callTool({ name: "inference_chat", arguments: { ..., tools: toolSpecs } }) → parse the result → if tool_calls returned, call each and feed the result back as a role: "tool" message → repeat with an iteration cap.

Run it with a goal like "Tell me my token spend for today and recommend a cheaper model if I'm above 80% of my budget."
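
A model can still return a tool name outside the subset it was offered, or arguments that are not valid JSON. One way to handle this is a guarded dispatch step that always produces a role: "tool" message, so the model gets a chance to correct itself. The sketch below is an assumption-laden illustration: `dispatch_tool_call` and `FakeSession` are hypothetical names, and the error payload shape is made up for the example.

```python
import asyncio
import json
from types import SimpleNamespace


async def dispatch_tool_call(session, call, allowed):
    """Run one LLM-requested tool call and return its role: "tool" message."""
    name = call["function"]["name"]
    if name not in allowed:
        # Never forward a call the agent was not scoped to make.
        content = json.dumps({"error": f"tool {name!r} is not available"})
    else:
        try:
            args = json.loads(call["function"]["arguments"] or "{}")
        except json.JSONDecodeError:
            content = json.dumps({"error": "arguments were not valid JSON"})
        else:
            result = await session.call_tool(name, args)
            content = result.content[0].text
    return {"role": "tool", "tool_call_id": call["id"], "content": content}


class FakeSession:
    """Stand-in for ClientSession that echoes arguments back."""

    def __init__(self):
        self.calls = []

    async def call_tool(self, name, args):
        self.calls.append(name)
        text = json.dumps({"echo": args})
        return SimpleNamespace(content=[SimpleNamespace(text=text)])


call = {"id": "c1", "function": {"name": "tenants_create", "arguments": "{}"}}
message = asyncio.run(dispatch_tool_call(FakeSession(), call, {"models_list"}))
print(message["content"])
```

In the loop, the `allowed` set would be the names in `tool_specs`, so the filter and the dispatcher cannot drift apart.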

5. Handling remote tools

If your user (or tenant) has registered cloud MCP servers through ScaiLink, those tools show up in the same list_tools response with names like remote.tenant.slack-acme.post_message. The agent loop above works without modification — call_tool routes the remote.* prefix through ScaiLink's outbound client transparently.

Two things to know:

The tool's description, input schema, and behaviour come from the upstream server, not ScaiGrid. Inspect them at runtime; they may change.
Errors from upstream are returned with code set to the upstream's error class name plus a message. Treat them as opaque errors at the agent layer.
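
Because upstream definitions may change, an agent that caches tool specs might want to notice when they do. A minimal sketch, assuming only the `name`, `description`, and `inputSchema` fields used earlier: hash each remote tool's definition and compare it with the hash from the previous listing. `fingerprint` and `changed_remote_tools` are hypothetical helper names.

```python
import hashlib
import json
from types import SimpleNamespace


def fingerprint(tool):
    """Stable hash of the parts of a tool definition the agent depends on."""
    blob = json.dumps(
        {"description": tool.description, "inputSchema": tool.inputSchema},
        sort_keys=True,
    )
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def changed_remote_tools(previous, tools):
    """Names of remote.* tools that are new or differ from `previous`.

    `previous` maps tool name -> fingerprint from an earlier listing.
    """
    return sorted(
        t.name for t in tools
        if t.name.startswith("remote.") and previous.get(t.name) != fingerprint(t)
    )


# Stand-in definitions; real ones come from session.list_tools().
name = "remote.tenant.slack-acme.post_message"
v1 = SimpleNamespace(name=name, description="Post a message",
                     inputSchema={"type": "object"})
v2 = SimpleNamespace(name=name, description="Post a message",
                     inputSchema={"type": "object", "required": ["channel"]})

seen = {v1.name: fingerprint(v1)}
print(changed_remote_tools(seen, [v2]))
```

When a name shows up in the result, rebuild the tool specs you pass to the LLM rather than reusing the cached ones.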

6. Hygiene

Reconnect on transport drops.
Cap loop iterations as a circuit breaker.
Re-list tools periodically, because modules can be enabled or disabled mid-session.
Let ScaiMCP enforce permissions instead of caching decisions client-side, and treat PERMISSION_DENIED as a normal failure mode.
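
Reconnecting can be sketched as a retry wrapper with exponential backoff around a function that opens a fresh connection and session on each call. This is an illustration under assumptions: `run_with_reconnect` is a hypothetical helper, and the exception types in `retry_on` are a guess that you should adjust to whatever your transport actually raises.

```python
import asyncio


async def run_with_reconnect(run_once, max_attempts=5, base_delay=1.0,
                             retry_on=(ConnectionError,)):
    """Call `run_once()` until it succeeds, backing off between failures.

    `run_once` is an async callable that opens a new connection and session
    every time it is called, so each retry starts from a clean state.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await run_once()
        except retry_on as exc:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            delay = base_delay * 2 ** (attempt - 1)  # 1x, 2x, 4x, ...
            print(f"transport dropped ({exc!r}); retrying in {delay:.1f}s")
            await asyncio.sleep(delay)


# Stand-in for "connect, initialize, run the agent": fails twice, then works.
attempts = {"n": 0}

async def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("dropped")
    return "ok"

print(asyncio.run(run_with_reconnect(flaky, base_delay=0)))  # prints ok
```

Re-listing tools inside `run_once` after each reconnect also covers the case where modules were enabled or disabled while the client was offline.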

Done

You have a working MCP agent that consumes ScaiGrid through a unified, permission-filtered catalog. The same pattern works for any LLM brain — MCP is the transport, the brain is yours.