Why we build our own models
Most AI platforms are resellers - routing requests to someone else’s models with a management layer on top. ScaiLabs does that too, but we also build our own models.
Why? Because real independence requires owning the full stack. When you depend on OpenAI or Anthropic for inference, you depend on their pricing, their availability, their content policies, and their roadmap. Poolnoodle gives our customers a genuine alternative - models they can run on their own hardware, fine-tune on their own data, and operate without any external dependency.
Poolnoodle production models
Three core models, each optimised for different workloads and hardware constraints, plus a router and an embedding model that tie them together.
Poolnoodle Mini
Our smallest and fastest model. Runs on lightweight hardware including laptops and edge devices. Ideal for quick tasks - classification, summarisation, simple Q&A - and for high-throughput scenarios where speed matters more than depth.
Poolnoodle Turbo
An advanced Mixture of Experts (MoE) model that activates only the relevant expert networks per request. Delivers the reasoning quality of a much larger model at a fraction of the compute cost. The workhorse for most business applications.
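The core MoE mechanism - a gate scores all experts but only the top few run - can be illustrated with a short sketch. This is the general technique, not Turbo's actual gating code; the expert count and top-k value are placeholders:

```python
import math

def softmax(xs):
    """Turn raw gate logits into a probability over experts."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(gate_logits, top_k=2):
    """Pick the top-k experts and renormalise their gate weights.

    Only the selected experts execute, so per-request compute scales
    with top_k rather than with the total number of experts.
    """
    probs = softmax(gate_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    weight_sum = sum(probs[i] for i in chosen)
    return [(i, probs[i] / weight_sum) for i in chosen]

# Toy gating decision for one token over four experts.
print(route([3.0, 1.0, 0.5, 0.2]))
```

Because only two of the four experts fire here, the other expert networks' weights never enter the forward pass - which is how an MoE model keeps inference cost well below that of a dense model of the same parameter count.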
Poolnoodle Omni
A large omnimodal model that processes text, images, audio, and code in a single architecture. For complex tasks that require cross-modal understanding: document analysis with images, multimodal RAG, and advanced reasoning.
FixerNoodle
Intelligent router that analyses incoming requests and dispatches to the optimal Poolnoodle variant. Get the quality of Omni where you need it and the speed of Mini where you don't.
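A router of this kind can be sketched as a dispatch function over request features. The rules and model names below are illustrative placeholders, not FixerNoodle's actual policy (a production router would typically score requests with a small classifier):

```python
def pick_model(request: dict) -> str:
    """Route a request to a Poolnoodle variant (heuristics are illustrative)."""
    if request.get("attachments"):
        # Images or audio require cross-modal understanding.
        return "poolnoodle-omni"
    prompt = request.get("prompt", "")
    if len(prompt) < 200 and request.get("task") in {"classify", "summarise"}:
        # Short, simple work goes to the fastest model.
        return "poolnoodle-mini"
    # Default workhorse for everything else.
    return "poolnoodle-turbo"

print(pick_model({"task": "classify", "prompt": "Is this spam?"}))
# -> poolnoodle-mini
```

The pay-off is exactly the trade described above: cheap requests never touch the large model, and multimodal requests never fail on a model that can't handle them.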
EmbedNoodle
Purpose-built embedding model for ScaiMatrix. Multilingual, optimised for retrieval and semantic search across the platform.
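Retrieval over embeddings reduces to ranking documents by vector similarity. The sketch below uses toy three-dimensional vectors standing in for real embedding output; the corpus and scores are invented for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def search(query_vec, corpus, top_n=2):
    """Return the top_n document ids ranked by similarity to the query."""
    scored = sorted(corpus.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:top_n]]

# Toy 3-d "embeddings" standing in for a real embedding model's output.
corpus = {
    "invoice":  [0.9, 0.1, 0.0],
    "contract": [0.7, 0.3, 0.1],
    "recipe":   [0.0, 0.2, 0.9],
}
print(search([1.0, 0.0, 0.0], corpus))
# -> ['invoice', 'contract']
```

Production systems replace the linear scan with an approximate-nearest-neighbour index, but the ranking principle is the same.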
MOSAIC: 1M context on consumer hardware
Mixed-Order Sparsity with Attention-Informed Compression - our breakthrough technology that enables medium and large LLMs to run with 1M token context windows on consumer hardware like the NVIDIA DGX Spark.
MOSAIC dynamically compresses model weights based on attention patterns, allocating precision where it matters most. The result: near-lossless quality at a fraction of the memory footprint.
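MOSAIC's internals are proprietary, but the underlying idea - spend a fixed precision budget where importance scores say it matters - can be sketched as a greedy allocator. Everything here (the bit levels, the scores, the budget) is an illustrative assumption, not MOSAIC itself:

```python
def allocate_bits(importance, budget_bits, levels=(8, 4, 2)):
    """Assign a per-group bit width from importance scores.

    Every weight group starts at the lowest precision; the most
    important groups are then upgraded while the budget allows.
    Greedy sketch only - not the MOSAIC algorithm.
    """
    n = len(importance)
    low = min(levels)
    bits = {i: low for i in range(n)}
    spent = low * n
    for i in sorted(range(n), key=lambda i: importance[i], reverse=True):
        for level in sorted(levels, reverse=True):  # try 8 bits, then 4
            extra = level - bits[i]
            if extra > 0 and spent + extra <= budget_bits:
                spent += extra
                bits[i] = level
                break
    return [bits[i] for i in range(n)]

# Four weight groups; attention statistics say group 0 matters most.
print(allocate_bits([0.9, 0.5, 0.2, 0.1], budget_bits=18))
# -> [8, 4, 4, 2]
```

The point of the sketch is the shape of the trade-off: high-attention groups keep near-full precision while low-attention groups are compressed aggressively, which is what lets total memory fall without uniform quality loss.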
Working together
The Poolnoodle models aren't just a collection of independent components - they form a collaborative system. FixerNoodle analyses each request and routes it to the right model: Mini for fast, simple tasks; Turbo for complex reasoning at efficient cost; Omni when multimodal understanding is needed. EmbedNoodle powers the semantic search layer across everything.
The entire family works alongside external models too. ScaiGrid routes transparently between Poolnoodle, open-source models, and commercial APIs - giving you the flexibility to use the best model for each task.
Model flexibility
Open-source models
Run Llama, Mistral, Qwen, and other open models through ScaiGrid with the same routing, accounting, and management.
Commercial APIs
Route to OpenAI, Anthropic, Google, and other providers. Unified billing, logging, and fallback.
Custom fine-tunes
Fine-tune any model on your proprietary data via ScaiMind. Domain-specific AI that’s truly yours.
Interested in running Poolnoodle?
Available through ScaiGrid cloud, on-premises, or AI-in-a-Box.
Get in Touch →