Images
Generate images from text prompts. Supports multiple providers (DALL-E 3, Stable Diffusion, Imagen) through a single endpoint.
Endpoint: POST /v1/inference/images/generate
Basic request
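A minimal sketch of a request to this endpoint, using only the Python standard library. The endpoint path and the field names come from this page. The base URL, the bearer-token auth header, and the helper names `build_request` and `generate_image` are assumptions made for illustration, so adjust them to your deployment.

```python
import json
import urllib.request

# Assumptions (not stated on this page): the host name and the
# bearer-token auth scheme are placeholders for illustration only.
BASE_URL = "https://scaigrid.example.com"
ENDPOINT = "/v1/inference/images/generate"


def build_request(api_key: str, model: str, prompt: str, **options) -> urllib.request.Request:
    """Build (but do not send) a POST request for the image endpoint."""
    body = {"model": model, "prompt": prompt, **options}
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate_image(api_key: str, model: str, prompt: str, **options) -> dict:
    """Send the request and return the decoded JSON response."""
    req = build_request(api_key, model, prompt, **options)
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)
```

A call might look like `generate_image(key, "openai/dall-e-3", "a red fox in snow", size="1024x1024")`. Optional fields are passed as keyword arguments and forwarded unchanged in the JSON body.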
Parameters
| Field | Type | Notes |
|---|---|---|
| model | string (required) | Image model slug |
| prompt | string (required) | Text description |
| n | integer | Number of images to generate (typically 1–10) |
| size | string | "1024x1024", "1792x1024", "1024x1792", model-dependent |
| quality | string | "standard" or "hd" for DALL-E 3 |
| style | string | "vivid" or "natural" for DALL-E 3 |
| response_format | string | "url" (default, short-lived proxy URL) or "b64_json" (inline base64) |
| metadata | object | Passed through to the provider |
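The table can be turned into a small client-side check that catches obvious mistakes before a request is sent. The sketch below is an assumption about what is worth checking locally, not the server's actual validation logic. The helper name `validate_payload` is hypothetical, and the quality and style checks apply only to the `openai/dall-e-3` slug because those values are documented only for DALL-E 3.

```python
DALLE3_QUALITY = {"standard", "hd"}
DALLE3_STYLE = {"vivid", "natural"}
RESPONSE_FORMATS = {"url", "b64_json"}


def validate_payload(payload: dict) -> list[str]:
    """Return human-readable problems with a request body (empty list if none)."""
    problems = []

    # model and prompt are the only fields marked required in the table.
    for field in ("model", "prompt"):
        value = payload.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field} is required and must be a non-empty string")

    n = payload.get("n")
    # bool is a subclass of int in Python, so reject it explicitly.
    if n is not None and (isinstance(n, bool) or not isinstance(n, int) or n < 1):
        problems.append("n must be a positive integer")

    fmt = payload.get("response_format")
    if fmt is not None and fmt not in RESPONSE_FORMATS:
        problems.append(f"response_format must be one of {sorted(RESPONSE_FORMATS)}")

    # quality and style values are only documented for DALL-E 3.
    if payload.get("model") == "openai/dall-e-3":
        quality = payload.get("quality")
        if quality is not None and quality not in DALLE3_QUALITY:
            problems.append(f"quality must be one of {sorted(DALLE3_QUALITY)}")
        style = payload.get("style")
        if style is not None and style not in DALLE3_STYLE:
            problems.append(f"style must be one of {sorted(DALLE3_STYLE)}")

    return problems
```

The upper bound on `n` is deliberately not enforced, since the table only describes 1–10 as typical.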
Response shape
When response_format is "url", the URL points at ScaiGrid's media proxy: the image is stored in our object storage and served through a short-lived token. URLs expire after 1 hour by default, so download the image and store it yourself if you need longer retention.
When response_format is "b64_json", the response contains inline base64 instead. Useful when your client can't make a second HTTP request to fetch the image.
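With "b64_json", the client decodes the image locally rather than fetching it. A minimal sketch of that decoding step follows. Where the base64 string sits in the response body is not shown here, so the function takes the string directly, and the name `decode_inline_image` is hypothetical.

```python
import base64
import binascii


def decode_inline_image(b64_data: str) -> bytes:
    """Decode an inline base64 image into raw bytes."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently skipping them.
        return base64.b64decode(b64_data, validate=True)
    except binascii.Error as exc:
        raise ValueError("response did not contain valid base64 image data") from exc
```

The returned bytes can be written to disk or handed to an image library without a second HTTP request.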
Downloading and saving
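Because proxy URLs are short-lived, it is usually safest to download the image as soon as the response arrives. The sketch below streams a URL to a local file with the standard library. The helper name `download_image` is hypothetical, and it assumes the proxy URL can be fetched with a plain GET and no extra headers.

```python
import shutil
import urllib.request
from pathlib import Path


def download_image(url: str, dest: str, timeout: float = 30.0) -> Path:
    """Stream the image at url into dest and return the destination path."""
    dest_path = Path(dest)
    # Create the target directory if it does not exist yet.
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(dest_path, "wb") as out:
        # Copy in chunks so large images are never held fully in memory.
        shutil.copyfileobj(resp, out)
    return dest_path
```

For "b64_json" responses there is nothing to download: decode the base64 string to bytes and write it with `Path(dest).write_bytes(...)`.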
Available image models
Common options:
| Model | Size options | Notes |
|---|---|---|
| openai/dall-e-3 | 1024x1024, 1792x1024, 1024x1792 | High quality, style/quality parameters |
| openai/dall-e-2 | 256x256, 512x512, 1024x1024 | Older, cheaper |
| stability/sd-xl | 1024x1024 | Self-hosted via ScaiInfer |
Availability depends on which providers your tenant has configured.
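Since valid sizes differ per model, a client can keep a local copy of the table and check a size before sending a request. This is a sketch built only from the rows listed above. The names `SIZES_BY_MODEL` and `check_size` are hypothetical, and a tenant may expose models or sizes that are not in this snapshot.

```python
from typing import Optional

# Copied from the "Common options" table; not an exhaustive or live list.
SIZES_BY_MODEL = {
    "openai/dall-e-3": ("1024x1024", "1792x1024", "1024x1792"),
    "openai/dall-e-2": ("256x256", "512x512", "1024x1024"),
    "stability/sd-xl": ("1024x1024",),
}


def check_size(model: str, size: str) -> Optional[bool]:
    """Return True/False for known models, or None if the model is not in the table."""
    sizes = SIZES_BY_MODEL.get(model)
    if sizes is None:
        return None  # unknown model: let the server decide
    return size in sizes
```

Returning `None` for unknown models keeps the check advisory, so a newly configured provider is not blocked by a stale local table.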
Revised prompts
DALL-E 3 rewrites prompts for better results. The rewritten version comes back in the revised_prompt field of the response.
This is useful for debugging ("why did I get a cartoon when I asked for a photo?"). Rewriting cannot be switched off with a parameter, but you can largely suppress it by prefixing your prompt with "I NEED to test how the tool works with extremely simple prompts. DO NOT add any detail, just use it AS-IS:", which is OpenAI's documented escape hatch.
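The two ideas in this section can be sketched as a pair of helpers: one that applies the prefix, and one that reports whether a returned prompt differs from the one you sent. The function names are hypothetical. The second helper assumes you pass it the object that carries revised_prompt, since the exact nesting of the response is not shown here.

```python
AS_IS_PREFIX = (
    "I NEED to test how the tool works with extremely simple prompts. "
    "DO NOT add any detail, just use it AS-IS: "
)


def as_is(prompt: str) -> str:
    """Prefix the prompt with the escape-hatch text, without doubling it up."""
    if prompt.startswith(AS_IS_PREFIX):
        return prompt
    return AS_IS_PREFIX + prompt


def prompt_was_rewritten(original: str, item: dict) -> bool:
    """Return True if item carries a revised_prompt that differs from original."""
    revised = item.get("revised_prompt")
    return revised is not None and revised.strip() != original.strip()
```

Logging the original prompt next to revised_prompt whenever `prompt_was_rewritten` returns true makes style surprises much easier to trace.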
Error cases
| Code | Meaning |
|---|---|
| VALIDATION_ERROR | Missing prompt or invalid size |
| MODEL_ACCESS_DENIED | Image model not enabled for your tenant |
| BACKEND_ERROR | Safety filter triggered, or upstream rejected the prompt |
| BUDGET_EXCEEDED | Budget cap reached; image generation is expensive and often has its own cap |
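None of these errors is likely to clear up if the same request is simply resent, so a client may prefer to map each code to a follow-up action. The mapping below is a sketch of one reasonable policy, not documented behavior. The action names and the `next_step` helper are hypothetical, and how the code is delivered (HTTP status, JSON field) is not shown here, so the function takes the code string directly.

```python
# Suggested follow-up per error code; the actions are assumptions,
# only the codes themselves come from the table above.
NEXT_STEP = {
    "VALIDATION_ERROR": "fix_request",       # e.g. add the prompt, pick a valid size
    "MODEL_ACCESS_DENIED": "contact_admin",  # model must be enabled for the tenant
    "BACKEND_ERROR": "revise_prompt",        # safety filter or upstream rejection
    "BUDGET_EXCEEDED": "stop",               # wait for the budget to reset or be raised
}


def next_step(code: str) -> str:
    """Return a suggested action for an error code, or 'unknown' if unlisted."""
    return NEXT_STEP.get(code, "unknown")
```

Treating unlisted codes as "unknown" rather than retrying blindly avoids repeated charges on an expensive endpoint.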
What's next
- Audio: speech synthesis and transcription.
- Chat Completions: send generated images back to a vision model.
- OpenAI Compatibility: /oai/v1/images/generations works identically.