Rate Limiting
ScaiSend rate-limits requests to protect the service from floods and to keep individual tenants from starving each other. This page describes the limits, the response format, and how to design a client that stays under them.
Where limits apply
Rate limits are enforced at the API layer on specific endpoints, counted per credential (API key or user JWT) or per tenant depending on the endpoint. The most commonly hit limits:
| Endpoint | Default limit |
|---|---|
| POST /v3/mail/send | 10,000 requests/second per tenant |
| POST /v3/user/webhooks | 10 requests/minute per tenant |
| POST /v3/api_keys | 10 requests/minute per tenant |
| GET /v3/messages | 100 requests/second per credential |
| Everything else | 1,000 requests/second per credential |
Limits are global defaults for a standard ScaiSend deployment; a self-hosted operator can tune them via configuration. Per-tenant overrides are possible but rare.
Response when rate-limited
A limited request gets 429 Too Many Requests, along with these headers:
| Header | Meaning |
|---|---|
| Retry-After | Seconds to wait before the next attempt |
| X-RateLimit-Limit | Requests allowed in the current window |
| X-RateLimit-Remaining | Requests remaining in the window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
Always honor Retry-After. Sleeping for at least that long before retrying puts your next attempt just inside the next window; retrying sooner is likely to earn another 429.
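As a minimal sketch of honoring these headers, the helper below works out how long to wait after a 429. It assumes Retry-After carries a number of seconds, as the table states. The function name, the fallback to X-RateLimit-Reset, and the one-second default are illustrative choices, not part of the ScaiSend API.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to sleep after a 429, based on the rate-limit headers.

    Note: a plain dict is case-sensitive; most HTTP libraries expose a
    case-insensitive header mapping, which is what you would pass here.
    """
    now = time.time() if now is None else now

    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # malformed value: fall through to the reset timestamp

    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        try:
            # Reset is a Unix timestamp, so the wait is the time remaining.
            return max(0.0, float(reset) - now)
        except ValueError:
            pass

    return 1.0  # assumed default when neither header is usable
```

Passing `now` explicitly makes the function easy to test without touching the real clock.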
A rate-limit-aware send loop
Three characteristics of a good loop:
- Honors Retry-After on 429. Don't sleep for your own arbitrary duration; use the header.
- Exponential backoff for 5xx. Bigger delay each attempt; add jitter to avoid synchronized retries from many clients.
- Caps retries. Four attempts total is usually enough; beyond that, something structural is wrong — fail loudly.
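The three characteristics above can be sketched as follows. This is an illustration under stated assumptions, not ScaiSend's reference client: `send` is a hypothetical callable that performs one request and returns the status code and headers, the 0.5-second base delay is arbitrary, and a missing Retry-After is assumed to mean one second.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def send_with_retries(
    send: Callable[[], Response],  # hypothetical: performs one HTTP request
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it succeeds, honoring Retry-After and backing off on 5xx."""
    for attempt in range(max_attempts):
        status, headers = send()
        if 200 <= status < 300:
            return status
        if status == 429:
            # Use the server's hint rather than an arbitrary duration.
            delay = float(headers.get("Retry-After", "1"))
        elif status >= 500:
            # Exponential backoff with full jitter to desynchronize clients.
            delay = random.uniform(0, base_delay * 2 ** attempt)
        else:
            # Other errors will not succeed on retry, so fail immediately.
            raise RuntimeError(f"request failed with status {status}")
        if attempt < max_attempts - 1:
            sleep(delay)
    # Retries are capped: fail loudly instead of looping forever.
    raise RuntimeError(f"giving up after {max_attempts} attempts (last status {status})")
```

Injecting `send` and `sleep` keeps the loop independent of any particular HTTP library and lets it be exercised in tests without network calls or real delays.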
Proactive pacing
If you know you need to send ~100k messages and your limit is 10,000 RPS, you have headroom: 100,000 requests is only ten seconds of traffic at the full limit. The problem arises when a burst exceeds your budget.
Token bucket. Rate-limit on your side before you even call the API.
Setting your client's rate to half the server's limit gives you headroom for occasional bursts and concurrent clients.
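A minimal token bucket might look like the sketch below. The class and parameter names are illustrative, and the implementation is single-threaded; a multi-threaded sender would need a lock around `acquire`.

```python
import time
from typing import Callable


class TokenBucket:
    """Client-side limiter: allows `rate` calls per second, with bursts up to `capacity`."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so an initial burst is allowed
        self._clock = clock
        self._sleep = sleep
        self._last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last refill, up to capacity.
        now = self._clock()
        self.tokens = min(self.capacity, self.tokens + (now - self._last) * self.rate)
        self._last = now

    def acquire(self) -> None:
        """Block until one token is available, then consume it."""
        self._refill()
        while self.tokens < 1:
            # Sleep just long enough for the missing fraction of a token to accrue.
            self._sleep((1 - self.tokens) / self.rate)
            self._refill()
        self.tokens -= 1
```

For example, `TokenBucket(rate=5000, capacity=5000)` would pace a sender at half of the 10,000 requests/second default; call `acquire()` before each request.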
Spreading sends over time
For marketing campaigns, use send_at to distribute sends across a window.
This way, you queue everything at once (fast), and ScaiSend's scheduler releases messages smoothly over the window. Your SMTP infrastructure also enjoys smoother load.
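One way to compute the schedule is sketched below. It assumes send_at accepts a Unix timestamp in seconds and that each message is a dict payload; both are assumptions to verify against the Sending Mail page. The function name and the payload shape are hypothetical.

```python
import time
from typing import Dict, List, Optional


def spread_send_at(
    messages: List[Dict],
    window_seconds: int,
    start: Optional[int] = None,
) -> List[Dict]:
    """Return copies of the payloads with send_at spaced evenly across the window."""
    start = int(time.time()) if start is None else start
    count = len(messages)
    scheduled = []
    for index, message in enumerate(messages):
        # Integer offset in [0, window_seconds); the first message goes out at `start`.
        offset = (index * window_seconds) // count
        # Assumption: send_at is a Unix timestamp in seconds.
        scheduled.append({**message, "send_at": start + offset})
    return scheduled
```

Each returned payload can then be submitted to /v3/mail/send right away; the originals are left unmodified.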
Per-endpoint considerations
/v3/mail/send
The big one. 10,000/s is the primary limit; realistically, a single client can't sustain that, so you won't hit it unless you're a platform with many concurrent senders.
/v3/messages (list)
Lower limit (100/s) because this hits the database hard on each call. If you need to sweep through messages, paginate with page_size=100 and space your pagination out.
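A paced sweep could be structured like this sketch. `fetch_page` is a hypothetical callable wrapping GET /v3/messages; how pages are addressed (page number, cursor, offset) is not specified here, so the page-number argument is an assumption, as is the 0.1-second pause.

```python
import time
from typing import Callable, Iterator, List


def sweep_messages(
    fetch_page: Callable[[int, int], List[dict]],  # hypothetical: (page, page_size) -> items
    page_size: int = 100,
    pause_seconds: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield every message, pausing between page requests."""
    page = 1
    while True:
        items = fetch_page(page, page_size)
        yield from items
        if len(items) < page_size:
            return  # a short page means there is nothing further to fetch
        page += 1
        sleep(pause_seconds)  # space requests out to stay well under the limit
```

Because it is a generator, the caller can stop early without fetching the remaining pages.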
Admin endpoints
Administrative operations (creating keys, creating domains, rotating DKIM) have low limits — these are human-initiated, not automation. Typical limits: 10/minute.
Monitoring
The X-RateLimit-Remaining header tells you how close you are to the limit. Log it periodically.
Alert if you see X-RateLimit-Remaining consistently low on a particular credential. That's your signal to either spread your send load or split across multiple API keys (one per sender application).
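The sketch below shows one way to turn "consistently low" into a concrete check: it logs each sample and reports when every one of the last few responses had less than a given share of the window remaining. The 10% threshold, the 20-sample window, and the class name are assumptions chosen for illustration.

```python
import logging
from collections import deque
from typing import Deque, Mapping

logger = logging.getLogger("ratelimit")


class HeadroomMonitor:
    """Tracks recent X-RateLimit-Remaining ratios and flags sustained low headroom."""

    def __init__(self, threshold: float = 0.1, window: int = 20) -> None:
        self.threshold = threshold
        self.samples: Deque[float] = deque(maxlen=window)

    def record(self, headers: Mapping[str, str]) -> bool:
        """Log one response's headroom; return True when every recent sample is low."""
        try:
            limit = int(headers["X-RateLimit-Limit"])
            remaining = int(headers["X-RateLimit-Remaining"])
        except (KeyError, ValueError):
            return False  # headers absent or malformed: nothing to record
        if limit <= 0:
            return False
        self.samples.append(remaining / limit)
        logger.info("rate limit headroom: %d/%d", remaining, limit)
        # Only alert once the window is full, so one low reading does not trigger it.
        full = len(self.samples) == self.samples.maxlen
        return full and max(self.samples) < self.threshold
```

Keeping one monitor per credential matches the advice above, since the signal is meaningful per API key.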
503 vs 429
429 is rate-limiting: "slow down." 503 is service-unavailable: "a dependency is down." Both are retryable, but they mean different things:
- 429: the request rate from your side is too high (your fault, or simply your volume).
- 503: something is unhealthy server-side; you'll get through once it recovers.
Retry both. Differentiate in monitoring: persistent 429 means you need to pace; persistent 503 means you should be paging someone.
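For the monitoring side, a rough sketch of that distinction is shown below. It classifies a batch of recent response codes into a signal; the 50% share used to decide what counts as "persistent" and the signal names are assumptions, not ScaiSend conventions.

```python
from collections import Counter
from typing import Iterable


def diagnose(statuses: Iterable[int], persistent_share: float = 0.5) -> str:
    """Classify a batch of recent response codes into a monitoring signal."""
    counts = Counter(statuses)
    total = sum(counts.values())
    if total == 0:
        return "ok"
    if counts[503] / total >= persistent_share:
        return "page"  # server-side trouble: escalate to a human
    if counts[429] / total >= persistent_share:
        return "pace"  # client is over budget: slow down or spread load
    return "ok"
```

Feeding it the status codes from a sliding window of recent requests keeps a single transient error from triggering either signal.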
Related
- Errors — general error handling.
- Your First Integration — retry loop recipe.
- Sending Mail — send_at for spreading load.