Webhooks Deep Dive

Webhook delivery internals — retry policy, signature verification edge cases, auto-disable behavior, and scaling guidance. For the basic setup and signature-verification recipe, see Webhooks (guide).

Delivery flow

When ScaiSend needs to emit an event:

  1. Event is created. The SMTP service (delivered/bounce/deferred), the Worker (processed/dropped), or the tracking API (open/click/unsubscribe) writes a webhook_deliveries row for each subscribed endpoint.
  2. Delivery queue picks it up. An arq worker consumes the delivery queue and makes the HTTP request.
  3. Response is evaluated. A 2xx response within 30 seconds is a success. Anything else, including a redirect or a timeout, is a failure.
  4. Retries are scheduled. Failed deliveries schedule a retry with increasing backoff. ScaiSend persists the delivery row, so retries survive service restarts.

Retry schedule

| Attempt | Delay after previous |
|---------|----------------------|
| 2 | 60 seconds |
| 3 | 300 seconds (5 min) |
| 4 | 900 seconds (15 min) |
| 5 | 3600 seconds (1 hour) |
| 6 | 7200 seconds (2 hours) |

Total elapsed time from first send to final retry: about 3 hours 20 minutes (the delays above sum to 12,060 seconds). After attempt 6, delivery is marked FAILED and not retried further.
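To see where each retry falls relative to the first send, the delays in the table can be summed. This is a small sketch; `RETRY_DELAYS` simply restates the table and is not a ScaiSend constant.

```python
from itertools import accumulate

# Delay in seconds before attempts 2..6, copied from the retry table above.
RETRY_DELAYS = [60, 300, 900, 3600, 7200]

# Seconds elapsed since the first send when each retry is scheduled to fire.
offsets = list(accumulate(RETRY_DELAYS))

for attempt, offset in enumerate(offsets, start=2):
    hours, rem = divmod(offset, 3600)
    print(f"attempt {attempt}: +{hours}h{rem // 60:02d}m")
```

The last offset is 12,060 seconds. Time spent waiting on slow responses (up to 30 seconds per attempt) comes on top of that.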

Individual deliveries don't disable the endpoint; the endpoint's failure_count tracks consecutive failures across deliveries. Once it hits 10, the endpoint is auto-disabled:

```json
{
  "id": "wh_01HXYZ",
  "enabled": false,
  "disabled_at": "2026-04-23T15:00:00Z",
  "failure_count": 10
}
```

Re-enable explicitly after fixing the underlying problem:

```bash
curl -X PATCH https://scaisend.scailabs.ai/v3/user/webhooks/wh_01HXYZ \
  -H "Authorization: Bearer $SCAISEND_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": true}'
```
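The counter behaviour described above can be modelled in a few lines. This is a sketch of the documented rules only, not ScaiSend's implementation: the `Endpoint` class and `record_delivery` function are hypothetical names, and it assumes the counter is bumped once per delivery that ends in failure and reset by any success.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

DISABLE_THRESHOLD = 10  # consecutive failures before auto-disable (from the text above)

@dataclass
class Endpoint:
    enabled: bool = True
    failure_count: int = 0
    disabled_at: Optional[datetime] = None

def record_delivery(ep: Endpoint, succeeded: bool) -> None:
    """Update the endpoint after one delivery outcome (hypothetical model)."""
    if succeeded:
        ep.failure_count = 0  # a success resets the consecutive-failure streak
        return
    ep.failure_count += 1
    if ep.failure_count >= DISABLE_THRESHOLD and ep.enabled:
        ep.enabled = False
        ep.disabled_at = datetime.now(timezone.utc)

ep = Endpoint()
for _ in range(9):
    record_delivery(ep, succeeded=False)
record_delivery(ep, succeeded=True)   # streak broken, counter back to 0
for _ in range(10):
    record_delivery(ep, succeeded=False)
print(ep.enabled, ep.failure_count)   # False 10
```

Nine failures followed by one success do not disable the endpoint; only ten failures in a row do.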

At-least-once semantics

ScaiSend guarantees at-least-once delivery. In practice that means:

  • A single event might be delivered multiple times. If your handler returns 2xx but the response is lost in transit (connection reset after the TCP ack but before the HTTP response), ScaiSend retries. You'll see the same event_id again.
  • Your handler must be idempotent. Use event_id as the dedupe key. A Redis SET with the NX option and a 7-day TTL works for most volumes.
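
```python
import os

import redis.asyncio as redis  # async client, so the handler never blocks the event loop

r = redis.Redis.from_url(os.environ["REDIS_URL"])

async def handle_event(event: dict):
    # SET ... NX EX: succeeds only for the first writer; the key expires after 7 days
    first_seen = await r.set(f"scaisend:seen:{event['event_id']}", "1", nx=True, ex=86400 * 7)
    if not first_seen:
        return  # duplicate, skip
    # ... process event
```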

Ordering

No guarantees. Events for the same message can arrive out of order. Specifically:

  • A delivered event can arrive before a processed event if the processed delivery is still retrying while the delivered delivery succeeds on its first try.
  • open events can arrive days after delivered. That's normal: the recipient received the mail days ago and opened it just now.
  • Any event that needed a retry arrives at least 60 seconds late, so it can land behind events that were emitted after it.

Don't rely on arrival order. Use the timestamp field on the payload to reconstruct the actual sequence if you care. For most use cases, you don't — you care about the latest status, which you can query from /v3/messages/{id}.
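If you do need the real sequence, sorting on the payload's timestamp is enough. This is a minimal sketch: the `timestamp` field is the one mentioned above, while the `event` key holding the event type and the integer Unix-seconds format are assumptions made for illustration.

```python
from typing import Dict, List

def in_emission_order(events: List[Dict]) -> List[Dict]:
    """Return events sorted by their payload timestamp, oldest first."""
    return sorted(events, key=lambda e: e["timestamp"])

# Events listed in the order they happened to arrive at the handler.
arrived = [
    {"event": "delivered", "timestamp": 1776956405},
    {"event": "processed", "timestamp": 1776956400},  # retried, so it arrived late
    {"event": "open", "timestamp": 1777129200},
]

ordered = [e["event"] for e in in_emission_order(arrived)]
print(ordered)  # ['processed', 'delivered', 'open']
```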

Signature verification edge cases

Clock skew

The X-ScaiSend-Timestamp header is the Unix timestamp when ScaiSend computed the signature. Compare against your server's wall clock, allowing ~5 minutes of skew:

```python
# Needs `import time` at module level; rejects timestamps more than 300 s from our clock.
if abs(time.time() - int(timestamp)) > 300:
    return unauthorized()
```

NTP-synchronized servers typically skew under a second. Five minutes is generous; use a tighter window if your servers' clocks are reliably synchronized.
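A missing or non-numeric header would make `int(timestamp)` raise instead of producing a clean rejection. A slightly more defensive variant of the check might look like the sketch below. `timestamp_is_fresh` and `MAX_SKEW_SECONDS` are hypothetical names, and the `now` parameter exists only to make the function testable.

```python
import time
from typing import Optional

MAX_SKEW_SECONDS = 300  # the 5-minute window suggested above

def timestamp_is_fresh(header_value: Optional[str], now: Optional[float] = None) -> bool:
    """True if the header parses as an integer within MAX_SKEW_SECONDS of now."""
    if header_value is None:
        return False
    try:
        sent_at = int(header_value)
    except ValueError:
        return False  # malformed header: reject rather than crash
    current = time.time() if now is None else now
    return abs(current - sent_at) <= MAX_SKEW_SECONDS

print(timestamp_is_fresh("1000", now=1200.0))  # True: 200 s old
print(timestamp_is_fresh("1000", now=1301.0))  # False: 301 s old
print(timestamp_is_fresh("not-a-number"))      # False
```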

Replay attacks

The timestamp rejection is what prevents replay. A captured signed request is a valid HMAC forever — but it expires from your acceptance window in 5 minutes.

Rotation gap

When you rotate a signing secret, the old secret stops validating immediately. Events in-flight (already queued with the old signature) will fail verification on your side.

Solution: during a rotation, accept either signature for a short grace period:

```python
from typing import Iterable

# verify() is assumed to be the single-secret check from the Webhooks guide.
def verify_any(body: bytes, ts: str, sig: str, secrets: Iterable[str]) -> bool:
    for secret in secrets:
        if verify(body, ts, sig, secret):
            return True
    return False

# During rotation
valid = verify_any(body, ts, sig, [NEW_SECRET, OLD_SECRET])
```

After 10 minutes (longer than the longest expected in-flight retry window for recent events), stop accepting the old secret.
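To avoid forgetting to drop the old secret, the cutoff can be encoded rather than handled by hand. This is a sketch that assumes you record the rotation time yourself; `accepted_secrets`, `rotated_at`, and `GRACE_SECONDS` are hypothetical names, not part of any ScaiSend SDK.

```python
import time
from typing import List, Optional

GRACE_SECONDS = 600  # the 10-minute grace period suggested above

def accepted_secrets(new_secret: str, old_secret: str, rotated_at: float,
                     now: Optional[float] = None) -> List[str]:
    """Secrets to try, newest first; the old one drops out after the grace period."""
    current = time.time() if now is None else now
    if current - rotated_at <= GRACE_SECONDS:
        return [new_secret, old_secret]
    return [new_secret]

print(accepted_secrets("new", "old", rotated_at=1000.0, now=1300.0))  # ['new', 'old']
print(accepted_secrets("new", "old", rotated_at=1000.0, now=1601.0))  # ['new']
```

In the handler, the result would be passed straight to `verify_any`, for example `verify_any(body, ts, sig, accepted_secrets(NEW_SECRET, OLD_SECRET, ROTATED_AT))`, where `ROTATED_AT` is a Unix timestamp you store when you rotate.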

Signature on a body your framework mutates

Some web frameworks parse and re-serialize JSON before your handler runs. If the re-serialization differs (key order, whitespace), the HMAC won't verify.

Fix: verify against the raw request body bytes, not the parsed-and-serialized JSON. In Express, use express.raw(); in FastAPI, read with await request.body() before parsing.
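The effect is easy to reproduce with the standard library. In the sketch below, `sign` is a stand-in that computes an HMAC-SHA256 over the body alone. That is an assumption made for illustration; the actual signing scheme is the one described in the Webhooks guide and may differ.

```python
import hashlib
import hmac
import json

secret = b"example-secret"  # placeholder value, not a real signing secret

def sign(payload: bytes) -> str:
    """Hex HMAC-SHA256 of payload; a stand-in for the real signing scheme."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()

# Bytes exactly as they arrived on the wire (compact separators, this key order).
raw_body = b'{"event_id":"evt_1","event":"delivered"}'
signature = sign(raw_body)

# What a framework might hand back after parsing and re-serializing.
reserialized = json.dumps(json.loads(raw_body), sort_keys=True).encode()

print(hmac.compare_digest(sign(raw_body), signature))      # True: raw bytes match
print(hmac.compare_digest(sign(reserialized), signature))  # False: mutated body
```

The parsed data is identical in both cases; only the byte representation changed, and the HMAC is computed over bytes.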

OAuth2 authentication

If your webhook endpoint sits behind an OAuth2 flow, register credentials with the endpoint:

```bash
curl -X POST https://scaisend.scailabs.ai/v3/user/webhooks \
  -H "Authorization: Bearer $SCAISEND_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://api.example.com/webhooks/scaisend",
    "enabled_events": ["*"],
    "oauth_client_id": "client_abc",
    "oauth_client_secret": "secret_xyz"
  }'
```

ScaiSend does a client-credentials grant to your token endpoint before each delivery (caching the token until close to expiry), then passes it as Authorization: Bearer <token> on the delivery request. This is optional; most setups use the signature-only model.

Scale

Typical webhook volume:

  • Per sent message: 2–4 events (processed, delivered, maybe open, maybe click). Sometimes more (deferred → delivered, or multiple clicks).
  • Burst: a marketing send to 100k recipients produces ~400k events clustered within a few minutes.

Your endpoint should handle 10× sustained peak without degrading. If your normal load is 100 req/sec, aim for a comfortable 1000 req/sec ceiling before response times start climbing.
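To turn the burst figure into a request rate, divide the event count by the window it lands in. This is a back-of-the-envelope sketch: the 5-minute window is an assumed reading of "a few minutes", one event per request is also an assumption, and `events_per_second` is a hypothetical helper.

```python
def events_per_second(recipients: int, events_per_message: float,
                      window_seconds: float) -> float:
    """Average webhook request rate if every event lands inside the window."""
    return recipients * events_per_message / window_seconds

# 100k recipients, 4 events each, clustered within 5 minutes (assumed window).
burst_rate = events_per_second(100_000, 4, 5 * 60)
print(round(burst_rate))  # 1333
```

This is an average over the window; the instantaneous peak inside it can be higher.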

Scaling patterns

  • Respond fast, process async. Accept the webhook, enqueue to your own worker queue, return 200. Don't do DB writes on the synchronous path.
  • Batch in your consumer. If you're writing to an analytics system, batch inserts (50–100 events per insert) rather than one row per event.
  • Consider a dedicated webhook fleet. Scale horizontally; don't share the webhook endpoint with your main API.
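The first two patterns can be sketched with nothing but `asyncio`. The names here (`accept`, `consume`, `BATCH_SIZE`) are hypothetical, and `batches.append` stands in for a bulk insert into your own store.

```python
import asyncio
from typing import Dict, List

BATCH_SIZE = 50  # lower end of the 50-100 range suggested above

async def accept(queue: asyncio.Queue, event: Dict) -> int:
    """Synchronous path of the handler: enqueue and return 200 immediately."""
    queue.put_nowait(event)
    return 200

async def consume(queue: asyncio.Queue, batches: List[List[Dict]]) -> None:
    """Drain the queue in batches of at most BATCH_SIZE events."""
    while not queue.empty():
        batch: List[Dict] = []
        while len(batch) < BATCH_SIZE and not queue.empty():
            batch.append(queue.get_nowait())
        batches.append(batch)  # stand-in for one bulk insert

async def main() -> List[List[Dict]]:
    queue: asyncio.Queue = asyncio.Queue()
    batches: List[List[Dict]] = []
    for i in range(120):
        await accept(queue, {"event_id": f"evt_{i}"})
    await consume(queue, batches)
    return batches

batches = asyncio.run(main())
print([len(b) for b in batches])  # [50, 50, 20]
```

An in-process queue loses events if the process dies after returning 200, so a durable queue is the safer choice in production.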

Diagnosing delivery failures

Inspect an endpoint's recent history:

```bash
curl https://scaisend.scailabs.ai/v3/user/webhooks/wh_01HXYZ \
  -H "Authorization: Bearer $SCAISEND_API_KEY"
```

Fields:

| Field | Meaning |
|-------|---------|
| last_success_at | Last 2xx response received |
| last_failure_at | Last delivery that didn't get 2xx |
| failure_count | Consecutive failures since last success |
| disabled_at | Auto-disabled after 10 failures |

If failure_count is climbing, check:

  1. Is the URL correct? A typo is a common cause and trivial to fix.
  2. Is the endpoint reachable from ScaiSend? Firewall, load balancer, DNS.
  3. Is TLS valid? ScaiSend verifies TLS certificates. A self-signed cert or mismatched hostname will fail.
  4. Is your handler returning 2xx? Any redirect (3xx) is treated as failure. Any 4xx/5xx is a failure.
  5. Is your handler fast enough? A response that takes longer than 30 seconds is a timeout and counts as a failure.

Logs on your side (with the X-ScaiSend-Event header logged) are the fastest way to diagnose.
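For monitoring, the endpoint record can be reduced to a one-line status. This is a sketch using the `enabled` and `failure_count` fields shown earlier on this page; `endpoint_status` is a hypothetical helper and the wording of the statuses is arbitrary.

```python
from typing import Dict

def endpoint_status(endpoint: Dict) -> str:
    """Summarize an endpoint record fetched from the webhooks API."""
    if endpoint.get("enabled") is False:
        return "disabled: fix the handler, then PATCH enabled=true"
    failures = endpoint.get("failure_count", 0)
    if failures > 0:
        return f"failing: {failures} consecutive failures"
    return "healthy"

print(endpoint_status({"enabled": True, "failure_count": 3}))
# failing: 3 consecutive failures
```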

Testing your webhook handler

Use a test API key and send mail to yourself:

```bash
curl -X POST https://scaisend.scailabs.ai/v3/mail/send \
  -H "Authorization: Bearer $SCAISEND_TEST_KEY" \
  -H "Content-Type: application/json" \
  -d '{...}'
```

Test keys only produce a processed event (no actual delivery, so no delivered or bounce). To test the full event lifecycle, use a live key with sandbox_mode.enable: false and send to an address you control — you'll see the full sequence.

Updated 2026-05-17 01:33:27 (rev 1)