---
title: Troubleshooting
path: troubleshooting/index
status: published
---

# Troubleshooting

Common failure modes and the fastest way to diagnose them. Symptoms in bold; likely causes and fixes underneath.

## Mail is accepted (202) but never delivers

**Symptoms:** `/v3/mail/send` returns 202, but the message sits in `QUEUED` or `PROCESSING` forever. No events arrive.

**Check:**

1. **Worker service alive?** `systemctl status scaisend-worker` or the equivalent. Restart if dead.
2. **Redis reachable?** `redis-cli ping`.
3. **Queue depth.** `redis-cli LLEN email:process`. If non-zero and growing, workers aren't keeping up (scale) or are deadlocked (restart).
4. **Worker logs.** Look for `render_error`, `database_error`, or stack traces.

Most common cause: the Worker service died silently. The API still accepts requests, but nothing consumes the queue. Restart the worker.
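
To tell "growing" from "merely non-zero" in step 3, it helps to compare a few depth samples taken some seconds apart. The following is a minimal sketch, assuming you collect the numbers yourself (for example by running `redis-cli LLEN email:process` several times); the function name and the category labels are illustrative and not part of ScaiSend.

```python
from typing import Sequence


def queue_trend(depths: Sequence[int]) -> str:
    """Classify successive queue-depth samples, oldest first."""
    if len(depths) < 2:
        raise ValueError("need at least two samples")
    if all(d == 0 for d in depths):
        return "empty"
    deltas = [later - earlier for earlier, later in zip(depths, depths[1:])]
    if all(d >= 0 for d in deltas) and depths[-1] > depths[0]:
        return "growing"   # consumers are dead, deadlocked, or too slow
    if depths[-1] < depths[0]:
        return "draining"  # workers are catching up
    return "steady"
```

A `growing` result with a live worker process points at scaling or a deadlock; `draining` usually means the backlog will clear without intervention.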

## Messages stay in `PROCESSING` indefinitely

**Symptoms:** Messages enter `PROCESSING` but never progress to `RENDERED` or `SENDING`.

**Check:**

1. **Template errors.** `GET /v3/messages/{id}` — check `error_message`. If it's a template render failure, fix the template or delete the offending send.
2. **Worker crash mid-render.** The worker was handling the job, died, and the job record stayed in `PROCESSING`. Use `POST /v3/messages/{id}/retry` to requeue.

## All sends failing with 403 "Sender domain not verified"

**Symptoms:** Every `POST /v3/mail/send` returns 403 with "Sender domain not verified" even though the domain was previously verified.

**Check:**

1. **Is the DNS still published?** Query `scaisend._domainkey.<yourdomain>` — does it resolve?
2. **Did someone rotate DKIM and not re-verify?** Check `GET /api/admin/domains/{id}`. If `verified: false`, run `POST /api/admin/domains/{id}/verify`.
3. **Did a PATCH set `is_active: false`?** Re-activate.

## Bounces climbing above 2%

**Symptoms:** `GET /v3/stats/sum` shows `bounces / requests > 0.02`.

**Check:**

1. **Sample recent bounces.** `GET /v3/suppression/bounces?limit=50&start_time=<1h ago>`. Look at `reason`:
   - Mostly `5.1.1 User unknown` → list quality problem. Your list has many invalid addresses.
   - Mostly `5.7.26 DMARC failure` → authentication misconfig. Run `POST /api/admin/domains/{id}/verify` and fix any failures.
   - Mostly other `5.7.*` codes → reputation problem. ISPs are rejecting you.
2. **Check recent signup flows.** Was there a spike of obvious garbage addresses? Bot signups, typos?
3. **If reputation-driven, cool down sends.** Slow marketing sends; prioritize transactional (which ISPs evaluate separately, usually more charitably). Give yourself a week before sending the next big batch.
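
The triage in step 1 can be tallied mechanically once you have the sampled `reason` strings. This is a rough sketch, assuming each reason contains an enhanced status code such as `5.1.1` somewhere in its text; the bucket names and helper functions are made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable

# Matches a permanent-failure enhanced status code such as 5.1.1 or 5.7.26.
_STATUS = re.compile(r"\b5\.\d{1,3}\.\d{1,3}\b")


def bucket(reason: str) -> str:
    """Map one bounce reason to a coarse diagnosis bucket."""
    match = _STATUS.search(reason)
    if match is None:
        return "other"
    code = match.group(0)
    if code == "5.1.1":
        return "list_quality"
    if code == "5.7.26":          # check before the generic 5.7.* rule
        return "authentication"
    if code.startswith("5.7."):
        return "reputation"
    return "other"


def summarize(reasons: Iterable[str]) -> Counter:
    """Count how many sampled bounces fall into each bucket."""
    return Counter(bucket(reason) for reason in reasons)
```

Calling `summarize(reasons).most_common(1)` on the sample gives the dominant bucket, which tells you which of the three branches above to follow.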

## Spam report rate > 0.1%

**Symptoms:** Recipients are hitting "Report spam" in large numbers.

**Check:**

1. **Recent campaigns.** Was there a send to a list that might not have been opt-in? An imported list that's older than 6 months? A broad-reach campaign that hit dormant users?
2. **Unsubscribe flow.** Is it obvious? Is the `List-Unsubscribe` header present (you can see it in any received message via "Show original")? Is one-click working?
3. **Suspend the offending campaign.** Cancel remaining queued messages in the batch (`POST /v3/messages/{id}/cancel` for each).
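
Cancelling a batch one message at a time is easy to script. The sketch below only builds the requests for step 3; the base URL, the `Bearer` authorization scheme, and the function name are assumptions, so check them against your deployment before sending anything.

```python
from typing import Iterable, List
from urllib.parse import quote
from urllib.request import Request


def build_cancel_requests(
    base_url: str, api_key: str, message_ids: Iterable[str]
) -> List[Request]:
    """Build one POST /v3/messages/{id}/cancel request per message ID."""
    requests = []
    for message_id in message_ids:
        path = f"/v3/messages/{quote(message_id, safe='')}/cancel"
        req = Request(base_url.rstrip("/") + path, data=b"", method="POST")
        # Assumed auth scheme; adjust to however your API keys are presented.
        req.add_header("Authorization", f"Bearer {api_key}")
        requests.append(req)
    return requests
```

Each returned request can then be sent with `urllib.request.urlopen(req, timeout=10)`, ideally logging the message ID and response status so partial failures are visible.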

## Webhook endpoint gets disabled

**Symptoms:** `GET /v3/user/webhooks/{id}` shows `disabled_at` set.

**Check:**

1. **Endpoint returning 2xx?** Test with `curl -X POST <url> -d '{}'` to confirm it answers. 404s from typos are the most common cause.
2. **Endpoint too slow?** The delivery timeout is 30 seconds. If your handler averages more than ~10s, you're flirting with timeouts during retries.
3. **TLS valid?** Look for an expired certificate, a mismatched hostname, or a self-signed certificate that ScaiSend won't accept.
4. **DNS change?** Your endpoint hostname moved and the old IP stopped answering.

Fix the cause, then `PATCH /v3/user/webhooks/{id}` with `{"enabled": true}`.
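
Before re-enabling, it can be worth probing the endpoint from a script so that status code, latency, and TLS or DNS errors show up in one place. This is a sketch using only the standard library; the empty JSON body mirrors the `curl` test above, and the 10-second "slow" threshold is taken from the rule of thumb in step 2 rather than from any ScaiSend setting.

```python
import socket
import time
import urllib.error
import urllib.request
from typing import Optional, Tuple


def probe(url: str, timeout: float = 30.0) -> Tuple[Optional[int], float, Optional[str]]:
    """POST '{}' to url and return (status, elapsed_seconds, error)."""
    req = urllib.request.Request(
        url, data=b"{}", method="POST",
        headers={"Content-Type": "application/json"},
    )
    start = time.monotonic()
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, time.monotonic() - start, None
    except urllib.error.HTTPError as exc:   # server answered with 4xx/5xx
        return exc.code, time.monotonic() - start, None
    except urllib.error.URLError as exc:    # DNS, TCP, or TLS failure
        return None, time.monotonic() - start, str(exc.reason)
    except (socket.timeout, TimeoutError):  # timed out while reading
        return None, time.monotonic() - start, "timed out"


def diagnose(status: Optional[int], elapsed: float, error: Optional[str],
             slow_after: float = 10.0) -> str:
    """Turn a probe result into a one-line verdict."""
    if error is not None:
        return f"unreachable: {error}"
    if status is None or not 200 <= status < 300:
        return f"non-2xx response: {status}"
    if elapsed > slow_after:
        return f"slow: {elapsed:.1f}s"
    return "ok"
```

For example, `diagnose(*probe(url))` returns `"ok"` only when the endpoint answers 2xx within the threshold. Certificate problems surface in the `unreachable:` message because `urlopen` verifies TLS by default.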

## Outbound mail rejected by all recipients

**Symptoms:** Every message bounces with varied reputation-related `5.7.*` codes.

**Check:**

1. **PTR records on your outbound IPs.** `dig -x <IP>` — does it resolve? Is the resolved name's forward record the same IP (FCrDNS)? If not, fix with your IP provider. This is the #1 cause of mysterious reputation drops.
2. **RBL listings.** Check your outbound IPs against Spamhaus, SpamCop, Barracuda. [MXToolbox Blacklists](https://mxtoolbox.com/blacklists.aspx) is a quick survey.
3. **Rapid volume ramp-up.** If you went from 0 to 100k/day overnight, ISPs flag that as suspicious. Warm up IPs with gradually increasing volume.
4. **Content issues.** Some content patterns (excessive links, spammy phrases, bare IPs) tank deliverability. Test a sample message through `mail-tester.com`.
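
The FCrDNS check from step 1 can be automated across all outbound IPs. Below is a minimal sketch that relies on the system resolver through Python's `socket` module; the function names are illustrative, and the comparison is a plain string match, so it is best suited to IPv4 addresses.

```python
import socket
from typing import Callable, List


def reverse_lookup(ip: str) -> str:
    """Return the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Return every address the hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


def fcrdns_ok(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """True when the PTR name of ip resolves back to the same ip."""
    try:
        hostname = reverse(ip)
        return ip in forward(hostname)
    except OSError:  # covers socket.herror and socket.gaierror
        return False
```

The resolver functions are parameters so the check can be exercised without network access; in normal use, `fcrdns_ok("<IP>")` is enough.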

## "Missing required scope" on unexpected endpoints

**Symptoms:** Getting 403 with `Missing required scope: X` on an endpoint that previously worked.

**Check:**

1. **Did someone rotate the API key?** If the new key was created with a smaller scope set, it can't do what the old key could. Check `GET /v3/api_keys/{id}`.
2. **Did an admin change role permissions?** User roles are editable. A permission may have been removed from the user's role.
3. **Is the user part of the expected group?** If group-to-role mapping is your primary RBAC mechanism, a group membership change in ScaiKey might have removed the permission.

`GET /v3/auth/me` shows the caller's effective permissions. Compare against what the endpoint needs.
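
The comparison itself is a set difference. A tiny sketch, assuming you have already pulled the caller's permissions out of the `GET /v3/auth/me` response as a list of strings (the response shape is not specified here, and `mail.send` below is a made-up scope name):

```python
from typing import Iterable, Set


def missing_scopes(effective: Iterable[str], required: Iterable[str]) -> Set[str]:
    """Return the required scopes that the caller does not hold."""
    return set(required) - set(effective)


# Hypothetical example values.
print(missing_scopes(["mail.send"], ["mail.send", "stats.export"]))
```

An empty result means the 403 is coming from somewhere other than the caller's scope set, such as a different key being used than the one you inspected.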

## DNS verification stuck at `verified: false`

**Symptoms:** `POST /api/admin/domains/{id}/verify` keeps returning `verified: false` even though you believe the records are published.

**Check:**

1. **Wait for DNS TTL.** DNS propagates at the pace of the slowest cache in the chain. Even a "small TTL" provider takes minutes. Try again in 10 minutes.
2. **Query DNS the same way ScaiSend does.** `dig scaisend._domainkey.<domain> TXT`. Compare the returned value against what `GET /api/admin/domains/{id}/dns-records` tells you to publish. Look for:
   - **Trailing or leading whitespace** in the DNS value. Some providers strip or pad.
   - **Line breaks.** Long TXT records may be split across multiple DNS strings; ScaiSend joins them, but some DNS servers concatenate with spaces. Check the raw query result.
   - **Wrong base64.** Copy-paste errors are common; check a few characters from the beginning and end.
3. **Multiple TXT records.** If the domain already has an SPF record, publishing a second `v=spf1` TXT record can cause verification to read the wrong one. Merge them into a single `v=spf1 ...` entry.
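
Comparing the published value with the expected one by eye is error-prone for long keys. The sketch below joins the quoted strings that `dig` prints for a split TXT record and compares the result with the value you were told to publish. It is an approximation of the check, not ScaiSend's actual verification logic, and the function names are illustrative.

```python
import re

# One quoted DNS character-string, allowing backslash escapes inside it.
_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def normalize_txt(raw: str) -> str:
    """Join the quoted chunks of a dig TXT answer and trim outer whitespace."""
    parts = _QUOTED.findall(raw)
    value = "".join(parts) if parts else raw
    return value.strip()


def matches_expected(raw: str, expected: str) -> bool:
    """True when the published TXT value equals the expected value."""
    return normalize_txt(raw) == expected.strip()
```

Chunks are joined without a separator, so a provider that inserted a space between them shows up as a mismatch, which is exactly the failure described in the line-break bullet.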

## Inbound SMTP server not receiving DSNs

**Symptoms:** Messages sometimes bounce silently — the upstream MX accepts but the message never delivers, and no bounce event appears in ScaiSend.

**Check:**

1. **Port 25 reachable from the internet?** `telnet <your-smtp-host> 25` from outside your network. If it refuses, check firewall rules.
2. **Forward-confirmed reverse DNS.** `dig -x <inbound-ip>` should return your hostname; `dig <hostname>` should return the same IP.
3. **SMTP service running?** `systemctl status scaisend-smtp`.
4. **Correct envelope sender on outbound messages?** ScaiSend puts a distinctive envelope-from that routes bounces back to its inbound server. If you've overridden this somehow, bounces go elsewhere.

## Admin UI can't log in

**Symptoms:** OAuth redirect lands on an error page.

**Check:**

1. **ScaiKey reachable from the browser?** The admin UI initiates OAuth against ScaiKey's URL; the browser must be able to reach it.
2. **`ADMIN_URL` correctly configured?** The redirect URI registered with ScaiKey must match `ADMIN_URL` exactly. Mismatch → "invalid redirect_uri" from ScaiKey.
3. **JWKS endpoint reachable from ScaiSend?** ScaiSend fetches the JWKS from ScaiKey when it validates JWTs. If egress to ScaiKey is blocked, every login fails server-side.

## Statistics look wrong

**Symptoms:** `/v3/stats` counts don't match what you think happened.

**Check:**

1. **Stats are aggregated on a daily-batch basis.** The current day may not be fully reflected until the next rebuild. For up-to-the-minute numbers, query `/v3/messages` directly and count.
2. **Sandbox messages don't count.** If you've been testing with a test key, those messages are excluded from stats.
3. **Rebuild if suspicious.** `POST /v3/stats/rebuild` replays events from source. Requires `stats.export`.

## Redis queue growing without bound

**Symptoms:** `redis-cli LLEN smtp:deliver` climbs monotonically.

**Check:**

1. **SMTP service alive?** If the consumer is dead, the queue fills.
2. **Outbound IP blocked?** If every send is being refused at the network layer (firewall, ISP blocking port 25), the SMTP service loops retrying without making progress. Check logs for connection-refused patterns.
3. **Rate-limited by a specific recipient ISP?** One domain's MX saying "slow down" can back up the queue if you have a lot of mail to that domain. Check retry counts — if one domain dominates, you're seeing a reputation issue with that specific ISP.

Temporarily drain by increasing SMTP service replicas. Long-term fix depends on the root cause.
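
To see whether one recipient domain dominates the retries (step 3), count domains across the addresses of the jobs being retried. This sketch assumes you have already extracted those recipient addresses, for instance from SMTP service logs; how to obtain them is deployment-specific, and the 50% threshold is an arbitrary starting point.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def dominant_domain(
    recipients: Iterable[str], threshold: float = 0.5
) -> Optional[Tuple[str, float]]:
    """Return (domain, share) if one domain holds at least `threshold` of the retries."""
    domains = [r.rsplit("@", 1)[1].lower() for r in recipients if "@" in r]
    if not domains:
        return None
    domain, count = Counter(domains).most_common(1)[0]
    share = count / len(domains)
    return (domain, share) if share >= threshold else None
```

A non-`None` result suggests throttling by a single ISP, in which case adding SMTP replicas will not help and slowing sends to that domain is the more likely fix.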

## Still stuck?

- **Request IDs.** Every API response carries `X-Request-ID`. Include it when filing a support ticket.
- **Message IDs.** For delivery issues, the `message_id` from the send response is the key identifier. Pair it with the request ID.
- **Log context.** Ship structured logs to a searchable store so `grep request_id=...` across all three services is possible.

## Related

- [Deployment](deployment) — the architecture being troubleshot.
- [Health and Monitoring](health-and-monitoring) — the signals that should have warned you.
- [Bounce Handling](../concepts/bounce-handling) — deep dive on delivery failures.