Errors & retries
This page lists WebAgent's error codes and tells you which ones to retry, which to surface to the user, and which to fix in your own code.
Every error response has the same shape:
```json
{
  "code": "rate_limit_exceeded",
  "detail": "Per-key concurrency limit (10) reached.",
  "extra": { "limit": 10, "active": 10 }
}
```

The HTTP status tells you the category; the code field is the stable contract. Switch on it, never on detail (English prose, may change).
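
As a rough illustration of switching on code rather than detail, the sketch below parses the error body shown above into a small dataclass. The names ApiError and parse_error are hypothetical helpers, not part of any SDK, and the sketch assumes detail and extra may be absent from a response.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ApiError:
    """Parsed error body (hypothetical helper; fields mirror the shape above)."""
    status: int
    code: str
    detail: str = ""
    extra: Dict[str, Any] = field(default_factory=dict)


def parse_error(status: int, body: str) -> ApiError:
    """Build an ApiError from an HTTP status and the raw JSON body."""
    data = json.loads(body)
    return ApiError(
        status=status,
        code=data["code"],
        detail=data.get("detail", ""),   # assumed optional
        extra=data.get("extra") or {},   # assumed optional
    )


err = parse_error(
    429,
    '{"code": "rate_limit_exceeded", '
    '"detail": "Per-key concurrency limit (10) reached.", '
    '"extra": {"limit": 10, "active": 10}}',
)

# Branch on the stable code, never on the human-readable detail.
if err.code == "rate_limit_exceeded":
    message = f"limit {err.extra['limit']} reached, {err.extra['active']} active"
else:
    message = err.detail
```

Keeping the raw extra dictionary around, rather than flattening it, lets each code branch read only the fields it knows about.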
Code matrix
| Status | Code | Retry? | What to do |
|---|---|---|---|
| 400 | bad_request | ❌ | Fix the request body. Check the OpenAPI spec for the field. |
| 401 | unauthorized | ❌ | Key is missing, malformed, expired, or revoked past its 1-hour grace. Create a new one. |
| 402 | insufficient_credits | ❌ | Top up via Settings → Billing, or enable auto-recharge. |
| 402 | budget_exceeded | ❌ | Per-task max_cost_usd cap hit. Raise the cap or split the work. |
| 403 | forbidden | ❌ | Key valid, but the project doesn't grant it access to this resource. |
| 403 | safety_boundary_violated | ❌ | The agent refused on safety grounds. Read extra.reason; reword the instruction. |
| 404 | session_not_found, task_not_found, profile_not_found, … | ❌ | The id is wrong or the resource was deleted. |
| 409 | conflict | ❌ | State mismatch (e.g. cancel on a terminal task). Re-read state and decide. |
| 422 | validation_error | ❌ | Schema-level — extra.errors[] lists the offending fields. |
| 429 | rate_limit_exceeded | ✅ | Honour Retry-After; exponential back-off if absent. |
| 429 | too_many_concurrent_sessions | ✅ | Wait for an in-flight session to free up, or upgrade plan. |
| 5xx | internal_error | ✅ | Retry the same call with exponential back-off, capped at 3–5 attempts. |
| (network) | — | ✅ | Connection reset / timeout — retry idempotently. |
✅ = safe to retry automatically. ❌ = will keep failing until you change something.
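
If you call the API directly, the matrix can be reduced to a single predicate. This is a minimal sketch, not SDK code: should_retry and RETRYABLE_CODES are hypothetical names, a status of None stands in for a network-level failure, and treating a 5xx with an unrecognised code as retryable is an assumption.

```python
from typing import Optional

# Codes marked as retryable in the matrix above.
RETRYABLE_CODES = {
    "rate_limit_exceeded",
    "too_many_concurrent_sessions",
    "internal_error",
}


def should_retry(status: Optional[int], code: Optional[str]) -> bool:
    """Return True when the matrix marks the failure as retryable.

    status=None models a network failure with no HTTP response at all.
    """
    if status is None:
        return True  # connection reset / timeout
    if code in RETRYABLE_CODES:
        return True
    # Assumption: any other 5xx is handled like internal_error.
    return 500 <= status <= 599
```

For example, should_retry(402, "budget_exceeded") is False, so the caller surfaces the error instead of looping.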
Retry policy we recommend
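```python
import random
import time

# WebAgentError is the error type raised by your client code. It is assumed to
# expose .code and .retry_after_seconds (None when no Retry-After was sent).

def with_retries(fn, *, attempts=4, base=0.5, cap=8.0):
    for i in range(attempts):
        try:
            return fn()
        except WebAgentError as e:
            if e.code not in {"rate_limit_exceeded",
                              "too_many_concurrent_sessions",
                              "internal_error"}:
                raise  # not safe to retry
            if i == attempts - 1:
                raise  # out of attempts
            # Exponential back-off with jitter; Retry-After wins when present.
            sleep_s = min(cap, base * 2**i) + random.uniform(0, 0.25)
            time.sleep(e.retry_after_seconds or sleep_s)
```

The Python and TypeScript SDKs ship this loop by default; the table above is for when you're calling the API directly.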
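
To see what delays the recommended policy produces, the sketch below isolates the back-off arithmetic in a standalone function. backoff_delay and its jitter parameter are hypothetical names introduced for illustration; the formula is the one used in the loop above.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, *, base: float = 0.5, cap: float = 8.0,
                  jitter: float = 0.25,
                  retry_after: Optional[float] = None) -> float:
    """Seconds to wait before the next try (attempt is 0-based)."""
    if retry_after:
        return retry_after  # a server-provided Retry-After takes priority
    # Doubling delay, clamped to cap, plus a small random jitter.
    return min(cap, base * 2 ** attempt) + random.uniform(0, jitter)


# With jitter disabled the schedule is deterministic.
schedule = [backoff_delay(i, jitter=0.0) for i in range(6)]
```

With the defaults, the delay doubles from 0.5 s until it reaches the 8 s cap. A loop with attempts=4 sleeps at most three times, so it only ever uses the first three entries of that schedule.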
Idempotency keys
Every mutating endpoint (POST /sessions, POST /sessions/{sid}/tasks, POST /messages, …) accepts an Idempotency-Key header:
```http
POST /v1/projects/proj_demo_0001/do_anything/sessions
Idempotency-Key: 9b2f7c1e-…-uuid
```

Replay the same UUID within 24 hours and you'll get the same response back (same session_id, same status code), even after a network blip. Generate one UUID per logical action, not per retry.
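
The "one UUID per logical action" rule can be sketched as follows. The key is generated once, outside the retry loop, and reused on every attempt. send_once, fake_send, and the sess_example id are hypothetical; a real caller would perform the POST inside send and sleep between attempts.

```python
import uuid
from typing import Callable, Dict, List


def send_once(send: Callable[[Dict[str, str]], dict], *, attempts: int = 3) -> dict:
    """Call send(headers) up to `attempts` times with one shared Idempotency-Key.

    `send` is a hypothetical callable that performs the POST and raises
    ConnectionError on a network failure.
    """
    headers = {"Idempotency-Key": str(uuid.uuid4())}  # one key per logical action
    for i in range(attempts):
        try:
            return send(headers)
        except ConnectionError:
            if i == attempts - 1:
                raise  # out of attempts
    raise ValueError("attempts must be at least 1")


# Fake transport for illustration: the first call drops, the second succeeds.
seen_keys: List[str] = []


def fake_send(headers: Dict[str, str]) -> dict:
    seen_keys.append(headers["Idempotency-Key"])
    if len(seen_keys) == 1:
        raise ConnectionError("simulated network blip")
    return {"session_id": "sess_example"}


result = send_once(fake_send)
```

Both attempts carry the same key, so a server that deduplicates on it can return the original response instead of creating a second session.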
Inside the task lifecycle
Errors during agent execution don't always fail the task — many are recoverable:
- Tool error: emitted as task.action.failed; the agent decides to retry the tool, choose a different one, or fail the whole task.
- Captcha / 2FA: emitted as task.input_request; you answer via POST /intervene.
- Hard cap hit: task transitions to failed with error.code = budget_exceeded or duration_exceeded. Spent credits are billed.
- Safety refusal: task transitions to failed with error.code = safety_boundary_violated. No credits charged for the refused step.
You see all four through the SSE stream.
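
One way a stream consumer might route these four cases is sketched below. The payload shape is an assumption made for illustration: a type field on each event, and a task object carrying status and error.code on failure. Check the actual event schema before relying on it. The function react and its return labels are hypothetical.

```python
def react(event: dict) -> str:
    """Map a stream event to a client-side action label (illustrative only)."""
    kind = event.get("type")  # assumed field name
    if kind == "task.action.failed":
        return "wait"        # the agent decides how to recover
    if kind == "task.input_request":
        return "intervene"   # answer via POST /intervene

    task = event.get("task", {})  # assumed field name
    if task.get("status") == "failed":
        code = task.get("error", {}).get("code")
        if code in {"budget_exceeded", "duration_exceeded"}:
            return "raise_cap_or_split"
        if code == "safety_boundary_violated":
            return "reword_instruction"
        return "inspect"
    return "ignore"
```

Only the two failed-task branches are terminal; a tool error or input request leaves the task running.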
SSE-specific failure modes
| Symptom | Cause | Fix |
|---|---|---|
| Stream stalls > 60 s | Network drop or proxy buffering | Reconnect with Last-Event-ID: <last_seen_id>. |
| Last-Event-ID ignored | Buffer expired (older than 1 hour) | Re-fetch task state via GET /sessions/{sid}/tasks/{tid} and resume from current. |
| Duplicate events on reconnect | At-least-once delivery | Dedupe by event id (monotonic per-task). |
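
Because event ids are monotonic per task, deduplication after a reconnect can be a single comparison against the last id seen. The sketch below assumes ids are integers (the table only says they are monotonic); EventDeduper is a hypothetical helper, and one instance is meant to track one task.

```python
class EventDeduper:
    """Drops events whose id is not newer than the last one accepted."""

    def __init__(self) -> None:
        self.last_seen_id = 0  # assumes integer ids starting above 0

    def accept(self, event: dict) -> bool:
        if event["id"] <= self.last_seen_id:
            return False  # replayed after a reconnect
        self.last_seen_id = event["id"]
        return True


deduper = EventDeduper()
first_connection = [{"id": 1}, {"id": 2}, {"id": 3}]
after_reconnect = [{"id": 2}, {"id": 3}, {"id": 4}]  # server replays 2 and 3

delivered = [e["id"] for e in first_connection + after_reconnect
             if deduper.accept(e)]

# Header to send on the next reconnect.
reconnect_headers = {"Last-Event-ID": str(deduper.last_seen_id)}
```

The same last_seen_id drives both sides: it filters replays locally and tells the server where to resume.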
Next steps
- API Overview — every endpoint shares these conventions.
- Sessions & Tasks — the lifecycle states above are normative.