Sessions & Tasks
A session is one runtime container — it owns the browser, the profile (cookies / login state), and the workspace (file system). A task is one instruction that runs inside a session. You can fire follow-up tasks at the same session; they share the state the previous task left behind.
Resource shape
project
└── session                id: sess_…
    ├── browser, profile, workspace
    └── task               id: task_…
        ├── instructions   "Search Hacker News..."
        ├── status         running | done | …
        └── events (SSE)   task.status_changed, task.message, …
Lifecycle (seven states)
stateDiagram-v2
[*] --> pending
pending --> running
running --> awaiting_input : task.input_request
awaiting_input --> running : POST /intervene
running --> paused : POST /pause
paused --> running : POST /resume
running --> done
running --> failed
pending --> canceled
running --> canceled : POST /cancel
awaiting_input --> canceled
paused --> canceled
| State | Meaning | Next moves |
|---|---|---|
| pending | Accepted, queued for an agent slot | → running, canceled |
| running | Agent is actively working | → done, failed, awaiting_input, paused, canceled |
| awaiting_input | Agent paused itself; needs you to answer | → running (via intervene), canceled |
| paused | You paused it (manual) | → running (via resume), canceled |
| done | Completed successfully; output populated | terminal |
| failed | Hit an error; error.code and error.detail populated | terminal |
| canceled | You canceled (or scheduled max-duration tripped) | terminal |
A task in any non-terminal state holds session resources. Set max_duration_minutes to bound how long it can hold them.
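The lifecycle diagram can be mirrored client-side as a small lookup table, which is handy for deciding when to stop listening or for rejecting impossible transitions in tests. The sketch below is only an illustration: `TRANSITIONS`, `is_terminal`, and `can_transition` are hypothetical names, not part of the web_agent SDK.

```python
# Allowed transitions, transcribed from the lifecycle diagram.
TRANSITIONS = {
    "pending": {"running", "canceled"},
    "running": {"awaiting_input", "paused", "done", "failed", "canceled"},
    "awaiting_input": {"running", "canceled"},
    "paused": {"running", "canceled"},
    "done": set(),
    "failed": set(),
    "canceled": set(),
}

# A state is terminal when nothing leads out of it.
TERMINAL_STATES = {state for state, nxt in TRANSITIONS.items() if not nxt}


def is_terminal(state: str) -> bool:
    """True once the task can no longer change state."""
    return state in TERMINAL_STATES


def can_transition(current: str, new: str) -> bool:
    """True if the diagram has an edge from `current` to `new`."""
    return new in TRANSITIONS.get(current, set())
```

A client loop might keep reading events until `is_terminal(status)` is true, since only then are the session resources released by that task.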
Submitting a task
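from web_agent import Client
from web_agent.v1.types import CreateSessionRequest, RecordingConfigRequest
# `client` is a web_agent Client instance; constructing it is not covered in this section.
# The call is awaited, so run it inside an async function.
session = await client.sessions.create(CreateSessionRequest(
    instructions="Find the top 5 Show HN posts from the last 24 hours.",
    model="claude-sonnet-4.6",
    max_cost_usd="0.50",          # decimal-as-string, hard cap
    max_duration_minutes=10,
    recording=RecordingConfigRequest(enabled=True),
    keep_alive=True,              # keep the session warm for follow-up tasks
))
# The instructions passed at creation become the session's first task.
task = session.tasks[0]
| Field | Type | Notes |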
|---|---|---|
| instructions | string | The task in plain English. Up to 10 000 chars. |
| model | string | Model id, e.g. claude-sonnet-4.6, gemini-3-flash. |
| max_cost_usd | string | Decimal-as-string. Hard cap. |
| max_duration_minutes | int | 1–10 080 (one week). |
| recording | object | {enabled, quality, capture_during_take_control}; omit for off. |
| keep_alive | bool | When the task ends, keep the session warm for follow-up tasks. |
| allowed_actions | string[] | Whitelist of tool actions the agent may call. Empty = all allowed. |
| profile_id | string | Reuse cookies/auth from a saved profile. |
The full schema is in the OpenAPI spec.
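The limits in the field table can be checked locally before a request is sent. This is a minimal sketch under stated assumptions: `validate_task_fields` is a hypothetical helper, not an SDK function, and rejecting a negative or non-finite cost cap is an assumption the table does not spell out.

```python
from decimal import Decimal, InvalidOperation

MAX_INSTRUCTION_CHARS = 10_000   # "Up to 10 000 chars"
MAX_DURATION_MINUTES = 10_080    # one week: 7 * 24 * 60


def validate_task_fields(instructions: str, max_cost_usd: str,
                         max_duration_minutes: int) -> list[str]:
    """Return a list of problems; an empty list means the fields look valid."""
    problems = []
    if len(instructions) > MAX_INSTRUCTION_CHARS:
        problems.append("instructions exceeds 10 000 characters")
    if not isinstance(max_cost_usd, str):
        problems.append("max_cost_usd must be a decimal-as-string")
    else:
        try:
            cost = Decimal(max_cost_usd)
        except InvalidOperation:
            problems.append("max_cost_usd is not a valid decimal")
        else:
            # Assumption: a cap must be a finite, non-negative amount.
            if not cost.is_finite() or cost < 0:
                problems.append("max_cost_usd must be finite and non-negative")
    if not 1 <= max_duration_minutes <= MAX_DURATION_MINUTES:
        problems.append("max_duration_minutes must be between 1 and 10 080")
    return problems
```

Passing the cost as a string and parsing it with `Decimal` avoids the rounding surprises that a binary float such as `0.1` would introduce.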
Follow-up tasks
from web_agent.v1.types import CreateTaskRequest
followup = await client.sessions.create_task(
    session.id,
    CreateTaskRequest(
        instructions="Now click into the first post and summarise the discussion.",
    ),
)
The follow-up runs in the same browser, with the same cookies, against the same DOM the previous task left.
Events
Every task emits a Server-Sent Events stream:
GET /v1/projects/{pid}/do_anything/sessions/{sid}/tasks/{tid}/events
Authorization: Bearer wa_…
Eleven event types (the envelope is the same; data shape varies):
| Type | When |
|---|---|
| task.status_changed | State transition |
| task.message | Agent or user message in the chat thread |
| task.action.started | Agent invoked a tool |
| task.action.completed | Tool returned |
| task.action.failed | Tool threw |
| task.screenshot | New browser frame (url is short-lived) |
| task.input_request | Agent paused; needs you to answer |
| task.input_request_resolved | Your intervene was accepted |
| task.cost_update | Per-step cost delta |
| task.completed | Terminal; output populated |
| stream.heartbeat | Every ~15 s; harmless |
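One way to consume these events is to fold them into a small local view of the task. The sketch below is illustrative only: `TaskView` and `apply_event` are hypothetical names, and the `status` and `delta_usd` keys inside `data` are assumptions, because this page does not show the data shape for those event types. Only `input_request_id` is taken from the input request example on this page.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class TaskView:
    status: str = "pending"
    cost_usd: Decimal = Decimal("0")
    pending_input_request_id: Optional[str] = None


def apply_event(view: TaskView, event: dict) -> TaskView:
    """Update the view in place from one decoded event envelope."""
    etype = event["type"]
    data = event.get("data", {})
    if etype == "task.status_changed":
        view.status = data["status"]                    # assumed key
    elif etype == "task.cost_update":
        view.cost_usd += Decimal(data["delta_usd"])     # assumed key
    elif etype == "task.input_request":
        view.pending_input_request_id = data["input_request_id"]
    elif etype == "task.input_request_resolved":
        view.pending_input_request_id = None
    # stream.heartbeat and every other type leave the view unchanged.
    return view
```

Keeping the reducer free of I/O means it can be unit-tested with plain dictionaries and reused unchanged when replaying events after a reconnect.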
Reconnect cleanly:
GET …/events
Last-Event-ID: 142
The server replays events with id > 142 so you don't miss anything.
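To resume from the right place, a client has to remember the id of the last event it processed. The following sketch parses raw Server-Sent Events lines using the standard `id:`, `event:`, and `data:` fields and builds the reconnect header. `parse_sse` and `resume_headers` are hypothetical helpers, and whether this API fills the `event:` field or carries the type only inside the JSON payload is an assumption left open here.

```python
from typing import Iterable, Iterator, Optional


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event; a blank line ends an event."""
    event_id: Optional[str] = None   # persists across events, per the SSE format
    event_type: Optional[str] = None
    data_lines: list = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data_lines:
                yield {"id": event_id, "type": event_type,
                       "data": "\n".join(data_lines)}
            event_type, data_lines = None, []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # strip the single optional leading space
        if field == "id":
            event_id = value
        elif field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)


def resume_headers(last_event_id: Optional[str]) -> dict:
    """Headers for a reconnect; empty when no event has been seen yet."""
    return {"Last-Event-ID": last_event_id} if last_event_id else {}
```

Store the id only after an event has been fully handled, so that a crash between receipt and handling causes a replay and not a gap.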
Input request (human in the loop)
When the agent hits a captcha, a 2FA prompt, or any judgment call, it emits task.input_request:
{
  "type": "task.input_request",
  "data": {
    "input_request_id": "ir_01HXX…",
    "prompt": "I see a 'Verify you're human' challenge. Solve it for me?",
    "schema": { "type": "object", "properties": { "solved": { "type": "boolean" } } }
  }
}
You answer via POST /intervene:
await client.messages.intervene(
    session.id, task.id,
    input_request_id="ir_01HXX…",
    response={"solved": True},
)
The task transitions back to running. The whole cycle is one round-trip; no polling.
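Because each input request carries a schema, a client can check its answer against that schema before calling intervene. The sketch below covers only flat object schemas like the one in the example above and is not a full JSON Schema validator. `response_matches` is a hypothetical helper, and rejecting keys the schema does not list is an assumption (JSON Schema itself permits extra properties by default).

```python
# Mapping from JSON Schema primitive type names to Python types.
JSON_TYPES = {
    "boolean": bool,
    "string": str,
    "integer": int,
    "number": (int, float),
    "object": dict,
    "array": list,
}


def response_matches(schema: dict, response: dict) -> bool:
    """Check a response against a flat {"type": "object", "properties": ...} schema."""
    if schema.get("type") != "object" or not isinstance(response, dict):
        return False
    properties = schema.get("properties", {})
    for key, value in response.items():
        if key not in properties:
            return False  # assumption: unknown keys are treated as mistakes
        type_name = properties[key].get("type")
        expected = JSON_TYPES.get(type_name)
        if expected is None:
            continue  # no type constraint this sketch understands
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) and type_name != "boolean":
            return False
        if not isinstance(value, expected):
            return False
    return True
```

Checking locally gives a clearer error at the call site than a rejected request would, and it keeps malformed answers from consuming the one round-trip the cycle needs.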
Profiles
A profile is a reusable browser identity: cookies, local storage, auth state. Reference one when creating a session:
await client.sessions.create(CreateSessionRequest(
    instructions="Open my LinkedIn inbox and reply to the latest message.",
    profile_id="prof_linkedin_main",
))
The first time, set up the profile manually in the Console (sign in, accept cookies, do whatever). Future sessions reuse it.
Workspaces
A workspace is a persistent file system. The agent can read and write files; you fetch them via signed URL after the task is done. Useful for "scrape this site, write a CSV, hand it back."
Next steps
- API Reference — every field.
- Authentication — keys, scopes, rotation.
- Vibecoding — how to feed all of this to your IDE.