Sessions & Tasks

A session is one runtime container: it owns the browser, the profile (cookies / login state), and the workspace (file system). A task is one instruction that runs inside a session. Follow-up tasks sent to the same session inherit the state the previous task left behind.

Resource shape

```text
project
└── session                   id: sess_…
    ├── browser, profile, workspace
    └── task                  id: task_…
        ├── instructions      "Search Hacker News..."
        ├── status            running | done | …
        └── events (SSE)      task.status_changed, task.message, …
```

Lifecycle (seven states)

```mermaid
stateDiagram-v2
    [*] --> pending
    pending --> running
    running --> awaiting_input : task.input_request
    awaiting_input --> running : POST /intervene
    running --> paused : POST /pause
    paused --> running : POST /resume
    running --> done
    running --> failed
    pending --> canceled
    running --> canceled : POST /cancel
    awaiting_input --> canceled
    paused --> canceled
```

State	Meaning	Next moves
`pending`	Accepted, queued for an agent slot	→ `running`, `canceled`
`running`	Agent is actively working	→ `done`, `failed`, `awaiting_input`, `paused`, `canceled`
`awaiting_input`	Agent paused itself; needs you to answer	→ `running` (via `intervene`), `canceled`
`paused`	You paused it (manual)	→ `running` (via `resume`), `canceled`
`done`	Completed successfully; `output` populated	terminal
`failed`	Hit an error; `error.code` and `error.detail` populated	terminal
`canceled`	You canceled it (or the `max_duration_minutes` cap tripped)	terminal

A task in any non-terminal state holds session resources. Set `max_duration_minutes` to bound how long that can last.
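The lifecycle above can be encoded as a small lookup for client-side sanity checks. This is a minimal sketch built only from the states and transitions listed in this section; `TRANSITIONS`, `is_terminal`, and `can_transition` are illustrative names, not part of the SDK. It includes `awaiting_input → canceled`, which appears in the state diagram.

```python
# Illustrative helpers only: none of these names come from the web_agent SDK.
# Each key is a task state; each value is the set of states it may move to.
TRANSITIONS = {
    "pending": {"running", "canceled"},
    "running": {"done", "failed", "awaiting_input", "paused", "canceled"},
    "awaiting_input": {"running", "canceled"},
    "paused": {"running", "canceled"},
    "done": set(),
    "failed": set(),
    "canceled": set(),
}


def is_terminal(state: str) -> bool:
    """A state is terminal when it has no outgoing transitions."""
    return not TRANSITIONS[state]


def can_transition(src: str, dst: str) -> bool:
    """Return True if the lifecycle allows moving from src to dst."""
    return dst in TRANSITIONS[src]
```

A client could use `is_terminal` to decide when to stop listening for events and release its own handles.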

Submitting a task

```python
from web_agent import Client
from web_agent.v1.types import CreateSessionRequest, RecordingConfigRequest

# `client` is a configured Client instance (see Authentication for keys).
# Run this inside an async function; the SDK calls are awaited.
session = await client.sessions.create(CreateSessionRequest(
    instructions="Find the top 5 Show HN posts from the last 24 hours.",
    model="claude-sonnet-4.6",
    max_cost_usd="0.50",
    max_duration_minutes=10,
    recording=RecordingConfigRequest(enabled=True),
    keep_alive=True,
))
task = session.tasks[0]  # the task created along with the session
```

Field	Type	Notes
`instructions`	string	The task in plain English. Up to 10 000 chars.
`model`	string	Model id, e.g. `claude-sonnet-4.6`, `gemini-3-flash`.
`max_cost_usd`	string	Decimal-as-string. Hard cap.
`max_duration_minutes`	int	1–10 080 (one week).
`recording`	object	`{enabled, quality, capture_during_take_control}`; omit for off.
`keep_alive`	bool	When the task ends, keep the session warm for follow-up tasks.
`allowed_actions`	string[]	Whitelist of tool actions the agent may call. Empty = all allowed.
`profile_id`	string	Reuse cookies/auth from a saved profile.

The full schema is in the OpenAPI spec.
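Checking the documented limits before sending a request can save a round-trip. The sketch below is a hypothetical client-side helper, not part of the SDK: `validate_request` and its messages are made up for illustration, and two rules are assumptions not stated in the table (that `instructions` is required, and that `max_cost_usd` must be greater than zero).

```python
from decimal import Decimal, InvalidOperation

MAX_INSTRUCTION_CHARS = 10_000   # from the field table
MAX_DURATION_MINUTES = 10_080    # one week, from the field table


def validate_request(fields: dict) -> list[str]:
    """Return a list of problems; an empty list means the fields look valid."""
    problems = []

    instructions = fields.get("instructions", "")
    if not instructions:
        # Assumption: the table does not say this field is required.
        problems.append("instructions is missing")
    elif len(instructions) > MAX_INSTRUCTION_CHARS:
        problems.append("instructions exceeds 10 000 chars")

    if "max_cost_usd" in fields:
        cost = fields["max_cost_usd"]
        if not isinstance(cost, str):
            problems.append("max_cost_usd must be a decimal string")
        else:
            try:
                value = Decimal(cost)
                # Assumption: a cap must be a finite, positive amount.
                if not value.is_finite() or value <= 0:
                    problems.append("max_cost_usd must be a positive amount")
            except InvalidOperation:
                problems.append("max_cost_usd is not a valid decimal")

    if "max_duration_minutes" in fields:
        minutes = fields["max_duration_minutes"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(minutes, int) or isinstance(minutes, bool):
            problems.append("max_duration_minutes must be an int")
        elif not 1 <= minutes <= MAX_DURATION_MINUTES:
            problems.append("max_duration_minutes must be within 1-10080")

    return problems
```

The server remains the source of truth; a check like this only catches obvious mistakes early.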

Follow-up tasks

```python
from web_agent.v1.types import CreateTaskRequest

followup = await client.sessions.create_task(
    session.id,
    CreateTaskRequest(
        instructions="Now click into the first post and summarise the discussion.",
    ),
)
```

The follow-up runs in the same browser, with the same cookies, against the same DOM the previous task left.

Events

Every task emits a Server-Sent Events stream:

```http
GET /v1/projects/{pid}/do_anything/sessions/{sid}/tasks/{tid}/events
Authorization: Bearer wa_…
```

Eleven event types (the envelope is the same; data shape varies):

Type	When
`task.status_changed`	State transition
`task.message`	Agent or user message in the chat thread
`task.action.started`	Agent invoked a tool
`task.action.completed`	Tool returned
`task.action.failed`	Tool threw
`task.screenshot`	New browser frame (`url` is short-lived)
`task.input_request`	Agent paused; needs you to answer
`task.input_request_resolved`	Your `intervene` was accepted
`task.cost_update`	Per-step cost delta
`task.completed`	Terminal; `output` populated
`stream.heartbeat`	Every ~15 s; harmless

Reconnect cleanly:

```http
GET …/events
Last-Event-ID: 142
```

The server replays events with id > 142 so you don't miss anything.
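To resume correctly, a client has to remember the id of the last event it processed. The sketch below parses the standard Server-Sent Events wire format from already-decoded lines and builds the reconnect header. `parse_sse` and `reconnect_headers` are illustrative names, not SDK functions, and whether this API sets the `event:` field in addition to the `type` inside the JSON payload is not stated here.

```python
from typing import Iterable, Iterator, Optional


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per event from an iterable of decoded SSE lines."""
    event_id: Optional[str] = None   # persists across events, per the SSE format
    event_type, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; dispatch it if it has data.
            if data:
                yield {"id": event_id, "type": event_type, "data": "\n".join(data)}
            event_type, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # strip the single optional space after the colon
        if field == "id":
            event_id = value
        elif field == "event":
            event_type = value
        elif field == "data":
            data.append(value)


def reconnect_headers(last_event_id: Optional[str]) -> dict:
    """Build the header that asks the server to replay newer events."""
    return {"Last-Event-ID": last_event_id} if last_event_id else {}
```

Store the `id` of each event after handling it, and pass the most recent one to `reconnect_headers` when the connection drops.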

Input request (human in the loop)

When the agent hits a captcha, a 2FA prompt, or any judgment call, it emits task.input_request:

```json
{
  "type": "task.input_request",
  "data": {
    "input_request_id": "ir_01HXX…",
    "prompt": "I see a 'Verify you're human' challenge. Solve it for me?",
    "schema": { "type": "object", "properties": { "solved": { "type": "boolean" } } }
  }
}
```

You answer via POST /intervene:

```python
await client.messages.intervene(
    session.id, task.id,
    input_request_id="ir_01HXX…",
    response={"solved": True},
)
```

The task transitions back to running. The whole cycle is one round-trip; no polling.
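Since each input request carries a `schema`, it can be worth checking your answer against it before calling `intervene`. The following is a deliberately small sketch that only handles flat object schemas with primitive `type` keywords, like the one in the example event; `check_response` is a hypothetical helper, and a full JSON Schema validator (for example the third-party `jsonschema` package) would be more thorough.

```python
# Maps JSON Schema primitive type names to the Python types they accept.
JSON_TYPES = {
    "boolean": bool,
    "string": str,
    "integer": int,
    "number": (int, float),
    "object": dict,
    "array": list,
}


def check_response(schema: dict, response: dict) -> list[str]:
    """Return type mismatches and missing required fields for a flat schema."""
    problems = []
    for name in schema.get("required", []):
        if name not in response:
            problems.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name not in response:
            continue
        expected = spec.get("type")
        py_type = JSON_TYPES.get(expected)
        if py_type is None:
            continue  # no type keyword, or one this sketch does not cover
        value = response[name]
        # bool is a subclass of int in Python; JSON treats them as distinct.
        is_bool_as_number = expected in ("integer", "number") and isinstance(value, bool)
        if is_bool_as_number or not isinstance(value, py_type):
            problems.append(f"{name} should be of type {expected}")
    return problems
```

With the schema from the example event, `check_response(schema, {"solved": True})` returns an empty list, while a string value for `solved` is reported as a mismatch.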

Profiles

A profile is a reusable browser identity: cookies, local storage, auth state. Reference one when creating a session:

```python
await client.sessions.create(CreateSessionRequest(
    instructions="Open my LinkedIn inbox and reply to the latest message.",
    profile_id="prof_linkedin_main",
))
```

The first time, set up the profile manually in the Console (sign in, accept cookies, do whatever). Future sessions reuse it.

Workspaces

A workspace is a persistent file system. The agent can read and write files; you fetch them via signed URL after the task is done. Useful for "scrape this site, write a CSV, hand it back."

Next steps

API Reference — every field.
Authentication — keys, scopes, rotation.
Vibecoding — how to feed all of this to your IDE.
