What is WebAgent

WebAgent is an API that lets LLM agents act on the web like developers do. You give it an instruction in English; it picks the tools (a browser, a sandbox, a search engine), runs the steps, and returns a result.

What you get

WebAgent ships user-facing API products in two categories:

  • DoAnything API — open-ended; free-form input, the agent picks the path. Session/task resource model + 7-state lifecycle + long-running tasks.
  • Shaped APIs — when you know the artifact shape you want, use the dedicated API for a shaped contract + quality guarantees:
    • DeepResearch — research → report (final.md + citations + confidence).
    • WebSearch — query → results (structured search results + optional summary).
    • Track — monitor → snapshot stream + change notifications.

Shared capabilities:

  • Profiles — reusable login state across sessions, so the agent does not have to log in again on every run.
  • Workspaces — a persistent file system the agent can read and write to.
  • Schedules — cron, interval, event-triggered, or autonomous (the agent decides when next to run).
  • SSE event stream — the same task.* events that drive the Console, streamed directly to your code.
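As a rough illustration of consuming that event stream from code, here is a minimal sketch of a parser for the standard Server-Sent Events wire format (`event:` and `data:` fields, with a blank line ending each event). The `ServerSentEvent` class and `parse_sse` function are hypothetical helpers, not part of any WebAgent SDK. The exact event names and payload shapes are assumptions too; the source only says the events are in the task.* family.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class ServerSentEvent:
    event: str  # event name, e.g. something in the task.* family (assumed naming)
    data: str   # raw payload text; treating it as JSON would be an assumption


def parse_sse(lines: Iterable[str]) -> Iterator[ServerSentEvent]:
    """Yield events from an iterable of text lines in SSE wire format."""
    event_name = "message"  # SSE default when no "event:" field is sent
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; dispatch only if data was seen.
            if data_parts:
                yield ServerSentEvent(event_name, "\n".join(data_parts))
            event_name, data_parts = "message", []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format strips one leading space
        if field == "event":
            event_name = value
        elif field == "data":
            data_parts.append(value)
```

Any line source works as input, for example a streamed HTTP response body decoded line by line. The official SDKs are described as handling streaming for you, so a hand-rolled parser like this would mainly matter when calling the REST API directly.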

What it is not

  • Not a low-code automation builder. There is no canvas. You wire up tasks in code (or via the Console as a prototyping aid).
  • Not a hosted LLM API. Bring your task; WebAgent picks the LLM and covers its cost, and usage is billed on a credits model.

Three product surfaces: Console / OpenAPI / SDK

All APIs are exposed through the same three surfaces, with 1:1 capability parity and a shared resource layer / event stream / billing:

| You can use … | … to do |
|---|---|
| The REST API (OpenAPI) | Anything. Console and SDKs are just clients. api.web-agent.asix.inc/v1/... + Authorization: Bearer wa_... |
| Python or TypeScript SDK | Same surface, idiomatic types, retries, streaming, wait_for_done. |
| The Console | Prototype tasks visually; non-developers welcome; Get Code dialog hands you working snippets. |
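To make the REST API entry concrete, the sketch below builds (but does not send) a request carrying the Bearer token, using only the Python standard library. The host and header format come from the text above. Everything else is an assumption: the `https` scheme, the JSON request body, and the hypothetical helper `build_request`. The endpoint path is left to the caller because the source elides it.

```python
import json
import urllib.request
from typing import Optional

# Host taken from the text above; the https scheme is an assumption.
BASE_URL = "https://api.web-agent.asix.inc/v1"


def build_request(
    path: str, api_key: str, payload: Optional[dict] = None
) -> urllib.request.Request:
    """Build an authenticated request object without sending it.

    `path` is supplied by the caller; this sketch does not assume any
    particular endpoint name. A JSON body is also an assumption.
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    # POST when a body is present, GET otherwise (illustrative convention only).
    method = "POST" if data is not None else "GET"
    url = f"{BASE_URL}/{path.lstrip('/')}"
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

Passing the returned object to `urllib.request.urlopen` would perform the call. In practice the SDKs are described as wrapping this with retries and typed responses.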

API-developer-first — the product is the API. The Console is a convenience layer over the same API, not a separate product, and there are no Console-only privileged endpoints.

Mental model

Session                            (one container; holds a browser, profile, workspace)
├── Task #1  status: done          (one instruction; lifecycle has 7 states)
└── Task #2  status: running       (a follow-up instruction in the same session)
    └── events: SSE stream         (status_changed, message, action.*, screenshot, …)

A session owns the runtime resources (browser, profile, workspace). Each task is one instruction; you can submit follow-up tasks against the same session and they share state. The task lifecycle has seven states (pending, running, awaiting_input, paused, done, failed, canceled); see Sessions & Tasks.
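The seven states listed above can be modeled on the client side roughly as follows. This is a sketch, not the SDK's own types: `TaskStatus`, `TERMINAL_STATES`, and `is_terminal` are hypothetical names. Treating done, failed, and canceled as the terminal states is an assumption drawn from their names, since the source does not spell out the transitions.

```python
from enum import Enum


class TaskStatus(str, Enum):
    """The seven lifecycle states named in the text."""
    PENDING = "pending"
    RUNNING = "running"
    AWAITING_INPUT = "awaiting_input"
    PAUSED = "paused"
    DONE = "done"
    FAILED = "failed"
    CANCELED = "canceled"


# Assumption: these three states end a task; the other four can still progress.
TERMINAL_STATES = frozenset(
    {TaskStatus.DONE, TaskStatus.FAILED, TaskStatus.CANCELED}
)


def is_terminal(status: str) -> bool:
    """Return True if a status string names a state assumed to be final.

    Raises ValueError for a string that is not one of the seven states.
    """
    return TaskStatus(status) in TERMINAL_STATES
```

A polling or streaming loop could call `is_terminal` on each reported status to decide when to stop waiting, which is presumably what the SDK's wait_for_done does for you.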

DeepResearch, one of the Shaped APIs, uses the same task lifecycle but doesn't expose sessions: each call produces a one-shot artifact, so it doesn't need cross-task browser reuse.

Next steps