Web Agent
Web Agent is the web action layer in SAK. It lets agents search the web, extract page content, operate webpages, track changes, and run controlled web tasks. After reading these docs, you will be able to integrate Web Agent into your application, create a session, submit a task, and stream execution results.
Web Agent is not a traditional crawler SDK. It is designed for LLM agent task execution: developers provide an instruction, while Web Agent manages runtime resources, page state, retries, structured results, and the task lifecycle. The Console and SDKs are clients for the same API.
When to use Web Agent
- Your agent needs real-time web data instead of relying only on model training data or a fixed knowledge base.
- You need search, extraction, browser actions, and long-running tasks behind auditable APIs.
- You want the Console, SDKs, and backend services to share the same REST API contract.
- You need a session / task model for state, event streaming, or steps that require user confirmation.
When not to use Web Agent
- The task only needs your own backend APIs and does not need open web access.
- You need large-scale offline crawling, warehouse synchronization, or search-index construction.
- The target site's terms do not allow automated access and you do not have the required authorization.
- You have not yet defined API keys, project scope, task budget, or failure handling.
Core capabilities
| Capability | Description |
|---|---|
| DoAnything API | Provide a natural-language instruction. Web Agent chooses tools, runs steps, and returns a result. |
| Shaped APIs | Dedicated API contracts for artifact-shaped workflows such as DeepResearch, WebSearch, and Track. |
| Session / task model | A session owns runtime resources. A task represents one instruction or follow-up action. |
| Event stream | Subscribe to task state, output chunks, errors, and user-confirmation requests through SSE. |
| SDK and raw HTTP | Python, TypeScript, and cURL docs use the same API semantics. |
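
To make the event stream row concrete, here is a minimal Python sketch of how a client might split an SSE stream into events. It follows the standard `event:` / `data:` line framing of Server-Sent Events in simplified form. The event names `task.state` and `task.output` and the payload shapes are assumptions made for illustration, not names confirmed by the Web Agent API; check the API Reference for the actual event types.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class ServerSentEvent:
    event: str  # event type; "message" when the stream omits the field
    data: str   # payload, with multiple data lines joined by newlines


def parse_sse(lines: Iterable[str]) -> Iterator[ServerSentEvent]:
    """Yield one event per blank-line-terminated block of SSE lines."""
    event_type = "message"
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if any data was seen.
            if data_lines:
                yield ServerSentEvent(event_type, "\n".join(data_lines))
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format strips one leading space
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
        # Other fields (id, retry) are ignored in this sketch.


# Hypothetical stream content; the event names are illustrative only.
sample = [
    "event: task.state\n",
    'data: {"state": "running"}\n',
    "\n",
    ": keep-alive\n",
    "event: task.output\n",
    "data: first chunk\n",
    "data: second chunk\n",
    "\n",
]
events = list(parse_sse(sample))
```

In a real integration the lines would come from a streaming HTTP response rather than a list, and the client would branch on the event type to handle state changes, output chunks, errors, and user-confirmation requests.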
Documentation entry points
- What is Web Agent explains Web Agent's role, boundaries, and API shape.
- Quickstart runs the first task with Python, TypeScript, or cURL.
- Authentication & API keys explains wa_keys, project scope, and rotation.
- Sessions & Tasks explains the session, task, event, and profile lifecycle.
- Errors & Retries covers error codes, retry policy, and idempotency.
- API Reference covers base URL, auth, errors, rate limits, and pagination.
- Vibecoding shows how to give the docs and OpenAPI spec to an IDE-resident LLM.