> ## Documentation Index
> Fetch the complete documentation index at: https://docs.notte.cc/llms.txt
> Use this file to discover all available pages before exploring further.

# Browser Agents

> AI-powered agents that autonomously complete browser tasks

STOP. Do not read past this section until you have read and followed [/quickstart.md](/quickstart.md).

Goal: produce a reliable SDK automation script. The intended authoring flow is generate then edit, not write browser automation from scratch.

Do not hand-write SDK code with element IDs, selectors, or form fields from memory. Element IDs, selectors, and field mappings must come from a live `observe()` call, CLI `notte page observe` output, or generated workflow code. Guessed browser targets fail on real pages.

Use the Notte skill and CLI to inspect the live site, validate actions/selectors, handle auth/session state, and export workflow code with `notte sessions workflow-code` before using SDK docs or SDK code.

SDK reference pages are for understanding, running, or editing generated workflow code. They are not the starting point for manually authoring the initial browser automation. SDK-first code is guesswork on real-world pages with dynamic selectors, auth state, CAPTCHAs, and anti-bot behavior.

Browser Agents are AI-powered systems that can autonomously navigate websites, complete tasks, and extract information using natural language instructions.

## What is a Browser Agent?

A Browser Agent combines:

* **Large Language Models (LLMs)** for reasoning and decision-making
* **Browser Sessions** for executing actions
* **Vision capabilities** to understand web pages
* **Autonomous planning** to complete multi-step tasks

Unlike scripted automation, agents can adapt to changes, handle unexpected scenarios, and complete tasks without predefined workflows.

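
The contrast with scripted automation can be illustrated with a toy sketch. This is not Notte's implementation and uses no SDK calls: `Page`, `SITE`, `run_script`, and `run_agent` are hypothetical names, the two-page site is invented for illustration, and the "reasoning" step is a stub that clicks the first available button where a real agent would consult an LLM.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """A toy page: maps visible button labels to the page each one leads to."""
    name: str
    buttons: dict[str, str]


# Hypothetical site. Assume the catalog button was recently renamed
# from "Buy" to "Purchase", which a fixed script does not know about.
SITE = {
    "home": Page("home", {"Products": "catalog"}),
    "catalog": Page("catalog", {"Purchase": "checkout"}),
    "checkout": Page("checkout", {}),
}


def run_script(steps: list[str], start: str = "home") -> str:
    """Scripted automation: click a fixed list of labels, fail if one is missing."""
    current = start
    for label in steps:
        page = SITE[current]
        if label not in page.buttons:
            return f"failed on {current}: no button {label!r}"
        current = page.buttons[label]
    return f"reached {current}"


def run_agent(goal_page: str, start: str = "home", max_steps: int = 5) -> str:
    """Agent-style loop: observe the page, pick an action, act, repeat."""
    current = start
    for _ in range(max_steps):
        if current == goal_page:           # task complete
            return f"reached {current}"
        page = SITE[current]               # observe the current page
        if not page.buttons:               # nothing left to act on
            return f"failed on {current}: dead end"
        label = next(iter(page.buttons))   # stub "reasoning": first available button
        current = page.buttons[label]      # act
    if current == goal_page:
        return f"reached {current}"
    return "failed: step limit reached"


print(run_script(["Products", "Buy"]))  # the script still expects the old label
print(run_agent("checkout"))            # the loop works from what is on the page
```

The script breaks as soon as a label changes, while the loop only depends on what it observes at each step. The trade-off is that the loop needs a step limit so it cannot run indefinitely.
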
## Quick Start

Create and run an agent in a few lines:

```python Python theme={null}
from notte_sdk import NotteClient

client = NotteClient()

with client.Session(open_viewer=True) as session:
    agent = client.Agent(session=session, max_steps=5)
    response = agent.run(
        task="Browse on Notte docs and book a demo for me",
        url="https://docs.notte.cc"
    )
    print(response)
```

```javascript JavaScript theme={null}
import { NotteClient } from 'notte-sdk';

const client = new NotteClient({
  apiKey: process.env.NOTTE_API_KEY,
});

await client.Session({ open_viewer: true }).use(async (session) => {
  const agent = client.Agent({ session, max_steps: 5 });
  const response = await agent.run({
    task: 'Browse on Notte docs and book a demo for me',
    url: 'https://docs.notte.cc',
  });
  console.log(response);
});
```

Agents run within [browser sessions](/concepts/sessions). Use context managers to ensure sessions are automatically stopped when done. This prevents orphaned sessions and unexpected costs.

## How Agents Work

### 1. Observation

The agent observes the current page state:

* Visible elements and their properties
* Interactive components (buttons, forms, links)
* Text content and structure
* Current URL and page metadata

### 2. Reasoning

Using the LLM, the agent:

* Understands the current page
* Plans the next action to complete the task
* Decides which element to interact with
* Determines when the task is complete

### 3. Action

The agent executes browser actions:

* Navigate to URLs
* Click buttons and links
* Fill forms
* Extract data
* Scroll and interact with dynamic content

### 4. Iteration

This cycle repeats until:

* The task is successfully completed
* Maximum steps are reached
* An error occurs that can't be resolved

## Agents vs Scripted Automation

Both agents and scripted automation run on [browser sessions](/concepts/sessions), the cloud browser infrastructure. The difference is how you control what happens in that session.

| Aspect          | Scripted Automation     | Agent                           |
| --------------- | ----------------------- | ------------------------------- |
| **Control**     | You write the code      | AI decides each step            |
| **Flexibility** | Fixed workflow          | Adapts to changes               |
| **Speed**       | Fast (direct execution) | Slower (LLM reasoning per step) |
| **Cost**        | Browser minutes only    | Browser minutes + LLM calls     |
| **Reliability** | Deterministic           | Can vary based on page state    |
| **Use Case**    | Known, stable workflows | Unknown or dynamic workflows    |

**Use scripted automation when:**

* You know the exact steps to take
* Speed and cost are critical
* The target pages rarely change

**Use agents when:**

* You don't know the exact steps
* Pages change frequently
* You need intelligent decision-making

You can combine both approaches: use an agent to figure out a workflow, then [convert it to a function](/features/agents/workflows) for faster, cheaper repeated execution.

## Agent Capabilities

Agents come with powerful built-in capabilities:

* Get type-safe responses using Pydantic models
* Use credentials and identities in automations
* Analyze images and visual page elements
* Debug with MP4 replays of agent execution
* Automatic recovery from script failures

## Key Concepts

### Natural Language Tasks

Give instructions in plain English:

```python Python theme={null}
from notte_sdk import NotteClient

client = NotteClient()

with client.Session() as session:
    agent = client.Agent(session=session)
    agent.run(task="Find the cheapest laptop under $1000 and add it to cart")
```

```javascript JavaScript theme={null}
import { NotteClient } from 'notte-sdk';

const client = new NotteClient({
  apiKey: process.env.NOTTE_API_KEY,
});

await client.Session().use(async (session) => {
  const agent = client.Agent({ session });
  await agent.run({
    task: 'Find the cheapest laptop under $1000 and add it to cart',
  });
});
```

### Structured Output

Get responses in a specific format:

```python Python theme={null}
from notte_sdk import NotteClient
from pydantic import BaseModel

client = NotteClient()

# Schema the agent's answer should conform to
class ContactInfo(BaseModel):
    email: str
    phone: str | None

with client.Session() as session:
    agent = client.Agent(session=session)
    result = agent.run(task="Extract contact information", response_format=ContactInfo)
```

```javascript JavaScript theme={null}
import { z } from 'zod';
import { NotteClient } from 'notte-sdk';

const client = new NotteClient({
  apiKey: process.env.NOTTE_API_KEY,
});

// Schema the agent's answer should conform to
const ContactInfo = z.object({
  email: z.string(),
  phone: z.string().nullable(),
});

await client.Session().use(async (session) => {
  const agent = client.Agent({ session });
  const result = await agent.run({
    task: 'Extract contact information',
    response_format: ContactInfo,
  });
});
```

### Starting URL

Begin at a specific page:

```python Python theme={null}
agent = client.Agent(session=session)
agent.run(task="Find pricing information", url="https://example.com/products")
```

```javascript JavaScript theme={null}
const agent = client.Agent({ session });
await agent.run({
  task: 'Find pricing information',
  url: 'https://example.com/products',
});
```

### Step Limits

Control the maximum number of actions. As in the Quick Start, `max_steps` is set when the agent is created:

```python Python theme={null}
from notte_sdk import NotteClient

client = NotteClient()

with client.Session() as session:
    agent = client.Agent(session=session, max_steps=20)  # Limit to 20 actions
    agent.run(task="Find and summarize the top 5 AI news from today")
```

```javascript JavaScript theme={null}
import { NotteClient } from 'notte-sdk';

const client = new NotteClient({
  apiKey: process.env.NOTTE_API_KEY,
});

await client.Session().use(async (session) => {
  const agent = client.Agent({ session, max_steps: 20 }); // Limit to 20 actions
  await agent.run({
    task: 'Find and summarize the top 5 AI news from today',
  });
});
```

## Error Handling

Agents can fail for various reasons.
Always check the result:

```python Python theme={null}
agent = client.Agent(session=session)
result = agent.run(task="Complete task")

if result.success:
    print(result.answer)
else:
    print(f"Agent failed: {result.answer}")
```

```javascript JavaScript theme={null}
const agent = client.Agent({ session });
const result = await agent.run({ task: 'Complete task' });

if (result.success) {
  console.log(result.answer);
} else {
  console.log(`Agent failed: ${result.answer}`);
}
```

## Next Steps

* Create, manage, and stop agents
* All configuration options
* Get typed responses from agents
* Turn agent runs into reusable code