
# Browser Agents

> AI-powered agents that autonomously complete browser tasks

<Tip>
  STOP. Do not read past this section until you have read and followed [/quickstart.md](/quickstart.md).

  Goal: produce a reliable SDK automation script. The intended authoring flow is to generate code and then edit it, not to write browser automation from scratch.

  Do not hand-write SDK code with element IDs, selectors, or form fields from memory. Element IDs, selectors, and field mappings must come from a live `observe()` call, CLI `notte page observe` output, or generated workflow code. Guessed browser targets fail on real pages.

  Use the Notte skill and CLI to inspect the live site, validate actions/selectors, handle auth/session state, and export workflow code with `notte sessions workflow-code` before using SDK docs or SDK code.

  SDK reference pages are for understanding, running, or editing generated workflow code. They are not the starting point for manually authoring the initial browser automation. SDK-first code is guesswork on real-world pages with dynamic selectors, auth state, CAPTCHAs, and anti-bot behavior.
</Tip>

Browser Agents are AI-powered systems that can autonomously navigate websites, complete tasks, and extract information using natural language instructions.

## What is a Browser Agent?

A Browser Agent combines:

* **Large Language Models (LLMs)** for reasoning and decision-making
* **Browser Sessions** for executing actions
* **Vision capabilities** to understand web pages
* **Autonomous planning** to complete multi-step tasks

Unlike scripted automation, agents can adapt to changes, handle unexpected scenarios, and complete tasks without predefined workflows.

## Quick Start

Create and run an agent in a few lines:

<CodeGroup>
  ```python Python theme={null}
  from notte_sdk import NotteClient

  client = NotteClient()

  with client.Session(open_viewer=True) as session:
      agent = client.Agent(session=session, max_steps=5)

      response = agent.run(
          task="Browse the Notte docs and book a demo for me",
          url="https://docs.notte.cc"
      )
      print(response)
  ```

  ```javascript JavaScript theme={null}
  import { NotteClient } from 'notte-sdk';

  const client = new NotteClient({
    apiKey: process.env.NOTTE_API_KEY,
  });

  await client.Session({ open_viewer: true }).use(async (session) => {
    const agent = client.Agent({ session, max_steps: 5 });

    const response = await agent.run({
      task: 'Browse the Notte docs and book a demo for me',
      url: 'https://docs.notte.cc',
    });

    console.log(response);
  });
  ```
</CodeGroup>

<Tip>
  Agents run within [browser sessions](/concepts/sessions). Use a context manager (`with` in Python, `.use()` in JavaScript) so that the session is stopped automatically when the block exits. This prevents orphaned sessions and unexpected costs.
</Tip>

## How Agents Work

### 1. Observation

The agent observes the current page state:

* Visible elements and their properties
* Interactive components (buttons, forms, links)
* Text content and structure
* Current URL and page metadata

### 2. Reasoning

Using the LLM, the agent:

* Understands the current page
* Plans the next action to complete the task
* Decides which element to interact with
* Determines when the task is complete

### 3. Action

The agent executes browser actions:

* Navigate to URLs
* Click buttons and links
* Fill forms
* Extract data
* Scroll and interact with dynamic content

### 4. Iteration

This cycle repeats until:

* The task is successfully completed
* Maximum steps are reached
* An error occurs that can't be resolved
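
The four phases above can be pictured as a simple loop. The following is a minimal sketch in plain Python, not the Notte implementation: `run_agent_loop`, `LoopResult`, and the `observe`, `decide`, and `act` callables are hypothetical names used only for illustration, and the string `"done"` is an assumed signal that the task is complete.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopResult:
    # Hypothetical result type for this sketch only.
    success: bool
    history: list[str] = field(default_factory=list)


def run_agent_loop(
    observe: Callable[[], str],
    decide: Callable[[str, str], str],
    act: Callable[[str], None],
    task: str,
    max_steps: int,
) -> LoopResult:
    """Repeat observe -> reason -> act until done, an error, or the step budget."""
    history: list[str] = []
    for _ in range(max_steps):
        page_state = observe()             # 1. Observation
        action = decide(task, page_state)  # 2. Reasoning (an LLM call in a real agent)
        if action == "done":               # assumed completion signal
            return LoopResult(success=True, history=history)
        try:
            act(action)                    # 3. Action
        except RuntimeError:
            # An error that cannot be resolved ends the run.
            return LoopResult(success=False, history=history)
        history.append(action)             # 4. Iteration: go around again
    # Maximum steps reached without completing the task.
    return LoopResult(success=False, history=history)
```

Each of the three stopping conditions listed above maps to one `return` statement, which is why a low step limit can end a run before the task is finished.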

## Agents vs Scripted Automation

Both agents and scripted automation run on [browser sessions](/concepts/sessions), the cloud browser infrastructure. The difference is how you control what happens in that session.

| Aspect          | Scripted Automation     | Agent                           |
| --------------- | ----------------------- | ------------------------------- |
| **Control**     | You write the code      | AI decides each step            |
| **Flexibility** | Fixed workflow          | Adapts to changes               |
| **Speed**       | Fast (direct execution) | Slower (LLM reasoning per step) |
| **Cost**        | Browser minutes only    | Browser minutes + LLM calls     |
| **Reliability** | Deterministic           | Can vary based on page state    |
| **Use Case**    | Known, stable workflows | Unknown or dynamic workflows    |

**Use scripted automation when:**

* You know the exact steps to take
* Speed and cost are critical
* The target pages rarely change

**Use agents when:**

* You don't know the exact steps
* Pages change frequently
* You need intelligent decision-making

<Note>
  You can combine both approaches: use an agent to figure out a workflow, then [convert it to a function](/features/agents/workflows) for faster, cheaper repeated execution.
</Note>
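
As a rough illustration of combining the two approaches, the sketch below runs a known sequence of scripted steps and only hands over to an agent when a step fails. It is plain Python and does not use the Notte SDK: `run_with_fallback`, `scripted_steps`, and `agent_fallback` are hypothetical names chosen for this example, and the callables stand in for real session actions and a real `agent.run(...)` call.

```python
from typing import Callable

# A scripted step is any zero-argument callable that raises on failure.
Step = Callable[[], None]


def run_with_fallback(scripted_steps: list[Step], agent_fallback: Callable[[], bool]) -> str:
    """Try the fast scripted path first; hand over to the agent if a step raises.

    Returns "scripted", "agent", or "failed" to show which path finished the task.
    """
    try:
        for step in scripted_steps:
            step()
        return "scripted"
    except Exception:
        # The page changed or a step broke: let the slower, costlier agent try.
        return "agent" if agent_fallback() else "failed"
```

The scripted path stays fast and cheap in the common case, and the LLM cost is paid only when the page no longer matches the script.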

## Agent Capabilities

Agents include the following built-in capabilities:

<CardGroup cols={2}>
  <Card title="Structured Output" icon="brackets-curly" href="/features/agents/structured-output">
    Get type-safe responses using Pydantic models
  </Card>

  <Card title="Vaults & Personas" icon="lock" href="/concepts/vaults">
    Use credentials and identities in automations
  </Card>

  <Card title="Visual Understanding" icon="eye" href="/features/agents/configuration#use_vision">
    Analyze images and visual page elements
  </Card>

  <Card title="Replay & Debugging" icon="circle-play" href="/features/agents/replay">
    Debug with MP4 replays of agent execution
  </Card>

  <Card title="Agent Fallback" icon="shield" href="/features/agents/fallback">
    Automatic recovery from script failures
  </Card>
</CardGroup>

## Key Concepts

### Natural Language Tasks

Give instructions in plain English:

<CodeGroup>
  ```python Python theme={null}
  from notte_sdk import NotteClient

  client = NotteClient()

  with client.Session() as session:
      agent = client.Agent(session=session)
      agent.run(task="Find the cheapest laptop under $1000 and add it to cart")
  ```

  ```javascript JavaScript theme={null}
  import { NotteClient } from 'notte-sdk';

  const client = new NotteClient({
    apiKey: process.env.NOTTE_API_KEY,
  });

  await client.Session().use(async (session) => {
    const agent = client.Agent({ session });
    await agent.run({
      task: 'Find the cheapest laptop under $1000 and add it to cart',
    });
  });
  ```
</CodeGroup>

### Structured Output

Get responses in a specific format:

<CodeGroup>
  ```python Python theme={null}
  from notte_sdk import NotteClient
  from pydantic import BaseModel

  client = NotteClient()


  class ContactInfo(BaseModel):
      email: str
      phone: str | None


  with client.Session() as session:
      agent = client.Agent(session=session)
      result = agent.run(task="Extract contact information", response_format=ContactInfo)
  ```

  ```javascript JavaScript theme={null}
  import { z } from 'zod';
  import { NotteClient } from 'notte-sdk';

  const client = new NotteClient({
    apiKey: process.env.NOTTE_API_KEY,
  });

  const ContactInfo = z.object({
    email: z.string(),
    phone: z.string().nullable(),
  });

  await client.Session().use(async (session) => {
    const agent = client.Agent({ session });
    const result = await agent.run({
      task: 'Extract contact information',
      response_format: ContactInfo,
    });
  });
  ```
</CodeGroup>

### Starting URL

Begin at a specific page:

<CodeGroup>
  ```python Python theme={null}
  # Assumes `client` and `session` are set up as in the Quick Start example
  agent = client.Agent(session=session)
  agent.run(task="Find pricing information", url="https://example.com/products")
  ```

  ```javascript JavaScript theme={null}
  // Assumes `client` and `session` are set up as in the Quick Start example
  const agent = client.Agent({ session });

  await agent.run({
    task: 'Find pricing information',
    url: 'https://example.com/products',
  });
  ```
</CodeGroup>

### Step Limits

Cap the number of actions an agent can take with `max_steps`:

<CodeGroup>
  ```python Python theme={null}
  from notte_sdk import NotteClient

  client = NotteClient()

  with client.Session() as session:
      # max_steps is set on the agent, as in the Quick Start example
      agent = client.Agent(session=session, max_steps=20)  # Limit to 20 actions
      agent.run(task="Find and summarize the top 5 AI news stories from today")
  ```

  ```javascript JavaScript theme={null}
  import { NotteClient } from 'notte-sdk';

  const client = new NotteClient({
    apiKey: process.env.NOTTE_API_KEY,
  });

  await client.Session().use(async (session) => {
    // max_steps is set on the agent, as in the Quick Start example
    const agent = client.Agent({ session, max_steps: 20 }); // Limit to 20 actions
    await agent.run({
      task: 'Find and summarize the top 5 AI news stories from today',
    });
  });
  ```
</CodeGroup>

## Error Handling

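An agent run can end without completing the task, for example when it reaches the step limit or hits an error it cannot resolve. Check `result.success` before relying on the answer: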

<CodeGroup>
  ```python Python theme={null}
  # Assumes `client` and `session` are set up as in the Quick Start example
  agent = client.Agent(session=session)
  result = agent.run(task="Complete task")

  if result.success:
      print(result.answer)
  else:
      print(f"Agent failed: {result.answer}")
  ```

  ```javascript JavaScript theme={null}
  // Assumes `client` and `session` are set up as in the Quick Start example
  const agent = client.Agent({ session });
  const result = await agent.run({ task: 'Complete task' });

  if (result.success) {
    console.log(result.answer);
  } else {
    console.log(`Agent failed: ${result.answer}`);
  }
  ```
</CodeGroup>
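
Because a failed run is reported through `result.success` rather than raised, one possible pattern is to retry a bounded number of times. The sketch below is an assumption-laden illustration, not part of the SDK: `run_with_retries` and `RunResult` are hypothetical, and `RunResult` models only the `success` and `answer` fields shown above so that the example runs without a Notte session.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunResult:
    # Stand-in for the agent response; only the two fields used above are modeled.
    success: bool
    answer: str


def run_with_retries(run_once: Callable[[], RunResult], max_attempts: int = 3) -> RunResult:
    """Call run_once until it reports success or the attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    result = run_once()
    attempts = 1
    while not result.success and attempts < max_attempts:
        result = run_once()
        attempts += 1
    # Either the first successful result or the last failed one.
    return result
```

In real use, `run_once` might wrap a call such as `agent.run(task=...)`. Whether a retry should reuse the same session or start a fresh one depends on the task, so treat that choice as open here.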

## Next Steps

<CardGroup cols={2}>
  <Card title="Agent Lifecycle" icon="rotate" href="/features/agents/lifecycle">
    Create, manage, and stop agents
  </Card>

  <Card title="Agent Configuration" icon="gear" href="/features/agents/configuration">
    All configuration options
  </Card>

  <Card title="Structured Output" icon="brackets-curly" href="/features/agents/structured-output">
    Get typed responses from agents
  </Card>

  <Card title="Convert to Functions" icon="flower" href="/features/agents/workflows">
    Turn agent runs into reusable code
  </Card>
</CardGroup>
