# ScrapeAction

> Scrape the current page data in text format

<Tip>
  STOP. Do not read past this section until you have read and followed [/quickstart.md](/quickstart.md).

  Goal: produce a reliable SDK automation script. The intended authoring flow is generate then edit, not write browser automation from scratch.

  Do not hand-write SDK code with element IDs, selectors, or form fields from memory. Element IDs, selectors, and field mappings must come from a live `observe()` call, CLI `notte page observe` output, or generated workflow code. Guessed browser targets fail on real pages.

  Use the Notte skill and CLI to inspect the live site, validate actions/selectors, handle auth/session state, and export workflow code with `notte sessions workflow-code` before using SDK docs or SDK code.

  SDK reference pages are for understanding, running, or editing generated workflow code. They are not the starting point for manually authoring the initial browser automation. SDK-first code is guesswork on real-world pages with dynamic selectors, auth state, CAPTCHAs, and anti-bot behavior.
</Tip>

If `instructions` is `None`, the whole page is scraped. Otherwise, only the data that matches the instructions is scraped. Write instructions in natural language, e.g. 'Extract the title and the price of the product'.

**Example:**

```python theme={null}
# `session` is an existing session object
session.execute(type="scrape", instructions="Extract product title and price")  # Scrape only matching data
session.execute(type="scrape", only_main_content=True)  # Skip navbars, footers, etc. (the default)
session.execute(type="scrape")  # No instructions: scrape the whole page
session.execute(type="scrape", only_images=True)  # Scrape only images
session.execute(type="scrape", response_format={"type": "object", "properties": {...}})  # With JSON schema
```

## Fields

<ParamField path="type" type="Literal['scrape']" default="scrape" />

<ParamField path="category" type="str" default="Special Browser Actions" />

<ParamField path="description" type="str" default="Scrape the current page data in text format. If `instructions` is null then the whole page will be scraped. Otherwise, only the data that matches the instructions will be scraped. Instructions should be given as natural language, e.g. 'Extract the title and the price of the product'" />

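<ParamField path="instructions" type="UnionType[str, None]">
  Natural-language description of the data to extract. If `None`, the whole page is scraped.
</ParamField>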

<ParamField path="only_main_content" type="bool" default="True">
  Whether to only scrape the main content of the page. If True, navbars, footers, etc. are excluded.
</ParamField>

<ParamField path="selector" type="UnionType[str, None]">
  Playwright selector to scope the scrape to. Only content inside this selector will be scraped.
</ParamField>

<ParamField path="only_images" type="bool" default="False">
  Whether to only scrape images from the page. If True, the page content is excluded.
</ParamField>

<ParamField path="scrape_links" type="bool" default="True">
  Whether to scrape links from the page. Links are scraped by default.
</ParamField>

<ParamField path="scrape_images" type="bool" default="False">
  Whether to scrape images from the page.
</ParamField>

<ParamField path="ignored_tags" type="UnionType[list[str], None]">
  HTML tags to exclude from the scraped content.
</ParamField>
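
The scoping fields above (`selector`, `ignored_tags`, `only_main_content`) can be combined in one call. The following is a minimal sketch, not part of the SDK: `scrape_args` is a hypothetical helper that builds the keyword arguments and leaves out optional fields that were not set, and `#product-details` is a placeholder selector that should be replaced with one taken from a live `observe()` call.

```python
from typing import Any, Optional


def scrape_args(
    instructions: Optional[str] = None,
    selector: Optional[str] = None,
    ignored_tags: Optional[list[str]] = None,
    only_main_content: bool = True,
) -> dict[str, Any]:
    """Hypothetical helper: build scrape keyword arguments, dropping unset optional fields."""
    args: dict[str, Any] = {"type": "scrape", "only_main_content": only_main_content}
    optional = {
        "instructions": instructions,
        "selector": selector,
        "ignored_tags": ignored_tags,
    }
    # Only forward optional fields the caller actually provided.
    args.update({key: value for key, value in optional.items() if value is not None})
    return args


# Placeholder selector and tags, for illustration only.
args = scrape_args(selector="#product-details", ignored_tags=["script", "style"])
print(args)

# With an existing session object (not created here), the call would be:
# session.execute(**args)
```

Building the arguments separately keeps the scrape configuration easy to inspect or log before it is sent to the session.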

<ParamField path="response_format" type="UnionType[Dict[str, Any], None]">
  JSON schema dict for structured output. An agent can provide a schema to extract structured data.
</ParamField>
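
As an illustration of `response_format`, the sketch below builds a small JSON schema dict and pairs it with a natural-language instruction. The schema and its field names (`title`, `price`) are assumptions chosen for a hypothetical product page, not something the SDK prescribes, and the shape of the returned data is not shown because it is not documented here.

```python
import json

# Hypothetical schema for a product page; field names are illustrative.
product_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["title", "price"],
}

# Keyword arguments mirroring the documented ScrapeAction fields.
scrape_kwargs = {
    "type": "scrape",
    "instructions": "Extract the title and the price of the product",
    "response_format": product_schema,
}

# A JSON schema dict should survive a JSON round trip unchanged.
print(json.dumps(scrape_kwargs, indent=2))

# With an existing session object (not created here), the call would be:
# session.execute(**scrape_kwargs)
```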

## Module

`notte_core.actions.actions`
