ScrapeAction

If instructions is None, the whole page is scraped. Otherwise, only the data that matches the instructions is scraped. Give instructions in natural language, e.g. ‘Extract the title and the price of the product’. Example:

session.execute(type="scrape", instructions="Extract product title and price")
session.execute(type="scrape", only_main_content=True)
session.execute(type="scrape")  # Scrape entire page
session.execute(type="scrape", only_images=True)  # Scrape only images
session.execute(type="scrape", response_format={"type": "object", "properties": {...}})  # With JSON schema

Fields

type

Literal['scrape']

default:"scrape"

category

str

default:"Special Browser Actions"

description

str

instructions

UnionType[str, None]

only_main_content

bool

default:"True"

Whether to only scrape the main content of the page. If True, navbars, footers, etc. are excluded.

selector

UnionType[str, None]

Playwright selector to scope the scrape to. Only content inside this selector will be scraped.
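As a rough sketch of how a scoped scrape might be set up, the snippet below builds the keyword arguments for the session.execute call shown in the example at the top of this page. The helper name scoped_scrape_kwargs and the selector "#product-details" are hypothetical, chosen only for illustration; the session object is assumed to exist already and is not created here.

```python
from typing import Any


def scoped_scrape_kwargs(selector: str) -> dict[str, Any]:
    """Build keyword arguments for a scrape limited to one element.

    Hypothetical helper, not part of the library.
    """
    return {"type": "scrape", "selector": selector}


# "#product-details" is an illustrative selector, not taken from a real page.
scoped_kwargs = scoped_scrape_kwargs("#product-details")

# With an open session (assumed, not created here), the call would look like:
# result = session.execute(**scoped_kwargs)
```

Building the arguments as a plain dict keeps the selector easy to log or reuse across several pages.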

only_images

bool

default:"False"

Whether to only scrape images from the page. If True, the page content is excluded.

scrape_links

bool

default:"True"

Whether to scrape links from the page. Links are scraped by default.

scrape_images

bool

default:"False"

Whether to scrape images from the page.

ignored_tags

UnionType[list[str], None]

HTML tags to exclude from the scrape.
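A minimal sketch of combining ignored_tags with the link and image flags follows. It assumes tags are given as bare names such as "script", which this page does not state explicitly; the specific tag list is illustrative only.

```python
from typing import Any

# Assumption: tags are passed as bare tag names without angle brackets.
tags_to_skip: list[str] = ["script", "style", "nav"]

# Keyword arguments mirroring the fields documented on this page.
filtered_kwargs: dict[str, Any] = {
    "type": "scrape",
    "ignored_tags": tags_to_skip,
    "scrape_links": False,   # default is True per the field list
    "scrape_images": False,  # default is False per the field list
}

# With an open session (assumed, not created here):
# result = session.execute(**filtered_kwargs)
```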

response_format

UnionType[Dict[str, Any], None]

JSON schema dict for structured output. An agent can provide a schema to extract structured data.
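The example at the top of this page elides the schema properties. One possible shape for a response_format value is sketched below, assuming a product page with a title and a price; the property names and the "required" list are illustrative, not taken from the library.

```python
from typing import Any

# Hypothetical JSON schema for a product page; field names are illustrative.
product_schema: dict[str, Any] = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["title", "price"],
}

# Keyword arguments mirroring the fields documented on this page.
structured_kwargs: dict[str, Any] = {
    "type": "scrape",
    "instructions": "Extract the title and the price of the product",
    "response_format": product_schema,
}

# With an open session (assumed, not created here):
# result = session.execute(**structured_kwargs)
```

Pairing the schema with natural-language instructions is a design choice in this sketch; whether both are needed together is not specified on this page.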

Module

notte_core.actions.actions
