Signature

def scrape(self, **data: typing.Unpack[notte_sdk.types.ScrapeRequestDict]) -> notte_sdk.types.ScrapeResponse

Parameters

scrape_links
bool

Whether to scrape links from the page. Links are scraped by default.

scrape_images
bool

Whether to scrape images from the page. Images are scraped by default.

only_main_content
bool

Whether to only scrape the main content of the page. If True, navbars, footers, etc. are excluded.

response_format
type[pydantic.main.BaseModel] | None

The response format to use for the scrape. You can use a Pydantic model or a JSON Schema dict.

instructions
str | None

Additional instructions to use for the scrape.

use_llm
bool | None

Whether to use an LLM for the extraction process.

Whether to use link/image placeholders, which reduces both the number of tokens in the prompt and the risk of hallucinations.

url
str | None

The URL to scrape. If not provided, uses the current page URL.

Returns

ScrapeResponse: An Observation object containing metadata, screenshot, action space, and data space.
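
Example

The sketch below illustrates how the keyword arguments documented above might be assembled and unpacked into a `scrape` call. It does not import the SDK: `StubSession` and `ScrapeKwargs` are hypothetical stand-ins written for this example, and the stub simply echoes back what it receives. The URL and the schema are made-up illustration values. With the real SDK, the same `**kwargs` unpacking would presumably be applied to whatever object exposes `scrape`.

```python
from typing import Any, Dict, Optional, TypedDict


class ScrapeKwargs(TypedDict, total=False):
    """Hypothetical local mirror of some parameters documented above."""

    scrape_images: bool
    only_main_content: bool
    response_format: Any  # a Pydantic model class or a JSON Schema dict
    instructions: Optional[str]
    use_llm: Optional[bool]
    url: Optional[str]


class StubSession:
    """Stand-in for the real SDK object (assumption, not the SDK's class)."""

    def scrape(self, **data: Any) -> Dict[str, Any]:
        # The real method returns a ScrapeResponse; this stub only echoes
        # the keyword arguments so the example runs without network access.
        return dict(data)


# JSON Schema dict describing the structured output we would like back.
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["name", "price"],
}

# Only the options we care about are set; omitted keys keep their defaults.
kwargs: ScrapeKwargs = {
    "url": "https://example.com/product/1",
    "only_main_content": True,
    "scrape_images": False,
    "response_format": product_schema,
    "instructions": "Extract the product name and price.",
}

session = StubSession()
echoed = session.scrape(**kwargs)
print(sorted(echoed))
```

Because every parameter is passed by keyword, leaving one out (here `use_llm`) means it is never sent, so the documented default behaviour applies.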