Version: 3.15

Request<UserData>

Represents a URL to be crawled, optionally including HTTP method, headers, payload and other metadata. The Request object also stores information about errors that occurred during processing of the request.

Each Request instance has the uniqueKey property, which can be either specified manually in the constructor or generated automatically from the URL. Two requests with the same uniqueKey are considered as pointing to the same web resource. This behavior applies to all Crawlee classes, such as RequestList, RequestQueue, PuppeteerCrawler or PlaywrightCrawler.
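
The deduplication rule above can be illustrated with a minimal sketch. It does not use Crawlee itself: `RequestLike` and `dedupeByUniqueKey` are hypothetical names introduced here, and falling back to the raw URL when `uniqueKey` is missing is a simplifying assumption (the library's automatic key generation may normalize the URL first).

```typescript
// Hypothetical stand-in for a request; not the Crawlee Request class.
interface RequestLike {
    url: string;
    uniqueKey?: string;
}

// Keeps the first request seen for each uniqueKey.
// Assumption: a missing uniqueKey falls back to the raw URL.
function dedupeByUniqueKey(requests: RequestLike[]): RequestLike[] {
    const seen = new Set<string>();
    const result: RequestLike[] = [];
    for (const request of requests) {
        const key = request.uniqueKey ?? request.url;
        if (seen.has(key)) continue; // same key, treated as the same resource
        seen.add(key);
        result.push(request);
    }
    return result;
}

const unique = dedupeByUniqueKey([
    { url: 'http://www.example.com/a' },
    { url: 'http://www.example.com/a' },
    { url: 'http://www.example.com/a', uniqueKey: 'a-with-post-payload' },
]);
```

In this sketch the second entry is dropped because its key matches the first, while the third survives because it was given a distinct `uniqueKey` manually.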

To access and examine the actual request sent over HTTP, with all autofilled headers, use the response.request object in the request handler.

Example use:

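import { Request } from 'crawlee';

// Create a request with a custom header.
const request = new Request({
    url: 'http://www.example.com',
    headers: { Accept: 'application/json' },
});

// ...

// Attach custom data and record a processing error.
request.userData.foo = 'bar';
request.pushErrorMessage(new Error('Request failed!'));

// ...

// Read the custom data back later.
const foo = request.userData.foo;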

Constructors

constructor

new Request<UserData>(options): Request<UserData>

Parameters
- options: RequestOptions<UserData>
  Request parameters, including the URL, HTTP method, headers, and others.
Returns Request<UserData>

Properties

errorMessages

errorMessages: string[]

An array of error messages from request processing.

optional handledAt

handledAt?: string

ISO datetime string indicating when the request was processed. It is null if the request has not been crawled yet.
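
Because handledAt is a string rather than a Date, code that needs to compare times has to parse it first. The following is a minimal sketch; `handledTime` is a hypothetical helper introduced here, not part of the library.

```typescript
// Hypothetical helper: converts a handledAt value into a Date.
// Returns undefined when the request has not been handled yet
// or when the string cannot be parsed as a date.
function handledTime(handledAt?: string | null): Date | undefined {
    if (!handledAt) return undefined;
    const parsed = new Date(handledAt);
    return Number.isNaN(parsed.getTime()) ? undefined : parsed;
}

const handled = handledTime('2024-01-02T03:04:05.000Z');
const notHandled = handledTime(undefined);
```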

optional headers

headers?: Record<string, string>

Object with HTTP headers. Each key is a header name and each value is the corresponding header value.

optional id

id?: string

Request ID

optional loadedUrl

loadedUrl?: string

The URL that was actually loaded after redirects, if present. HTTP redirects are guaranteed to be included.

When using PuppeteerCrawler or PlaywrightCrawler, meta tag and JavaScript redirects may or may not be included, depending on their nature. In general, redirects that happen immediately will most likely be included, but delayed redirects will not.
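
One way to use loadedUrl is to check whether a request ended up somewhere other than its original URL. The sketch below is a simplified illustration: `LoadedRequestLike` and `wasRedirected` are hypothetical names, and the plain string comparison is an assumption that may report a redirect for URLs that differ only in normalization (for example, a trailing slash).

```typescript
// Hypothetical stand-in exposing only the two fields used here.
interface LoadedRequestLike {
    url: string;
    loadedUrl?: string;
}

// Returns true when a loaded URL is present and differs from the requested URL.
// Assumption: exact string comparison is good enough for the caller.
function wasRedirected(request: LoadedRequestLike): boolean {
    return request.loadedUrl !== undefined && request.loadedUrl !== request.url;
}

const redirected = wasRedirected({
    url: 'http://www.example.com',
    loadedUrl: 'https://www.example.com/home',
});
const notLoadedYet = wasRedirected({ url: 'http://www.example.com' });
```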

method

method: AllowedHttpMethods

HTTP method, e.g. GET or POST.

noRetry

noRetry: boolean

If true, the request will not be automatically retried on error.

optional payload

payload?: string

HTTP request payload, e.g. for POST requests.

retryCount

retryCount: number

Number of times the crawling of this request has been retried after an error.

uniqueKey

uniqueKey: string

A unique key identifying the request. Two requests with the same uniqueKey are considered as pointing to the same URL.

url

url: string

URL of the web page to crawl.

userData

userData: UserData = ...

Custom user data assigned to the request.

Accessors

crawlDepth

get crawlDepth(): number
set crawlDepth(value): void

Depth of the request in the current crawl tree. Note that this is dependent on the crawler setup and might produce unexpected results when used with multiple crawlers.
Returns number
Parameters
- value: number
Returns void

label

get label(): undefined | string
set label(value): void

Shortcut for getting request.userData.label.
Returns undefined | string
Shortcut for setting request.userData.label.
Parameters
- value: undefined | string
Returns void
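
The relationship between label and userData.label can be sketched as a getter and setter pair that delegate to userData. This is an illustration only, not the library's source: `LabeledRequestSketch` is a hypothetical class written for this example.

```typescript
// Hypothetical class showing how a label accessor can delegate to userData.
class LabeledRequestSketch {
    userData: Record<string, unknown> = {};

    // Reads the label from userData; non-string values are reported as undefined.
    get label(): string | undefined {
        const value = this.userData['label'];
        return typeof value === 'string' ? value : undefined;
    }

    // Writes the label into userData.
    set label(value: string | undefined) {
        this.userData['label'] = value;
    }
}

const labeled = new LabeledRequestSketch();
labeled.label = 'DETAIL';
const storedLabel = labeled.userData['label'];
```

After the assignment, both `labeled.label` and `labeled.userData['label']` hold the same value, which is what makes the accessor a shortcut rather than a separate field.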

maxRetries

get maxRetries(): undefined | number
set maxRetries(value): void

Maximum number of retries for this request. Overrides the global maxRequestRetries option of BasicCrawler.
Returns undefined | number
Parameters
- value: undefined | number
Returns void
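
How a per-request limit might combine with the global one can be sketched as follows. This is a guess at the decision logic under stated assumptions, not the crawler's actual implementation: `RetryRequestLike` and `shouldRetry` are hypothetical, and the sketch assumes that noRetry takes precedence and that retrying stops once retryCount reaches the effective limit.

```typescript
// Hypothetical stand-in exposing only the retry-related fields.
interface RetryRequestLike {
    retryCount: number;
    noRetry: boolean;
    maxRetries?: number;
}

// Decides whether another retry is allowed.
// Assumption: the per-request maxRetries, when set, replaces the global limit.
function shouldRetry(request: RetryRequestLike, globalMaxRequestRetries: number): boolean {
    if (request.noRetry) return false;
    const limit = request.maxRetries ?? globalMaxRequestRetries;
    return request.retryCount < limit;
}

const usesGlobalLimit = shouldRetry({ retryCount: 3, noRetry: false }, 3);
const usesOwnLimit = shouldRetry({ retryCount: 3, noRetry: false, maxRetries: 5 }, 3);
const neverRetried = shouldRetry({ retryCount: 0, noRetry: true }, 3);
```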

sessionRotationCount

get sessionRotationCount(): number
set sessionRotationCount(value): void

Number of times the session has been rotated while crawling this request because of a session or proxy error.
Returns number
Parameters
- value: number
Returns void

skipNavigation

get skipNavigation(): boolean
set skipNavigation(value): void

Tells the crawler processing this request to skip the navigation and process the request directly.
Returns boolean
Parameters
- value: boolean
Returns void

state

get state(): RequestState
set state(value): void

Describes the request's current lifecycle state.
Returns RequestState
Parameters
- value: RequestState
Returns void

Methods

pushErrorMessage

pushErrorMessage(errorOrMessage, options): void

Stores information about an error that occurred during processing of this request.

You should always use Error instances when throwing errors in JavaScript.

Nevertheless, third-party libraries may not always throw an Error instance. To improve the debugging experience in that case, the function inspects the type of the passed argument and attempts to extract as much information as possible, since simply throwing a bad type error would make debugging rather difficult.
Parameters
- errorOrMessage: unknown
  Error object or error message to be stored in the request.
- optional options: PushErrorMessageOptions = {}
Returns void
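
The kind of type inspection described above can be sketched as a function that turns any thrown value into a message string. This is not the library's implementation: `describeThrown` is a hypothetical helper, and the exact cases it handles and the fallback text are assumptions made for illustration.

```typescript
// Hypothetical helper: extracts a readable message from any thrown value.
function describeThrown(value: unknown): string {
    // Proper Error instances carry their own message.
    if (value instanceof Error) return value.message;
    // Some libraries throw plain strings.
    if (typeof value === 'string') return value;
    // Others throw plain objects that happen to have a string message field.
    if (typeof value === 'object' && value !== null && 'message' in value) {
        const message = (value as { message: unknown }).message;
        if (typeof message === 'string') return message;
    }
    // Fall back to reporting the type so the entry is never empty.
    return `Unknown error of type ${typeof value}`;
}

const errorMessages: string[] = [
    describeThrown(new Error('Request failed!')),
    describeThrown('boom'),
    describeThrown({ message: 'from a plain object' }),
    describeThrown(42),
];
```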
