BaseRequestData
Data needed to create a new crawling request.
Hierarchy
- BaseRequestData
Methods
from_url
Create a new BaseRequestData instance from a URL. See Request.from_url for more details.
Parameters
- url: str (optional, keyword-only)
- method: HttpMethod = 'GET' (optional, keyword-only)
- headers: (HttpHeaders | dict[str, str]) | None = None (optional, keyword-only)
- payload: (HttpPayload | str) | None = None (optional, keyword-only)
- label: str | None = None (optional, keyword-only)
- unique_key: str | None = None (optional, keyword-only)
- keep_url_fragment: bool = False (optional, keyword-only)
- use_extended_unique_key: bool = False (optional, keyword-only)
- kwargs: Any (optional, keyword-only)
Returns Self
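Below is a minimal, hedged sketch of creating request data this way. The import path (crawlee.models), the example URL, and the label value are assumptions that may need adjusting for your crawlee version:

```python
from crawlee.models import BaseRequestData  # import path may differ across crawlee versions

# Build request data from a URL; the remaining arguments are keyword-only.
request_data = BaseRequestData.from_url(
    url='https://example.com/search?q=crawlee#results',
    method='GET',
    label='SEARCH',            # routing label, e.g. for a request handler
    keep_url_fragment=False,   # '#results' is ignored when computing the unique key
)
print(request_data.unique_key)
```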
get_query_param_from_url
Get the value of a specific query parameter from the URL.
Parameters
- param: str (optional, keyword-only)
- default: str | None = None (optional, keyword-only)
Returns str | None
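For illustration, a hedged sketch of reading query parameters from a request's URL; the URL and parameter names are made-up examples:

```python
req = BaseRequestData.from_url(url='https://example.com/search?q=crawlee&page=2')

print(req.get_query_param_from_url(param='q'))                           # 'crawlee'
print(req.get_query_param_from_url(param='sort', default='relevance'))   # missing parameter, falls back to 'relevance'
```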
Properties
handled_at
Timestamp when the request was handled.
headers
HTTP request headers.
loaded_url
URL of the web page that was loaded. This can differ from the original URL in case of redirects.
method
HTTP request method.
model_config
no_retry
If set to True, the request will not be retried in case of failure.
payload
HTTP request payload.
TODO: Re-check the need for Validator and Serializer once the issue is resolved: https://github.com/apify/crawlee-python/issues/94
retry_count
Number of times the request has been retried.
unique_key
A unique key identifying the request. Two requests with the same unique_key are considered as pointing to the same URL.
If unique_key is not provided, then it is automatically generated by normalizing the URL. For example, the URL of HTTP://www.EXAMPLE.com/something/ will produce the unique_key of http://www.example.com/something.
Pass an arbitrary non-empty text value to the unique_key property to override the default behavior and specify which URLs shall be considered equal.
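A short sketch of the normalization and override behavior described above, assuming the default unique-key generation and using hypothetical example URLs:

```python
# Per the normalization example above, both spellings yield the same unique_key.
a = BaseRequestData.from_url(url='HTTP://www.EXAMPLE.com/something/')
b = BaseRequestData.from_url(url='http://www.example.com/something')
assert a.unique_key == b.unique_key

# An explicit unique_key overrides normalization, so this request is treated as distinct.
c = BaseRequestData.from_url(url='http://www.example.com/something', unique_key='something-variant-2')
print(c.unique_key)
```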
url
The URL of the web page to crawl. Must be a valid HTTP or HTTPS URL, and may include query parameters and fragments.
user_data
Custom user data assigned to the request. Use this to save any request-related data within the request's scope, keeping it accessible on retries, failures, etc.
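A hedged sketch of attaching custom data to a request, assuming user_data exposes a dict-like interface as the description suggests; the keys and values are arbitrary examples:

```python
req = BaseRequestData.from_url(url='https://example.com/products/123')

# user_data behaves like a mutable mapping scoped to this request.
req.user_data['label_hint'] = 'PRODUCT'
req.user_data['attempted_variants'] = 0

# The data stays attached to the request, e.g. across retries.
print(req.user_data)
```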