RequestWithLock
Hierarchy
- Request
- RequestWithLock
Index
Methods
from_url
Create a new Request instance from a URL.
This is the recommended constructor for creating new Request instances. It generates a Request object from a given URL with additional options to customize the HTTP method, payload, unique key, and other request properties. If no unique_key or id is provided, they are computed automatically based on the URL, method and payload. The computation also depends on the keep_url_fragment and use_extended_unique_key flags.
Parameters
url: str
The URL of the request.
optional keyword-only method: HttpMethod = 'GET'
The HTTP method of the request.
optional keyword-only headers: HttpHeaders | dict[str, str] | None = None
The HTTP headers of the request.
optional keyword-only payload: HttpPayload | str | None = None
The data to be sent as the request body. Typically used with 'POST' or 'PUT' requests.
optional keyword-only label: str | None = None
A custom label to differentiate between request types. This is stored in user_data, and it is used for request routing (different requests go to different handlers).
optional keyword-only session_id: str | None = None
ID of a specific Session to which the request will be strictly bound. If the session becomes unavailable when the request is processed, a RequestCollisionError will be raised.
optional keyword-only unique_key: str | None = None
A unique key identifying the request. If not provided, it is automatically computed based on the URL and other parameters. Requests with the same unique_key are treated as identical.
optional keyword-only keep_url_fragment: bool = False
Determines whether the URL fragment (e.g., `#section`) should be included in the unique_key computation. This is only relevant when unique_key is not provided.
optional keyword-only use_extended_unique_key: bool = False
Determines whether to include the HTTP method, session ID, and payload in the unique_key computation. This is only relevant when unique_key is not provided.
optional keyword-only always_enqueue: bool = False
If set to True, the request will be enqueued even if it is already present in the queue. Using this is not allowed when a custom unique_key is also provided and will result in a ValueError.
kwargs: Any
Returns Self
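The interaction between unique_key, keep_url_fragment, use_extended_unique_key and always_enqueue can be sketched with the standard library alone. This is an illustrative stand-in, not Crawlee's implementation: the function name sketch_unique_key and the `|`-separated extended key format are assumptions made for this example, URL normalization is left out, and the way always_enqueue bypasses deduplication is not modeled.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def sketch_unique_key(
    url: str,
    *,
    method: str = "GET",
    payload: Optional[str] = None,
    session_id: Optional[str] = None,
    unique_key: Optional[str] = None,
    keep_url_fragment: bool = False,
    use_extended_unique_key: bool = False,
    always_enqueue: bool = False,
) -> str:
    """Hypothetical helper mirroring the documented rules for key derivation."""
    # Documented rule: a custom unique_key cannot be combined with always_enqueue.
    if unique_key is not None and always_enqueue:
        raise ValueError("always_enqueue cannot be used with a custom unique_key")

    # An explicit key wins; the two flags below only matter when none is given.
    if unique_key is not None:
        return unique_key

    # Drop the fragment unless the caller asked to keep it.
    parts = urlsplit(url)
    fragment = parts.fragment if keep_url_fragment else ""
    key = urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, fragment))

    if use_extended_unique_key:
        # Assumed format for illustration only: method, session ID, payload, URL.
        key = "|".join([method.upper(), session_id or "", payload or "", key])
    return key
```

With this sketch, `https://example.com/page#section` and `https://example.com/page` map to the same key by default, and to different keys when keep_url_fragment=True.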
get_query_param_from_url
Get the value of a specific query parameter from the URL.
Parameters
param: str
optional keyword-only default: str | None = None
Returns str | None
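The lookup this method performs can be approximated with urllib.parse. The sketch below is a standalone function, not the method itself: the name query_param_from_url is hypothetical, and returning the first value when a parameter repeats is an assumption.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def query_param_from_url(
    url: str, param: str, *, default: Optional[str] = None
) -> Optional[str]:
    """Return the first value of `param` in the URL's query string, or `default`."""
    # parse_qs maps each parameter name to a list of its values.
    values = parse_qs(urlsplit(url).query).get(param)
    return values[0] if values else default
```

For example, calling it on `https://example.com/search?q=crawlee&page=2` with param `page` yields the string `2`, while a missing parameter falls back to `default`.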
Properties
crawl_depth
The depth of the request in the crawl tree.
crawlee_data
Crawlee-specific configuration stored in the user_data.
enqueue_strategy
The strategy that was used for enqueuing the request.
forefront
Indicate whether the request should be enqueued at the front of the queue.
handled_at
Timestamp when the request was handled.
label
A string used to differentiate between arbitrary request types.
last_proxy_tier
The last proxy tier used to process the request.
loaded_url
URL of the web page that was loaded. This can differ from the original URL in case of redirects.
lock_expires_at
The timestamp when the lock expires.
max_retries
Crawlee-specific limit on the number of retries of the request.
method
HTTP request method.
model_config
no_retry
If set to True, the request will not be retried in case of failure.
payload
HTTP request payload.
retry_count
Number of times the request has been retried.
session_id
The ID of the bound session, if there is any.
session_rotation_count
Crawlee-specific number of finished session rotations for the request.
state
Crawlee-specific request handling state.
unique_key
A unique key identifying the request. Two requests with the same unique_key are considered as pointing
to the same URL.
If unique_key is not provided, then it is automatically generated by normalizing the URL.
For example, the URL of HTTP://www.EXAMPLE.com/something/ will produce the unique_key
of http://www.example.com/something.
Pass an arbitrary non-empty text value to the unique_key property to override the default behavior
and specify which URLs shall be considered equal.
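The normalization in the example above can be reproduced roughly as follows. This is a minimal sketch that only covers the visible effects (lowercased scheme and host, no trailing slash, no fragment); the function name normalize_url is hypothetical, and the real normalization may apply further rules.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Approximate the documented example; not Crawlee's actual routine."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    # Lowercases the whole network location, including any userinfo part.
    netloc = parts.netloc.lower()
    # Remove trailing slashes so /something/ and /something compare equal.
    path = parts.path.rstrip("/")
    # The fragment is dropped; the query string is kept unchanged.
    return urlunsplit((scheme, netloc, path, parts.query, ""))
```

Applied to `HTTP://www.EXAMPLE.com/something/`, this returns `http://www.example.com/something`, matching the example above.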
url
The URL of the web page to crawl. Must be a valid HTTP or HTTPS URL, and may include query parameters and fragments.
was_already_handled
Indicates whether the request was handled.
A crawling request with information about locks.