RequestLoader
Hierarchy
- RequestLoader
Index
Methods
fetch_next_request
Return the next request to be processed, or None if there are no more pending requests.
Returns Request | None
get_handled_count
Return the number of handled requests.
Returns int
get_total_count
Return an offline approximation of the total number of requests in the source (i.e. pending + handled).
Returns int
is_empty
Return True if there are no more requests in the source (there might still be unfinished requests).
Returns bool
is_finished
Return True if all requests have been handled.
Returns bool
mark_request_as_handled
Mark a request as handled after successful processing (or after giving up retrying).
Parameters
request: Request (optional, keyword-only)
Returns ProcessedRequest | None
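The methods above can be pictured as a fetch-and-mark loop. The following is a minimal sketch only: `InMemoryLoader`, `drain`, and `demo` are hypothetical names that are not part of the library, plain URL strings stand in for `Request` objects, and the methods are assumed to be coroutines. It illustrates the difference between `is_empty` (nothing left to fetch) and `is_finished` (everything fetched has also been marked as handled).

```python
from __future__ import annotations

import asyncio
from collections import deque


class InMemoryLoader:
    """Hypothetical stand-in that mirrors the RequestLoader method names."""

    def __init__(self, urls: list[str]) -> None:
        self._pending = deque(urls)           # not fetched yet
        self._in_progress: set[str] = set()   # fetched, not handled yet
        self._handled = 0

    async def fetch_next_request(self) -> str | None:
        if not self._pending:
            return None
        url = self._pending.popleft()
        self._in_progress.add(url)
        return url

    async def mark_request_as_handled(self, *, request: str) -> None:
        self._in_progress.discard(request)
        self._handled += 1

    async def get_handled_count(self) -> int:
        return self._handled

    async def get_total_count(self) -> int:
        # pending + in progress + handled
        return len(self._pending) + len(self._in_progress) + self._handled

    async def is_empty(self) -> bool:
        return not self._pending

    async def is_finished(self) -> bool:
        return not self._pending and not self._in_progress


async def drain(loader: InMemoryLoader) -> list[str]:
    """Fetch requests until None is returned, marking each one as handled."""
    processed = []
    while (request := await loader.fetch_next_request()) is not None:
        processed.append(request)  # real processing would happen here
        await loader.mark_request_as_handled(request=request)
    return processed


async def demo() -> dict:
    loader = InMemoryLoader(["https://example.com/a", "https://example.com/b"])

    # Fetch one request without marking it: the source still has one pending.
    first = await loader.fetch_next_request()
    # Fetch the second one: nothing is pending, but two are unfinished.
    second = await loader.fetch_next_request()
    state = {
        "empty_mid": await loader.is_empty(),
        "finished_mid": await loader.is_finished(),
    }

    for request in (first, second):
        await loader.mark_request_as_handled(request=request)

    state["finished_end"] = await loader.is_finished()
    state["handled"] = await loader.get_handled_count()
    state["total"] = await loader.get_total_count()
    return state


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In typical use only the `drain` pattern is needed; `demo` separates fetching from marking purely to show the intermediate state where the source is empty but not yet finished.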
to_tandem
Combine the loader with a request manager to support adding and reclaiming requests.
Parameters
request_manager: RequestManager | None = None (optional, keyword-only)
Request manager to combine the loader with. If None is given, the default request queue is used.
Returns RequestManagerTandem
Abstract base class defining the interface for classes that provide access to a read-only stream of requests.
Request loaders are used to manage and provide access to a storage of crawling requests.
Key responsibilities: