RequestLoader
Methods
fetch_next_request
Returns the next request to be processed, or None if there are no more pending requests.
Returns Request | None
get_handled_count
Returns the number of handled requests.
Returns int
get_total_count
Returns an offline approximation of the total number of requests in the source (i.e. pending + handled).
Returns int
is_empty
Returns True if there are no more requests in the source (there might still be unfinished requests).
Returns bool
is_finished
Returns True if all requests have been handled.
Returns bool
mark_request_as_handled
Marks a request as handled after it has been processed successfully (or after giving up on retrying it).
Parameters
optional keyword-only request: Request
Returns ProcessedRequest | None
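The methods above can be illustrated with a minimal in-memory sketch. It is not the library's implementation: InMemoryLoader and the simplified Request dataclass are hypothetical stand-ins, the methods are assumed to be coroutines, and mark_request_as_handled returns None here instead of a ProcessedRequest. The point is the difference between is_empty (nothing left to fetch) and is_finished (nothing left to fetch and nothing still in progress).

```python
from __future__ import annotations

import asyncio
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical, simplified stand-in for the library's Request type."""

    url: str


class InMemoryLoader:
    """Illustrative loader using the method names documented above."""

    def __init__(self, urls: list[str]) -> None:
        self._pending = deque(Request(url) for url in urls)
        self._in_progress: set[str] = set()
        self._handled = 0

    async def fetch_next_request(self) -> Request | None:
        if not self._pending:
            return None
        request = self._pending.popleft()
        self._in_progress.add(request.url)
        return request

    async def mark_request_as_handled(self, request: Request) -> None:
        # Assumption: the sketch returns None rather than a ProcessedRequest.
        self._in_progress.discard(request.url)
        self._handled += 1

    async def get_handled_count(self) -> int:
        return self._handled

    async def get_total_count(self) -> int:
        return len(self._pending) + len(self._in_progress) + self._handled

    async def is_empty(self) -> bool:
        # True once nothing is left to fetch, even if work is in progress.
        return not self._pending

    async def is_finished(self) -> bool:
        # True only when every fetched request has also been marked handled.
        return not self._pending and not self._in_progress


async def main() -> dict[str, object]:
    loader = InMemoryLoader(["https://example.com/a", "https://example.com/b"])
    fetched = [await loader.fetch_next_request() for _ in range(2)]

    # Both requests are fetched but not yet handled.
    before = (await loader.is_empty(), await loader.is_finished())

    for request in fetched:
        if request is not None:
            await loader.mark_request_as_handled(request)

    return {
        "before": before,
        "finished": await loader.is_finished(),
        "handled": await loader.get_handled_count(),
        "total": await loader.get_total_count(),
        "next": await loader.fetch_next_request(),
    }


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In this sketch, is_empty turns True as soon as the last request has been fetched, while is_finished stays False until each fetched request has been passed to mark_request_as_handled.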
to_tandem
Combine the loader with a request manager to support adding and reclaiming requests.
Parameters
optional keyword-only request_manager: RequestManager | None = None
Request manager to combine the loader with. If None is given, the default request queue is used.
Returns RequestManagerTandem
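The idea behind combining a read-only loader with a read-write manager can be sketched as follows. This is only one plausible arrangement and makes no claim about how RequestManagerTandem actually works internally: ListLoader, QueueManager, and Tandem are hypothetical classes, requests are plain URL strings, and the policy of draining the loader before the manager is an assumption made for illustration.

```python
from __future__ import annotations

import asyncio
from collections import deque


class ListLoader:
    """Hypothetical read-only source: requests can only be fetched."""

    def __init__(self, urls: list[str]) -> None:
        self._urls = deque(urls)

    async def fetch_next_request(self) -> str | None:
        return self._urls.popleft() if self._urls else None


class QueueManager:
    """Hypothetical read-write store: supports adding and reclaiming."""

    def __init__(self) -> None:
        self._queue: deque[str] = deque()

    async def add_request(self, url: str) -> None:
        self._queue.append(url)

    async def reclaim_request(self, url: str) -> None:
        # Put a failed request back so that it is retried later.
        self._queue.append(url)

    async def fetch_next_request(self) -> str | None:
        return self._queue.popleft() if self._queue else None


class Tandem:
    """Reads from the loader first, then from the manager (assumed policy)."""

    def __init__(self, loader: ListLoader, manager: QueueManager) -> None:
        self._loader = loader
        self._manager = manager

    async def add_request(self, url: str) -> None:
        await self._manager.add_request(url)

    async def reclaim_request(self, url: str) -> None:
        await self._manager.reclaim_request(url)

    async def fetch_next_request(self) -> str | None:
        request = await self._loader.fetch_next_request()
        if request is not None:
            return request
        return await self._manager.fetch_next_request()


async def demo() -> list[str]:
    tandem = Tandem(
        ListLoader(["https://example.com/a", "https://example.com/b"]),
        QueueManager(),
    )
    order: list[str] = []

    first = await tandem.fetch_next_request()
    assert first is not None
    order.append(first)

    # Pretend the first request failed, and that a new link was discovered.
    await tandem.reclaim_request(first)
    await tandem.add_request("https://example.com/c")

    while (request := await tandem.fetch_next_request()) is not None:
        order.append(request)
    return order


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The loader alone cannot accept new or retried requests, so the sketch routes every write to the manager and uses the loader purely as an initial source.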
Abstract base class defining the interface for classes that provide access to a read-only stream of requests.
Request loaders are used to manage and provide access to a storage of crawling requests.
Key responsibilities: