
RequestProvider

crawlee.storages.request_provider.RequestProvider

Provides access to a queue of crawling requests.

Index

Methods

add_requests_batched

  • async add_requests_batched(requests, *, batch_size, wait_time_between_batches, wait_for_all_requests_to_be_added, wait_for_all_requests_to_be_added_timeout): None
  • Add requests to the underlying resource client in batches.


    Parameters

    • requests: Sequence[str | BaseRequestData | Request]
    • batch_size: int = 1000 (keyword-only)
    • wait_time_between_batches: timedelta = timedelta(seconds=1) (keyword-only)
    • wait_for_all_requests_to_be_added: bool = False (keyword-only)
    • wait_for_all_requests_to_be_added_timeout: timedelta | None = None (keyword-only)

    Returns None
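The batching behaviour described above can be illustrated with a minimal sketch. `InMemoryProvider` is a hypothetical stand-in written for this example, not Crawlee's implementation; it copies only the `requests`, `batch_size`, and `wait_time_between_batches` parameters from the documented signature, accepts plain URL strings, and omits the two `wait_for_all_requests_to_be_added*` options.

```python
import asyncio
from datetime import timedelta
from typing import List, Sequence


class InMemoryProvider:
    """Hypothetical stand-in mimicking the documented signature (not Crawlee code)."""

    def __init__(self) -> None:
        self.stored: List[str] = []
        self.batches_sent = 0

    async def add_requests_batched(
        self,
        requests: Sequence[str],
        *,
        batch_size: int = 1000,
        wait_time_between_batches: timedelta = timedelta(seconds=1),
    ) -> None:
        # Slice the input into chunks of at most `batch_size` items.
        for start in range(0, len(requests), batch_size):
            batch = requests[start : start + batch_size]
            self.stored.extend(batch)
            self.batches_sent += 1
            # Pause between batches, but not after the final one.
            if start + batch_size < len(requests):
                await asyncio.sleep(wait_time_between_batches.total_seconds())


async def main() -> InMemoryProvider:
    provider = InMemoryProvider()
    urls = [f"https://example.com/page/{i}" for i in range(25)]
    await provider.add_requests_batched(
        urls,
        batch_size=10,
        wait_time_between_batches=timedelta(milliseconds=10),
    )
    return provider


provider = asyncio.run(main())
print(provider.batches_sent, len(provider.stored))
```

With 25 URLs and a batch size of 10, the stand-in sends three batches (10, 10, and 5 items). A real provider would forward each batch to its underlying resource client instead of appending to a list.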

drop

  • async drop(): None
  • Removes the queue from either Apify Cloud storage or the local database.


    Returns None

fetch_next_request

  • async fetch_next_request(): Request | None
  • Returns Request | None

get_handled_count

  • async get_handled_count(): int
  • Returns int

get_total_count

  • async get_total_count(): int
  • Returns int

is_empty

  • async is_empty(): bool
  • Returns bool

is_finished

  • async is_finished(): bool
  • Returns bool

mark_request_as_handled

  • async mark_request_as_handled(request): ProcessedRequest | None
  • Marks a request as handled after it has been processed successfully (or after retries have been exhausted).


    Parameters

    • request: Request

    Returns ProcessedRequest | None

reclaim_request

  • async reclaim_request(request, *, forefront): ProcessedRequest | None
  • Reclaims a failed request back to the queue so that it can be fetched for processing again later.

    To modify the request data, pass an updated request as the parameter.


    Parameters

    • request: Request
    • forefront: bool = False (keyword-only)

    Returns ProcessedRequest | None
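The way `fetch_next_request`, `mark_request_as_handled`, and `reclaim_request` fit together can be sketched as a simple consumer loop. Everything below is an assumption made for illustration: `InMemoryQueue`, `crawl`, `flaky_handler`, and `max_retries` are hypothetical names, requests are plain URL strings rather than `Request` objects, the stand-in methods return `None` instead of `ProcessedRequest`, and `forefront=True` is assumed to mean "put the request at the front of the queue".

```python
import asyncio
from collections import deque
from typing import Awaitable, Callable, Dict, Iterable, Optional


class InMemoryQueue:
    """Hypothetical stand-in for a RequestProvider (not Crawlee code)."""

    def __init__(self, urls: Iterable[str]) -> None:
        self._pending = deque(urls)
        self._handled = set()

    async def fetch_next_request(self) -> Optional[str]:
        # Return the next pending request, or None when nothing is left.
        return self._pending.popleft() if self._pending else None

    async def mark_request_as_handled(self, request: str) -> None:
        self._handled.add(request)

    async def reclaim_request(self, request: str, *, forefront: bool = False) -> None:
        # Assumption: forefront=True re-queues at the front, otherwise at the back.
        if forefront:
            self._pending.appendleft(request)
        else:
            self._pending.append(request)

    async def get_handled_count(self) -> int:
        return len(self._handled)


async def crawl(
    queue: InMemoryQueue,
    handler: Callable[[str], Awaitable[None]],
    max_retries: int = 2,
) -> Dict[str, int]:
    attempts: Dict[str, int] = {}
    while (request := await queue.fetch_next_request()) is not None:
        attempts[request] = attempts.get(request, 0) + 1
        try:
            await handler(request)
        except Exception:
            if attempts[request] <= max_retries:
                # Failed, but retries remain: return the request to the queue.
                await queue.reclaim_request(request, forefront=False)
                continue
        # Reached on success, or on failure once retries are exhausted.
        await queue.mark_request_as_handled(request)
    return attempts


failed_once = set()


async def flaky_handler(url: str) -> None:
    # Simulate a transient error on the first attempt for one URL.
    if url.endswith("/b") and url not in failed_once:
        failed_once.add(url)
        raise RuntimeError("transient failure")


async def main():
    queue = InMemoryQueue(["https://example.com/a", "https://example.com/b"])
    attempts = await crawl(queue, flaky_handler)
    return attempts, await queue.get_handled_count()


attempts, handled_count = asyncio.run(main())
print(attempts, handled_count)
```

In this sketch the second URL fails once, is reclaimed to the back of the queue, and succeeds on its second attempt, so both requests end up handled. A loop over a real provider would likely also consult `is_finished` before stopping, since a `None` result from `fetch_next_request` does not necessarily mean that every request has been handled.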

Properties

name

name: str | None

ID or name of the request queue.