
RequestManagerTandem

Implements tandem behaviour for a pair consisting of a RequestLoader and a RequestManager.

In this scenario, the contents of the "loader" are transferred into the "manager", which allows processing requests from both sources and also enqueueing new requests (not possible with a plain RequestLoader).
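For example, a minimal sketch of constructing a tandem (assuming Crawlee for Python's `crawlee.request_loaders` and `crawlee.storages` module layout; the URLs are placeholders):

```python
import asyncio

from crawlee.request_loaders import RequestList, RequestManagerTandem
from crawlee.storages import RequestQueue


async def main() -> None:
    # A static, read-only loader with the initial URLs.
    request_list = RequestList(['https://crawlee.dev', 'https://apify.com'])

    # A full-featured manager backed by the default request queue.
    request_queue = await RequestQueue.open()

    # The tandem drains the loader into the manager, so requests from both
    # sources get processed and new requests can still be enqueued.
    tandem = RequestManagerTandem(
        request_loader=request_list,
        request_manager=request_queue,
    )

    # Enqueueing works, unlike with a plain RequestLoader.
    await tandem.add_request(request='https://example.com')


asyncio.run(main())
```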

Methods

__init__

  • __init__(*, request_loader, request_manager): None

add_request

  • async add_request(*, request, forefront): ProcessedRequest
  • Add a single request to the manager and store it in the underlying resource client.


    Parameters

    • request: str | Request (optional, keyword-only)

      The request object (or its string representation) to be added to the manager.

    • forefront: bool = False (optional, keyword-only)

      Determines whether the request should be added to the beginning (if True) or the end (if False) of the manager.

    Returns ProcessedRequest
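A usage sketch (assumes `tandem` is a `RequestManagerTandem` instance as constructed above):

```python
# forefront=True places the request at the head of the manager,
# so it is fetched before requests added to the end.
processed = await tandem.add_request(
    request='https://example.com/priority',
    forefront=True,
)
# ProcessedRequest reports, among other things, whether the request
# was already present in the manager (assumed attribute name).
print(processed.was_already_present)
```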

add_requests_batched

  • async add_requests_batched(*, requests, batch_size, wait_time_between_batches, wait_for_all_requests_to_be_added, wait_for_all_requests_to_be_added_timeout): None
  • Add requests to the manager in batches.


    Parameters

    • requests: Sequence[str | Request] (optional, keyword-only)

      Requests to enqueue.

    • batch_size: int = 1000 (optional, keyword-only)

      The number of requests to add in one batch.

    • wait_time_between_batches: timedelta = timedelta(seconds=1) (optional, keyword-only)

      Time to wait between adding batches.

    • wait_for_all_requests_to_be_added: bool = False (optional, keyword-only)

      If True, wait for all requests to be added before returning.

    • wait_for_all_requests_to_be_added_timeout: timedelta | None = None (optional, keyword-only)

      Timeout for waiting for all requests to be added.

    Returns None
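A sketch of batched enqueueing using the parameters above (`tandem` as before; the URLs are placeholders):

```python
from datetime import timedelta

urls = [f'https://example.com/page/{i}' for i in range(5000)]

# Add 5000 requests in batches of 1000, pausing between batches, and
# block until everything is added or the timeout elapses.
await tandem.add_requests_batched(
    requests=urls,
    batch_size=1000,
    wait_time_between_batches=timedelta(seconds=1),
    wait_for_all_requests_to_be_added=True,
    wait_for_all_requests_to_be_added_timeout=timedelta(minutes=5),
)
```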

drop

  • async drop(): None
  • Removes persistent state either from the Apify Cloud storage or from the local database.


    Returns None

fetch_next_request

  • async fetch_next_request(): Request | None
  • Returns the next request to be processed, or None if there are no more pending requests.


    Returns Request | None

get_handled_count

  • async get_handled_count(): int
  • Returns the number of handled requests.


    Returns int

get_total_count

  • async get_total_count(): int
  • Returns an offline approximation of the total number of requests in the source (i.e. pending + handled).


    Returns int
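The two counters together give a rough progress indicator, e.g.:

```python
handled = await tandem.get_handled_count()
total = await tandem.get_total_count()  # offline approximation
print(f'Handled {handled} of ~{total} requests.')
```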

is_empty

  • async is_empty(): bool
  • Returns True if there are no more requests in the source (there might still be unfinished requests).


    Returns bool

is_finished

  • async is_finished(): bool
  • Returns True if all requests have been handled.


    Returns bool

mark_request_as_handled

  • async mark_request_as_handled(*, request): ProcessedRequest | None
  • Marks a request as handled after successful processing (or after giving up on retries).


    Parameters

    • request: Request (optional, keyword-only)

    Returns ProcessedRequest | None

reclaim_request

  • async reclaim_request(*, request, forefront): ProcessedRequest | None
  • Reclaims a failed request back to the source, so that it can be returned for processing again later.

    It is possible to modify the request data by supplying an updated request as a parameter.


    Parameters

    • request: Request (optional, keyword-only)
    • forefront: bool = False (optional, keyword-only)

    Returns ProcessedRequest | None
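Together, fetch_next_request, mark_request_as_handled and reclaim_request support a manual consumption loop. A minimal sketch (`process` is a hypothetical user-supplied coroutine; `tandem` as before):

```python
while not await tandem.is_finished():
    request = await tandem.fetch_next_request()
    if request is None:
        # No pending requests right now, but some may still be in flight.
        continue

    try:
        await process(request)  # hypothetical processing coroutine
    except Exception:
        # Return the failed request to the source for a later retry;
        # forefront=True would schedule the retry sooner.
        await tandem.reclaim_request(request=request)
    else:
        await tandem.mark_request_as_handled(request=request)
```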

to_tandem

  • async to_tandem(*, request_manager): RequestManagerTandem
  • Combine the loader with a request manager to support adding and reclaiming requests.


    Parameters

    • request_manager: RequestManager | None = None (optional, keyword-only)

      Request manager to combine the loader with. If None is given, the default request queue is used.

    Returns RequestManagerTandem
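A sketch of this shorthand (assumes a `RequestList` loader; with no argument, the default request queue is used as the manager):

```python
from crawlee.request_loaders import RequestList

request_list = RequestList(['https://crawlee.dev'])

# Equivalent to constructing RequestManagerTandem manually with the
# default request queue as the manager.
tandem = await request_list.to_tandem()
```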