MemoryRequestQueueClient

In-memory implementation of the request queue client.

No data is persisted between process runs, which means all requests are lost when the program terminates. This implementation is primarily useful for testing, development, and short-lived crawler runs where persistence is not required.

This client provides fast access to request data but is limited by available memory and does not support data sharing across different processes.

Hierarchy

RequestQueueClient
- MemoryRequestQueueClient

Methods

__init__

__init__(*, metadata): None

Initialize a new instance.

Preferably use the MemoryRequestQueueClient.open class method to create a new instance.
Parameters
- keyword-only metadata: RequestQueueMetadata
Returns None

add_batch_of_requests

async add_batch_of_requests(requests, *, forefront): AddRequestsResponse

Overrides RequestQueueClient.add_batch_of_requests
Add a batch of requests to the queue.

Each request is checked for uniqueness, which is determined by its unique_key. Duplicates are identified but not re-added to the queue.
Parameters
- requests: Sequence[Request]
  The collection of requests to add to the queue.
- optional keyword-only forefront: bool = False
  Whether to put the added requests at the beginning (True) or the end (False) of the queue. When True, the requests will be processed sooner than previously added requests.
Returns AddRequestsResponse
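
The deduplication and forefront behavior described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not Crawlee's implementation: ToyQueue and add_batch are hypothetical names, requests are reduced to their unique_key strings, and the ordering of several requests inside one forefront batch is a choice made by this sketch.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyQueue:
    """Hypothetical stand-in for an in-memory queue; not Crawlee's code."""

    pending: deque = field(default_factory=deque)
    seen: set = field(default_factory=set)

    def add_batch(self, unique_keys, *, forefront=False):
        """Add keys, skipping duplicates; return (added, skipped) lists."""
        added, skipped = [], []
        for key in unique_keys:
            if key in self.seen:
                # Duplicate: identified, but not re-added.
                skipped.append(key)
                continue
            self.seen.add(key)
            added.append(key)
        if forefront:
            # Sketch's choice: reverse so the batch keeps its own order at the head.
            self.pending.extendleft(reversed(added))
        else:
            self.pending.extend(added)
        return added, skipped


queue = ToyQueue()
first = queue.add_batch(["a", "b"])
second = queue.add_batch(["b", "c"], forefront=True)  # "b" is a duplicate
order = list(queue.pending)  # "c" now sits ahead of "a" and "b"
```

In this model the second call reports "b" as skipped and places "c" at the head, so it would be fetched before the previously added requests.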

drop

async drop(): None

Overrides RequestQueueClient.drop
Drop the whole request queue and remove all its values.

The backend method for the RequestQueue.drop call.
Returns None

fetch_next_request

async fetch_next_request(): Request | None

Overrides RequestQueueClient.fetch_next_request
Return the next request in the queue to be processed.

Once you have successfully processed the request, call RequestQueue.mark_request_as_handled to mark it as handled in the queue. If processing failed, call RequestQueue.reclaim_request instead, so that the queue gives the request to another consumer on a later call to the fetch_next_request method.

Note that a None return value does not mean that queue processing has finished; it only means that there are currently no pending requests. To check whether all requests in the queue have been handled, use RequestQueue.is_finished instead.
Returns Request | None
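
The fetch, handle, and reclaim cycle, and the difference between "nothing pending" and "finished", can be sketched with a standalone model. The ToyQueue class and its method names below are hypothetical and synchronous for brevity; they mirror the semantics described above rather than Crawlee's actual code.

```python
from collections import deque


class ToyQueue:
    """Hypothetical model of the fetch/handle/reclaim cycle; not Crawlee's code."""

    def __init__(self, keys):
        self.pending = deque(keys)
        self.in_progress = set()
        self.handled = set()

    def fetch_next(self):
        # None only means that nothing is pending right now.
        if not self.pending:
            return None
        key = self.pending.popleft()
        self.in_progress.add(key)
        return key

    def mark_handled(self, key):
        # A handled request is never returned by fetch_next again.
        self.in_progress.discard(key)
        self.handled.add(key)

    def reclaim(self, key, *, forefront=False):
        # Put a failed request back so that a later fetch returns it.
        self.in_progress.discard(key)
        if forefront:
            self.pending.appendleft(key)
        else:
            self.pending.append(key)

    def is_finished(self):
        # Finished: nothing pending and nothing still being processed.
        return not self.pending and not self.in_progress


queue = ToyQueue(["a", "b"])
a = queue.fetch_next()
b = queue.fetch_next()
nothing_pending = queue.fetch_next()  # None, yet "a" and "b" are in progress
finished_early = queue.is_finished()

queue.mark_handled(a)
queue.reclaim(b)  # processing "b" failed; retry it later
retry = queue.fetch_next()
queue.mark_handled(retry)
finished = queue.is_finished()
```

Here the third fetch returns None while two requests are still in progress, which is why a consumer loop should check a finished condition rather than stop on the first None.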

get_metadata

async get_metadata(): RequestQueueMetadata

Overrides RequestQueueClient.get_metadata
Get the metadata of the request queue.
Returns RequestQueueMetadata

get_request

async get_request(unique_key): Request | None

Overrides RequestQueueClient.get_request
Retrieve a request from the queue.
Parameters
- unique_key: str
  Unique key of the request to retrieve.
Returns Request | None

is_empty

async is_empty(): bool

Overrides RequestQueueClient.is_empty
Check if the queue is empty.
Returns bool

mark_request_as_handled

async mark_request_as_handled(request): ProcessedRequest | None

Overrides RequestQueueClient.mark_request_as_handled
Mark a request as handled after successful processing.

Handled requests will never again be returned by the RequestQueue.fetch_next_request method.
Parameters
- request: Request
  The request to mark as handled.
Returns ProcessedRequest | None

open

async open(*, id, name, alias): Self

Open or create a new memory request queue client.

This method creates a new in-memory request queue instance. Unlike persistent storage implementations, memory queues don't check for existing queues with the same name or ID since all data exists only in memory and is lost when the process terminates.

The alias has no effect in the memory storage client implementation: because no data is persisted, unnamed storages are supported by default.
Parameters
- keyword-only id: str | None
  The ID of the request queue. If not provided, a random ID will be generated.
- keyword-only name: str | None
  The name of the request queue for named (global scope) storages.
- keyword-only alias: str | None
  The alias of the request queue for unnamed (run scope) storages.
Returns Self
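
The consequence of not checking for existing queues can be shown with a toy model. ToyMemoryQueue below is a hypothetical class, not the real client; the random ID format (a UUID hex string) is an assumption, and the alias argument is accepted and ignored to match the description above.

```python
import asyncio
import uuid


class ToyMemoryQueue:
    """Hypothetical in-memory queue; every open() builds a fresh instance."""

    def __init__(self, *, id, name):
        self.id = id
        self.name = name
        self.items = []

    @classmethod
    async def open(cls, *, id=None, name=None, alias=None):
        # No lookup of existing queues by id or name; alias is ignored.
        return cls(id=id or uuid.uuid4().hex, name=name)


async def main():
    first = await ToyMemoryQueue.open(name="my-queue")
    first.items.append("a")
    # Same name, but a new and empty instance with its own random ID.
    second = await ToyMemoryQueue.open(name="my-queue")
    return first, second


first, second = asyncio.run(main())
```

In this sketch the two instances share a name but no data, so anything that must see the same queue needs to hold on to the same instance.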

purge

async purge(): None

Overrides RequestQueueClient.purge
Purge all items from the request queue.

The backend method for the RequestQueue.purge call.
Returns None

reclaim_request

async reclaim_request(request, *, forefront): ProcessedRequest | None

Overrides RequestQueueClient.reclaim_request
Reclaim a failed request back to the queue.

The request will be returned for processing again by a later call to RequestQueue.fetch_next_request.
Parameters
- request: Request
  The request to return to the queue.
- optional keyword-only forefront: bool = False
  Whether to add the request to the head (True) or the end (False) of the queue.
Returns ProcessedRequest | None
