
EnqueueLinksFunction

A function type for enqueueing new URLs to crawl, based on elements selected by a CSS selector.

Calling it extracts URLs from the matching elements on the current page and enqueues them for further crawling.

Methods

__call__

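  • __call__(*, selector, label, user_data, **kwargs): Coroutine[None, None, None]
  • Extracts links from the elements matched by selector and enqueues them as new requests. All arguments are keyword-only; the parameters listed below that do not appear by name in the signature (limit, base_url, strategy, include, exclude) are accepted as additional keyword arguments.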


    Parameters

    • selector: str = 'a' (optional, keyword-only)

      CSS selector used to find the elements containing the links.

    • label: str | None = None (optional, keyword-only)

      Label for the newly created Request objects, used for request routing.

    • user_data: dict[str, Any] | None = None (optional, keyword-only)

      User data to be provided to the newly created Request objects.

    • limit: int (keyword-only)

      Maximum number of requests to be enqueued.

    • base_url: str (keyword-only)

      Base URL to be used for relative URLs.

    • strategy: EnqueueStrategy (keyword-only)

      Enqueueing strategy, see the EnqueueStrategy enum for possible values and their meanings.

    • include: list[re.Pattern | Glob] (keyword-only)

      List of regular expressions or globs that URLs must match to be enqueued.

    • exclude: list[re.Pattern | Glob] (keyword-only)

      List of regular expressions or globs that URLs must not match to be enqueued.

    Returns Coroutine[None, None, None]
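
The call shape documented above can be sketched with a small stand-in. This is not the library's implementation: RecordingEnqueueLinks and handle_page are hypothetical names introduced only for illustration, and the stand-in merely records the arguments it receives instead of extracting or enqueueing anything. It is meant to show that every argument is keyword-only, that selector, label, and user_data fall back to their documented defaults, and that the remaining parameters travel through **kwargs.

```python
from __future__ import annotations

import asyncio
import re
from typing import Any


class RecordingEnqueueLinks:
    """Hypothetical stand-in that mirrors the documented call signature."""

    def __init__(self) -> None:
        self.calls: list[dict[str, Any]] = []

    async def __call__(
        self,
        *,
        selector: str = 'a',
        label: str | None = None,
        user_data: dict[str, Any] | None = None,
        **kwargs: Any,
    ) -> None:
        # Record the arguments; a real implementation would extract links
        # from the current page and enqueue them (assumption: not shown here).
        self.calls.append(
            {'selector': selector, 'label': label, 'user_data': user_data, **kwargs}
        )


async def handle_page(enqueue_links: RecordingEnqueueLinks) -> None:
    # Hypothetical handler: first rely on the defaults (selector='a').
    await enqueue_links()
    # Then narrow the selection, attach a routing label and user data,
    # and pass limit/include as additional keyword arguments.
    await enqueue_links(
        selector='a.product',
        label='DETAIL',
        user_data={'source': 'listing'},
        limit=10,
        include=[re.compile(r'/products/')],
    )


recorder = RecordingEnqueueLinks()
asyncio.run(handle_page(recorder))
```

After the run, recorder.calls holds two entries: the first contains only the default values, and the second carries the label, user data, limit, and include pattern. Because the function returns a coroutine, each call has to be awaited; positional arguments would raise a TypeError.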
