EnqueueLinksFunction
Methods
__call__
A call dunder method.
Parameters
optional keyword-only selector: str = 'a'
A selector used to find the elements containing the links. The behaviour differs based on the crawler used:
- PlaywrightCrawler supports CSS and XPath selectors.
- ParselCrawler supports CSS selectors.
- BeautifulSoupCrawler supports CSS selectors.
optional keyword-only label: str | None = None
Label for the newly created Request objects, used for request routing.
optional keyword-only user_data: dict[str, Any] | None = None
User data to be provided to the newly created Request objects.
optional keyword-only transform_request_function: Callable[[RequestOptions], RequestOptions | RequestTransformAction] | None = None
A function that takes RequestOptions and returns either:
- Modified RequestOptions to update the request configuration,
- 'skip' to exclude the request from being enqueued,
- 'unchanged' to use the original request options without modification.
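The three return values described above can be sketched with a small transform function. This is a minimal illustration, not the library's own code: `RequestOptions` and `RequestTransformAction` are replaced here by stand-in aliases so the snippet runs on its own, and the `url` and `label` keys, the `transform_request` name, and the URL rules are assumptions made for the example.

```python
from __future__ import annotations

from typing import Any, Literal

# Stand-in aliases for illustration only (assumption); in real code the
# corresponding types would be imported from the crawling library.
RequestOptions = dict[str, Any]
RequestTransformAction = Literal['skip', 'unchanged']


def transform_request(options: RequestOptions) -> RequestOptions | RequestTransformAction:
    """Hypothetical transform: drop PDFs, label product pages, keep the rest."""
    url = options['url']  # assumes the options carry a 'url' key

    if url.endswith('.pdf'):
        # Exclude the request from being enqueued.
        return 'skip'

    if '/product/' in url:
        # Return a modified copy of the options with a routing label set.
        return {**options, 'label': 'PRODUCT'}

    # Use the original request options without modification.
    return 'unchanged'
```

Such a function would be passed as `transform_request_function=transform_request`. Returning a new dictionary rather than mutating the input keeps the transform free of side effects.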
keyword-only optional limit: int
Maximum number of requests to be enqueued.
keyword-only optional base_url: str
Base URL to be used for relative URLs.
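To show what resolving relative URLs against a base URL means in practice, here is a small sketch using the standard library's `urljoin`. Whether the crawler uses this exact routine internally is not stated on this page, so treat it as an illustration of the concept only; the example URLs are made up.

```python
from urllib.parse import urljoin

# Hypothetical base URL and link targets, as they might appear in href attributes.
base_url = 'https://example.com/docs/'
hrefs = ['guide.html', '/pricing', 'https://other.example/page']

# Relative links are resolved against the base; absolute links are kept as-is.
resolved = [urljoin(base_url, href) for href in hrefs]

for href, url in zip(hrefs, resolved):
    print(f'{href!r} -> {url}')
```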
keyword-only optional strategy: EnqueueStrategy
Enqueueing strategy, see the EnqueueStrategy enum for possible values and their meanings.
keyword-only optional include: list[re.Pattern | Glob]
List of regular expressions or globs that URLs must match to be enqueued.
keyword-only optional exclude: list[re.Pattern | Glob]
List of regular expressions or globs that URLs must not match to be enqueued.
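The combined effect of `include` and `exclude` can be approximated with plain regular expressions. The sketch below assumes that a URL qualifies when it matches at least one include pattern and no exclude pattern, and it uses `re.Pattern.search` for matching; the library's exact matching rules are not spelled out here, and `is_allowed` is a hypothetical helper, not part of the API. Globs are left out to keep the example dependency-free.

```python
from __future__ import annotations

import re


def is_allowed(
    url: str,
    include: list[re.Pattern] | None = None,
    exclude: list[re.Pattern] | None = None,
) -> bool:
    """Hypothetical filter mirroring the include/exclude description."""
    # A single matching exclude pattern rejects the URL.
    if exclude and any(pattern.search(url) for pattern in exclude):
        return False
    # When include patterns are given, at least one of them must match.
    if include:
        return any(pattern.search(url) for pattern in include)
    # No include patterns: everything not excluded passes.
    return True


include = [re.compile(r'^https://example\.com/blog/')]
exclude = [re.compile(r'\.pdf$')]

urls = [
    'https://example.com/blog/post-1',
    'https://example.com/blog/whitepaper.pdf',
    'https://example.com/shop/item-7',
]
allowed = [url for url in urls if is_allowed(url, include, exclude)]
```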
Returns Coroutine[None, None, None]
A function for enqueueing new URLs to crawl based on elements selected by a given selector.
It extracts URLs from the current page and enqueues them for further crawling. It allows filtering through selectors and other options. You can also specify labels and user data to be associated with the newly created Request objects.
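A call with several of the keyword-only parameters might look like the sketch below. It assumes the function is exposed on the handler's context as `enqueue_links`, which this page does not confirm; `handle_listing`, the selector, the label, and the user data are hypothetical. A small `FakeContext` stands in for a real crawling context so the snippet runs without a crawler.

```python
import asyncio
from typing import Any


async def handle_listing(context: Any) -> None:
    """Hypothetical request handler that enqueues article links from a listing page."""
    await context.enqueue_links(
        selector='a.article-link',        # CSS selector for the link elements
        label='ARTICLE',                  # routing label for the new requests
        user_data={'source': 'listing'},  # extra data attached to each request
        limit=20,                         # enqueue at most 20 requests
    )


class FakeContext:
    """Stand-in context (assumption) that only records the calls it receives."""

    def __init__(self) -> None:
        self.calls: list[dict[str, Any]] = []

    async def enqueue_links(self, **kwargs: Any) -> None:
        # Keyword-only, like the documented parameters.
        self.calls.append(kwargs)


fake = FakeContext()
asyncio.run(handle_listing(fake))
print(fake.calls)
```

Because the call returns a coroutine, it has to be awaited; forgetting `await` would leave the links unenqueued.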