puppeteerClickElements
References
enqueueLinksByClickingElements
Interfaces
EnqueueLinksByClickingElementsOptions
optional clickOptions
optional forefront
If set to true:
- while adding the request to the queue: the request will be added to the foremost position in the queue.
- while reclaiming the request: the request will be placed at the beginning of the queue, so that it's returned in the next call to RequestQueue.fetchNextRequest.
By default, it's put at the end of the queue.
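The ordering effect of forefront can be illustrated with a toy in-memory queue. This is only a sketch of the ordering described above, not the RequestQueue implementation; ToyQueue, add and fetchNext are hypothetical names invented for the illustration.

```javascript
// Toy model of ordering only. The real RequestQueue is persistent and asynchronous.
class ToyQueue {
    constructor() {
        this.items = [];
    }

    // forefront: true puts the URL at the head of the queue, otherwise at the tail.
    add(url, { forefront = false } = {}) {
        if (forefront) this.items.unshift(url);
        else this.items.push(url);
    }

    // Returns the next URL to process, or null when the queue is empty.
    fetchNext() {
        return this.items.length > 0 ? this.items.shift() : null;
    }
}

const queue = new ToyQueue();
queue.add('https://example.com/a');
queue.add('https://example.com/b');
queue.add('https://example.com/urgent', { forefront: true });

// The forefront request is fetched before the two added earlier.
const firstFetched = queue.fetchNext();
const secondFetched = queue.fetchNext();
```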
optional globs
An array of glob pattern strings or plain objects containing glob pattern strings matching the URLs to be enqueued.
The plain objects must include at least the glob property, which holds the glob pattern string.
All remaining keys will be used as request options for the corresponding enqueued Request objects.
The matching is always case-insensitive. If you need case-sensitive matching, use the regexps property directly.
If globs is an empty array or undefined, then the function enqueues all the intercepted navigation requests produced by the page after clicking on elements matching the provided CSS selector.
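As a rough illustration, the two accepted shapes of a globs entry might look like the following. The option names come from this page; the URLs, the selector and the userData contents are made up for the example.

```javascript
// Option names are taken from this page; all values are illustrative assumptions.
const globOptions = {
    selector: 'a.product-link',
    globs: [
        // A plain glob pattern string.
        'https://example.com/products/**',
        // A plain object: `glob` is required, the remaining keys become request options.
        { glob: 'https://example.com/categories/*', userData: { kind: 'category' } },
    ],
};

// `page` and `requestQueue` are also required and would be supplied at call time:
// await enqueueLinksByClickingElements({ page, requestQueue, ...globOptions });
```

Passing an empty globs array (or omitting it) would, per the description above, enqueue every intercepted navigation request, so a narrow selector matters even more in that case.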
optional label
Sets Request.label for newly enqueued requests.
optional maxWaitForPageIdleSecs
This is the maximum period for which the function will keep tracking events, even if more events keep coming.
Its purpose is to prevent a deadlock in the page caused by periodic events, often unrelated to the clicking itself.
See waitForPageIdleSecs for an explanation.
page
Puppeteer Page object.
optional pseudoUrls
NOTE: This option will be removed in a future version of the SDK. Please use globs or regexps instead.
An array of PseudoUrl strings or plain objects containing PseudoUrl strings matching the URLs to be enqueued.
The plain objects must include at least the purl property, which holds the pseudo-URL pattern string.
All remaining keys will be used as request options for the corresponding enqueued Request objects.
With a pseudo-URL string, the matching is always case-insensitive. If you need case-sensitive matching, use the regexps property directly.
If pseudoUrls is an empty array or undefined, then the function enqueues all the intercepted navigation requests produced by the page after clicking on elements matching the provided CSS selector.
optional regexps
An array of regular expressions or plain objects containing regular expressions matching the URLs to be enqueued.
The plain objects must include at least the regexp property, which holds the regular expression.
All remaining keys will be used as request options for the corresponding enqueued Request objects.
If regexps is an empty array or undefined, then the function enqueues all the intercepted navigation requests produced by the page after clicking on elements matching the provided CSS selector.
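Since regexps is the suggested route to case-sensitive matching, a small sketch may help. It assumes standard JavaScript RegExp semantics (case-sensitive unless the i flag is set); the URLs, the selector and the userData contents are invented for the example.

```javascript
// Option names are taken from this page; all values are illustrative assumptions.
const regexpOptions = {
    selector: 'button.load-more',
    regexps: [
        // A bare regular expression, case-sensitive because it has no `i` flag.
        /^https:\/\/example\.com\/Items\/\d+$/,
        // A plain object: `regexp` is required, the remaining keys become request options.
        { regexp: /^https:\/\/example\.com\/search\?/i, userData: { kind: 'search' } },
    ],
};

const caseSensitive = regexpOptions.regexps[0];
// Only the exact-case URL matches the first pattern.
const matchesExactCase = caseSensitive.test('https://example.com/Items/42');
const matchesLowerCase = caseSensitive.test('https://example.com/items/42');
```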
requestQueue
A request queue to which the URLs will be enqueued.
selector
A CSS selector matching elements to be clicked on. Unlike in enqueueLinks, there is no default value. This is to prevent suboptimal use of this function by using it too broadly.
optional skipNavigation
If set to true, tells the crawler to skip navigation and process the request directly.
optional transformRequestFunction
Just before a new Request is constructed and enqueued to the RequestQueue, this function can be used to remove it or modify its contents such as userData, payload or, most importantly, uniqueKey. This is useful when you need to enqueue multiple Requests to the queue that share the same URL, but differ in methods or payloads, or to dynamically update or create userData.
For example: by adding useExtendedUniqueKey: true to the request object, uniqueKey will be computed from a combination of url, method and payload, which enables crawling of websites that navigate using form submits (POST requests).
Example:
{
    transformRequestFunction: (request) => {
        request.userData.foo = 'bar';
        request.useExtendedUniqueKey = true;
        return request;
    }
}
optional userData
Sets Request.userData for newly enqueued requests.
optional waitForPageIdleSecs
Clicking in the page triggers various asynchronous operations that lead to new URLs being shown by the browser. It could be a simple JavaScript redirect or the opening of a new tab in the browser. These events often happen only some time after the actual click. Requests typically take milliseconds, while new tabs open in hundreds of milliseconds.
To be able to capture all those events, the enqueueLinksByClickingElements() function repeatedly waits for waitForPageIdleSecs seconds. By repeatedly we mean that whenever a relevant event is triggered, the timer is restarted. As long as new events keep coming, the function will not return, unless the maxWaitForPageIdleSecs timeout is reached.
You may want to reduce this, for example when you're sure that your clicks do not open new tabs, or increase it when you're not getting all the expected URLs.
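The interplay of the restartable idle timer and the hard cap can be sketched as a pure simulation. This is not the library's code, which works with real timers and browser events; simulateIdleWait is a hypothetical helper, and it assumes both timers are measured from the moment of the click.

```javascript
// Pure simulation: given the times (ms after the click) at which relevant events
// fire, compute the time at which the wait would end.
function simulateIdleWait(eventTimesMs, waitForPageIdleSecs, maxWaitForPageIdleSecs) {
    const idleMs = waitForPageIdleSecs * 1000;
    const maxMs = maxWaitForPageIdleSecs * 1000;
    let deadline = idleMs; // the idle timer starts right after the click

    for (const t of [...eventTimesMs].sort((a, b) => a - b)) {
        if (t >= deadline || t >= maxMs) break; // the wait has already ended
        deadline = t + idleMs; // a relevant event restarts the idle timer
    }
    return Math.min(deadline, maxMs); // the hard cap always wins
}

// Two events shortly after the click: the wait ends one idle period after the last one.
const quietPage = simulateIdleWait([200, 700], 1, 5);

// Events that keep arriving less than a second apart: only the cap ends the wait.
const noisyPage = simulateIdleWait([500, 1400, 2300, 3200, 4100, 4900], 1, 5);
```

In this model a page with a periodic event, such as a polling request every 900 ms, never goes idle, which is the situation the cap is meant to guard against.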
Functions
isTargetRelevant
We're only interested in pages created by the page we're currently clicking in. There will generally be a lot of other targets being created in the browser.
Parameters
page: Page
target: Target
Returns boolean
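One plausible way to express such a check is to compare the new target's opener with the target of the page being clicked. The sketch below assumes that approach and relies only on Puppeteer's Target.opener() and Page.target(); isTargetRelevantSketch is a hypothetical name, and the objects underneath it are minimal stand-ins rather than real Puppeteer instances.

```javascript
// Sketch: a target counts as relevant when it was opened by the page being clicked.
function isTargetRelevantSketch(page, target) {
    const opener = target.opener();
    return opener != null && opener === page.target();
}

// Minimal stand-ins for Puppeteer objects, for illustration only.
const pageTarget = { opener: () => undefined };
const page = { target: () => pageTarget };
const popupFromPage = { opener: () => pageTarget };
const popupFromElsewhere = { opener: () => ({}) };

const relevant = isTargetRelevantSketch(page, popupFromPage);
const irrelevant = isTargetRelevantSketch(page, popupFromElsewhere);
```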
The clickOptions option of EnqueueLinksByClickingElementsOptions holds click options for use in Puppeteer's click handler.