BrowserPool
Methods
__aenter__
Enter the context manager and initialize all browser plugins.
Returns BrowserPool
__aexit__
Exit the context manager and close all browser plugins.
Parameters
exc_type: type[BaseException] | None (optional, keyword-only)
exc_value: BaseException | None (optional, keyword-only)
exc_traceback: TracebackType | None (optional, keyword-only)
Returns None
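The enter and exit behavior described above can be pictured with a small stand-alone model. This is only a sketch under stated assumptions: `ToyPlugin` and `ToyPool` are hypothetical stand-ins written for illustration, not the library's classes, and the real pool may initialize and close its plugins differently.

```python
import asyncio


class ToyPlugin:
    """Hypothetical stand-in for a browser plugin (not a library class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.active = False

    async def __aenter__(self) -> "ToyPlugin":
        self.active = True  # pretend to start the automation library
        return self

    async def __aexit__(self, exc_type, exc_value, exc_traceback) -> None:
        self.active = False  # pretend to release its resources


class ToyPool:
    """Toy pool that enters and exits every plugin together with itself."""

    def __init__(self, plugins: list) -> None:
        self.plugins = plugins
        self.active = False

    async def __aenter__(self) -> "ToyPool":
        for plugin in self.plugins:
            await plugin.__aenter__()
        self.active = True
        return self

    async def __aexit__(self, exc_type, exc_value, exc_traceback) -> None:
        for plugin in self.plugins:
            await plugin.__aexit__(exc_type, exc_value, exc_traceback)
        self.active = False


async def demo_context() -> tuple:
    pool = ToyPool([ToyPlugin("a"), ToyPlugin("b")])
    async with pool:
        # Inside the block, the pool and all of its plugins are active.
        inside = pool.active and all(p.active for p in pool.plugins)
    # After the block, everything has been shut down again.
    outside = pool.active or any(p.active for p in pool.plugins)
    return inside, outside
```

Using `async with` guarantees that the exit step runs even when the body raises, which is why a pool of this kind is normally used as a context manager rather than opened and closed by hand.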
__init__
A default constructor.
Parameters
plugins: Sequence[BaseBrowserPlugin] | None = None (optional, keyword-only)
Browser plugins serve as wrappers around various browser automation libraries, providing a consistent interface across different libraries.
operation_timeout: timedelta = timedelta(seconds=15) (optional, keyword-only)
Operations of the underlying automation libraries, such as launching a browser or opening a new page, can sometimes get stuck. To prevent BrowserPool from becoming unresponsive, we add a timeout to these operations.
browser_inactive_threshold: timedelta = timedelta(seconds=10) (optional, keyword-only)
The period of inactivity after which a browser is considered as inactive.
identify_inactive_browsers_interval: timedelta = timedelta(seconds=20) (optional, keyword-only)
The interval at which the pool checks for browsers that have been idle longer than browser_inactive_threshold and marks them as inactive.
close_inactive_browsers_interval: timedelta = timedelta(seconds=30) (optional, keyword-only)
The interval at which the pool checks for inactive browsers and closes them. The browser is considered as inactive if it has no active pages and has been idle for the specified period.
Returns None
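To make the purpose of operation_timeout concrete, here is a minimal sketch of bounding a possibly stuck operation with a timedelta. It uses only the standard library; `run_with_timeout`, `quick_launch`, and `stuck_launch` are hypothetical names invented for this illustration, and the library's own timeout handling may be implemented differently.

```python
import asyncio
from datetime import timedelta


async def run_with_timeout(operation, timeout: timedelta):
    """Await `operation`; raise asyncio.TimeoutError if it exceeds `timeout`."""
    return await asyncio.wait_for(operation, timeout=timeout.total_seconds())


async def quick_launch() -> str:
    await asyncio.sleep(0)  # simulates a launch that completes promptly
    return "browser"


async def stuck_launch() -> str:
    await asyncio.sleep(10)  # simulates a launch that hangs
    return "browser"


async def demo_timeout() -> tuple:
    result = await run_with_timeout(quick_launch(), timedelta(seconds=1))
    try:
        await run_with_timeout(stuck_launch(), timedelta(milliseconds=50))
        timed_out = False
    except asyncio.TimeoutError:
        # The hanging operation is cancelled instead of blocking the caller.
        timed_out = True
    return result, timed_out
```

The point of the pattern is that a single hanging launch costs at most the configured timeout, after which the caller can retry or fail cleanly.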
new_page
Open a new page in a browser using the specified or a random browser plugin.
Parameters
page_id: str | None = None (optional, keyword-only)
The ID to assign to the new page. If not provided, a random ID is generated.
browser_plugin: BaseBrowserPlugin | None = None (optional, keyword-only)
The browser plugin to use for creating the new page. If not provided, the next plugin in the rotation is used.
proxy_info: ProxyInfo | None = None (optional, keyword-only)
The proxy configuration to use for the new page.
Returns CrawleePage
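The defaults described for page_id and browser_plugin can be modeled in a few lines. The sketch below is illustrative only: `ToyRotation` is a hypothetical class, plugins are represented by plain strings, and the use of `itertools.cycle` and `uuid4` is an assumption about how a rotation and a random ID could be produced, not a statement about the library's internals.

```python
import itertools
import uuid
from typing import Optional


class ToyRotation:
    """Hypothetical model of 'use the given plugin, else the next in rotation'."""

    def __init__(self, plugins: list) -> None:
        if not plugins:
            raise ValueError("at least one plugin is required")
        self._cycle = itertools.cycle(plugins)

    def new_page(
        self,
        *,
        page_id: Optional[str] = None,
        browser_plugin: Optional[str] = None,
    ) -> tuple:
        # An explicit plugin wins and does not advance the rotation.
        plugin = browser_plugin if browser_plugin is not None else next(self._cycle)
        # Assumed ID scheme: a random 32-character hex string.
        if page_id is None:
            page_id = uuid.uuid4().hex
        return page_id, plugin
```

With two plugins, three calls without arguments would pick the first, the second, and then the first again, while passing browser_plugin explicitly bypasses the rotation.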
new_page_with_each_plugin
Create a new page with each browser plugin in the pool.
This method is useful for running scripts in multiple environments simultaneously, typically for testing or website analysis. Each page is created using a different browser plugin, allowing you to interact with various browser types concurrently.
Returns Sequence[CrawleePage]
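One plausible way to open a page per plugin concurrently is to start one coroutine per plugin and gather the results. The sketch below assumes that approach; `open_page` and `new_page_with_each` are hypothetical helpers, and pages are represented by strings rather than real page objects.

```python
import asyncio


async def open_page(plugin_name: str) -> str:
    """Hypothetical page factory; the sleep stands in for real browser work."""
    await asyncio.sleep(0.01)
    return f"page-from-{plugin_name}"


async def new_page_with_each(plugin_names: list) -> list:
    # gather() runs the coroutines concurrently and returns the results
    # in the same order as the inputs.
    return await asyncio.gather(*(open_page(name) for name in plugin_names))
```

Because the results keep the input order, the caller can pair each returned page with the plugin that produced it.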
with_default_plugin
Create a new instance with a single PlaywrightBrowserPlugin configured with the provided options.
Parameters
browser_type: BrowserType | None = None (optional, keyword-only)
The type of browser to launch ('chromium', 'firefox', or 'webkit').
browser_launch_options: Mapping[str, Any] | None = None (optional, keyword-only)
Keyword arguments to pass to the browser launch method. These options are provided directly to Playwright's browser_type.launch method. For more details, refer to the Playwright documentation: https://playwright.dev/python/docs/api/class-browsertype#browser-type-launch.
browser_new_context_options: Mapping[str, Any] | None = None (optional, keyword-only)
Keyword arguments to pass to the browser new context method. These options are provided directly to Playwright's browser.new_context method. For more details, refer to the Playwright documentation: https://playwright.dev/python/docs/api/class-browser#browser-new-context.
headless: bool | None = None (optional, keyword-only)
Whether to run the browser in headless mode.
kwargs: Any (optional, keyword-only)
Additional keyword arguments passed to the default constructor.
Returns BrowserPool
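The shape of this method is the familiar alternative-constructor pattern: a classmethod builds one plugin from the given options and forwards everything else to the default constructor. The following sketch shows that pattern with toy classes. `ToyPlugin` and `ToyPool` are hypothetical, the fallback to 'chromium' is an assumption, and folding headless into the launch options is one possible design, not necessarily what the library does.

```python
from typing import Any, Mapping, Optional


class ToyPlugin:
    """Hypothetical plugin that just records its configuration."""

    def __init__(self, *, browser_type: str, launch_options: dict,
                 context_options: dict) -> None:
        self.browser_type = browser_type
        self.launch_options = launch_options
        self.context_options = context_options


class ToyPool:
    def __init__(self, *, plugins: list, **kwargs: Any) -> None:
        self.plugins = list(plugins)
        self.extra = kwargs  # options meant for the default constructor

    @classmethod
    def with_default_plugin(
        cls,
        *,
        browser_type: Optional[str] = None,
        browser_launch_options: Optional[Mapping[str, Any]] = None,
        browser_new_context_options: Optional[Mapping[str, Any]] = None,
        headless: Optional[bool] = None,
        **kwargs: Any,
    ) -> "ToyPool":
        launch_options = dict(browser_launch_options or {})
        if headless is not None:
            # Assumption: an explicit headless flag overrides the mapping.
            launch_options["headless"] = headless
        plugin = ToyPlugin(
            browser_type=browser_type or "chromium",  # assumed fallback
            launch_options=launch_options,
            context_options=dict(browser_new_context_options or {}),
        )
        # Remaining keyword arguments go to the default constructor.
        return cls(plugins=[plugin], **kwargs)
```

The benefit of this design is that callers who need only one browser get a one-line setup, while the full constructor remains available for pools with several plugins.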
Properties
active
Indicate whether the context is active.
active_browsers
Return the active browsers in the pool.
inactive_browsers
Return the inactive browsers in the pool.
pages
Return the pages in the pool.
plugins
Return the browser plugins.
total_pages_count
Return the total number of pages opened since the browser pool was launched.
Manage a pool of browsers and pages, handling their lifecycle and resource allocation.
The BrowserPool is responsible for opening and closing browsers, managing pages within those browsers, and handling the overall lifecycle of these resources. It provides flexible configuration via constructor options, which include various hooks that allow for the insertion of custom behavior at different stages of the browser and page lifecycles.

The browsers in the pool can be in one of three states: active, inactive, or closed.
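The three states can be illustrated with a small classifier. This is a simplified model under stated assumptions: `BrowserState` and `classify` are hypothetical names, and the rule used here (no open pages and idle for at least the inactivity threshold) paraphrases the parameter descriptions on this page rather than the library's actual bookkeeping.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class BrowserState(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"
    CLOSED = "closed"


def classify(
    *,
    open_pages: int,
    last_activity: datetime,
    is_closed: bool,
    now: datetime,
    inactive_threshold: timedelta = timedelta(seconds=10),
) -> BrowserState:
    """Return the state of one browser in this toy model."""
    if is_closed:
        return BrowserState.CLOSED
    idle_for = now - last_activity
    # Assumed rule: inactive means no pages and idle past the threshold.
    if open_pages == 0 and idle_for >= inactive_threshold:
        return BrowserState.INACTIVE
    return BrowserState.ACTIVE
```

In this model a periodic task would call `classify` for every browser to find inactive ones, and a second periodic task would close them, which mirrors the two interval parameters of the constructor.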