CurlImpersonateHttpClient
Hierarchy
- BaseHttpClient
- CurlImpersonateHttpClient
Methods
__init__
A default constructor.
Parameters
(optional, keyword-only) persist_cookies_per_session: bool = True
Whether to persist cookies per HTTP session.
(optional, keyword-only) additional_http_error_status_codes: Iterable[int] = ()
Additional HTTP status codes to treat as errors.
(optional, keyword-only) ignore_http_error_status_codes: Iterable[int] = ()
HTTP status codes that should not be treated as errors.
(optional, keyword-only) async_session_kwargs: Any
Additional keyword arguments for curl_cffi.requests.AsyncSession.
Returns None
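To make the interplay of the two status-code parameters concrete, here is a minimal, stdlib-only sketch. It is not the library's implementation: the helper name `is_error_status`, the assumption that 5xx responses are errors by default, and the rule that the "additional" set takes precedence over the "ignore" set are all illustrative assumptions.

```python
from collections.abc import Iterable


def is_error_status(
    status_code: int,
    additional_http_error_status_codes: Iterable[int] = (),
    ignore_http_error_status_codes: Iterable[int] = (),
) -> bool:
    """Decide whether a status code counts as an error (hypothetical helper)."""
    # Codes the user explicitly added are always treated as errors (assumed precedence).
    if status_code in set(additional_http_error_status_codes):
        return True
    # Codes the user asked to ignore are never treated as errors.
    if status_code in set(ignore_http_error_status_codes):
        return False
    # Assumed default for this sketch: only server errors (5xx) are errors.
    return 500 <= status_code <= 599
```

With these assumptions, passing `additional_http_error_status_codes=(404,)` makes a 404 an error, while `ignore_http_error_status_codes=(503,)` lets a 503 through.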
crawl
Perform the crawling for a given request.
This method is called from crawler.run().
Parameters
request: Request
The request to be crawled.
(optional, keyword-only) session: Session | None = None
The session associated with the request.
(optional, keyword-only) proxy_info: ProxyInfo | None = None
The information about the proxy to be used.
(optional, keyword-only) statistics: Statistics | None = None
The statistics object to register status codes.
Returns HttpCrawlingResult
send_request
Send an HTTP request via the client.
This method is called from the context.send_request() helper.
Parameters
url: str
The URL to send the request to.
(optional, keyword-only) method: HttpMethod = 'GET'
The HTTP method to use.
(optional, keyword-only) headers: (HttpHeaders | dict[str, str]) | None = None
The headers to include in the request.
(optional, keyword-only) payload: HttpPayload | None = None
The data to be sent as the request body.
(optional, keyword-only) session: Session | None = None
The session associated with the request.
(optional, keyword-only) proxy_info: ProxyInfo | None = None
The information about the proxy to be used.
Returns HttpResponse
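Since send_request is normally reached through the context.send_request() helper rather than called directly, a short sketch of that path may help. The import paths (`crawlee.crawlers`, `crawlee.http_clients`), the choice of `HttpCrawler`, and the example URLs are assumptions that may differ between library versions. The imports are deferred into `main` so the module loads even where the optional dependency is not installed.

```python
import asyncio


async def main() -> None:
    # Deferred imports: module paths are assumptions and may vary by version.
    from crawlee.crawlers import HttpCrawler, HttpCrawlingContext
    from crawlee.http_clients import CurlImpersonateHttpClient

    crawler = HttpCrawler(http_client=CurlImpersonateHttpClient())

    @crawler.router.default_handler
    async def handler(context: HttpCrawlingContext) -> None:
        # The helper delegates to the configured HTTP client's send_request.
        response = await context.send_request("https://example.com/")
        context.log.info(f"Extra request returned status {response.status_code}")

    # Placeholder start URL, used only for illustration.
    await crawler.run(["https://example.com/"])
```

To try it, call `asyncio.run(main())` in an environment where the library and its curl-cffi extra are installed.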
HTTP client based on the curl-cffi library.
This client uses the curl-cffi library to perform HTTP requests in crawlers (BasicCrawler subclasses) and to manage sessions, proxies, and error handling.
See the BaseHttpClient class for information common to all HTTP clients.
Usage
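A minimal sketch of wiring the client into a crawler follows. The import paths, the use of `HttpCrawler`, and the specific option values are assumptions rather than taken from this page. `impersonate` and `timeout` are examples of extra keyword arguments forwarded to curl_cffi.requests.AsyncSession, and the accepted `impersonate` values depend on the installed curl-cffi version.

```python
# Options mirroring the constructor parameters documented above.
CLIENT_OPTIONS = {
    "persist_cookies_per_session": True,
    "additional_http_error_status_codes": (403,),
    "ignore_http_error_status_codes": (404,),
    # Forwarded to curl_cffi.requests.AsyncSession (values are illustrative).
    "impersonate": "chrome",
    "timeout": 10,
}


def build_crawler():
    """Create a crawler that uses the curl-cffi based client (hypothetical helper)."""
    # Deferred imports: module paths are assumptions and may vary by version.
    from crawlee.crawlers import HttpCrawler
    from crawlee.http_clients import CurlImpersonateHttpClient

    http_client = CurlImpersonateHttpClient(**CLIENT_OPTIONS)
    return HttpCrawler(http_client=http_client)
```

Keeping the options in a plain dictionary makes it easy to swap the client configuration without touching the crawler setup.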