HttpxHttpClient
Hierarchy
- BaseHttpClient
- HttpxHttpClient
Methods
__init__
A default constructor.
Parameters
optional keyword-only persist_cookies_per_session: bool = True
Whether to persist cookies per HTTP session.
optional keyword-only additional_http_error_status_codes: Iterable[int] = ()
Additional HTTP status codes to treat as errors.
optional keyword-only ignore_http_error_status_codes: Iterable[int] = ()
HTTP status codes to ignore as errors.
optional keyword-only http1: bool = True
Whether to enable HTTP/1.1 support.
optional keyword-only http2: bool = True
Whether to enable HTTP/2 support.
optional keyword-only verify: str | bool | SSLContext = True
SSL certificates used to verify the identity of requested hosts.
optional keyword-only header_generator: HeaderGenerator | None = _DEFAULT_HEADER_GENERATOR
Header generator instance to use for generating common headers.
optional keyword-only async_client_kwargs: Any
Additional keyword arguments for httpx.AsyncClient.
Returns None
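The interaction between additional_http_error_status_codes and ignore_http_error_status_codes can be sketched in plain Python. This is an illustrative sketch only, not the library's implementation: the helper name is_error_status is hypothetical, and both the default rule (only 5xx responses count as errors) and the precedence of the "additional" set over the "ignore" set are assumptions made for the example.

```python
from collections.abc import Iterable


def is_error_status(
    status_code: int,
    *,
    additional_http_error_status_codes: Iterable[int] = (),
    ignore_http_error_status_codes: Iterable[int] = (),
) -> bool:
    """Hypothetical helper: decide whether a status code counts as an error."""
    additional = set(additional_http_error_status_codes)
    ignored = set(ignore_http_error_status_codes)

    # Codes listed as additional errors are always treated as errors
    # (assumed to take precedence over the ignore list).
    if status_code in additional:
        return True
    # Codes listed as ignored are never treated as errors.
    if status_code in ignored:
        return False
    # Assumed default: only 5xx server errors count as errors.
    return 500 <= status_code <= 599
```

Under these assumptions, passing additional_http_error_status_codes=[404] would make a 404 response count as an error, while ignore_http_error_status_codes=[503] would let a 503 response through.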
crawl
Perform the crawling for a given request.
This method is called from crawler.run().
Parameters
request: Request
The request to be crawled.
optional keyword-only session: Session | None = None
The session associated with the request.
optional keyword-only proxy_info: ProxyInfo | None = None
The information about the proxy to be used.
optional keyword-only statistics: Statistics | None = None
The statistics object to register status codes.
Returns HttpCrawlingResult
send_request
Send an HTTP request via the client.
This method is called from the context.send_request() helper.
Parameters
url: str
The URL to send the request to.
optional keyword-only method: HttpMethod = 'GET'
The HTTP method to use.
optional keyword-only headers: HttpHeaders | dict[str, str] | None = None
The headers to include in the request.
optional keyword-only payload: HttpPayload | None = None
The data to be sent as the request body.
optional keyword-only session: Session | None = None
The session associated with the request.
optional keyword-only proxy_info: ProxyInfo | None = None
The information about the proxy to be used.
Returns HttpResponse
HTTP client based on the HTTPX library.
This client uses the HTTPX library to perform HTTP requests in crawlers (BasicCrawler subclasses) and to manage sessions, proxies, and error handling.
See the BaseHttpClient class for more common information about HTTP clients.
Usage