BaseHttpClient
Hierarchy
- BaseHttpClient
Index
Methods
__init__
A default constructor.
Parameters
persist_cookies_per_session: bool = True (optional, keyword-only)
Whether to persist cookies per HTTP session.
additional_http_error_status_codes: Iterable[int] = () (optional, keyword-only)
Additional HTTP status codes to treat as errors.
ignore_http_error_status_codes: Iterable[int] = () (optional, keyword-only)
HTTP status codes to exclude from being treated as errors.
Returns None
crawl
Perform the crawling for a given request.
This method is called from crawler.run().
Parameters
request: Request (keyword-only)
The request to be crawled.
session: Session | None = None (optional, keyword-only)
The session associated with the request.
proxy_info: ProxyInfo | None = None (optional, keyword-only)
The information about the proxy to be used.
statistics: Statistics | None = None (optional, keyword-only)
The statistics object in which to register status codes.
Returns HttpCrawlingResult
send_request
Send an HTTP request via the client.
This method is called from the context.send_request() helper.
Parameters
url: str (keyword-only)
The URL to send the request to.
method: HttpMethod = 'GET' (optional, keyword-only)
The HTTP method to use.
headers: (HttpHeaders | dict[str, str]) | None = None (optional, keyword-only)
The headers to include in the request.
payload: HttpPayload | None = None (optional, keyword-only)
The data to be sent as the request body.
session: Session | None = None (optional, keyword-only)
The session associated with the request.
proxy_info: ProxyInfo | None = None (optional, keyword-only)
The information about the proxy to be used.
Returns HttpResponse
An abstract base class for HTTP clients used in crawlers (BasicCrawler subclasses).
A specific HTTP client should use the _raise_for_error_status_code method to check the status code, so that behaviour stays consistent across different HTTP clients. The method raises an HttpStatusCodeError when it encounters an error response, defined by default as any HTTP status code in the range 400 to 599. The error handling is customizable: the user can specify additional status codes to treat as errors, or exclude specific status codes from being considered errors. See the additional_http_error_status_codes and ignore_http_error_status_codes arguments in the constructor.
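The status-code rule described above can be illustrated with a small standalone sketch. It is not the library's implementation: is_error_status and raise_for_error_status_code are hypothetical helper names, the HttpStatusCodeError class below is a local stand-in, and the choice to let an ignored code win over an additional code when both list the same value is an assumption made for this example only.

```python
from collections.abc import Iterable


class HttpStatusCodeError(Exception):
    """Local stand-in for the library's error type (hypothetical definition)."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP error status code: {status_code}")
        self.status_code = status_code


def is_error_status(
    status_code: int,
    additional: Iterable[int] = (),
    ignore: Iterable[int] = (),
) -> bool:
    """Decide whether a status code counts as an error.

    Assumption for this sketch: an ignored code takes precedence over an
    additional code if the same value appears in both collections.
    """
    if status_code in set(ignore):
        return False
    if status_code in set(additional):
        return True
    # Default rule from the description: any code from 400 to 599 is an error.
    return 400 <= status_code <= 599


def raise_for_error_status_code(
    status_code: int,
    additional: Iterable[int] = (),
    ignore: Iterable[int] = (),
) -> None:
    """Raise HttpStatusCodeError when the status code is classified as an error."""
    if is_error_status(status_code, additional, ignore):
        raise HttpStatusCodeError(status_code)


if __name__ == "__main__":
    raise_for_error_status_code(200)                 # not an error, returns None
    raise_for_error_status_code(404, ignore=[404])   # explicitly ignored
    try:
        raise_for_error_status_code(302, additional=[302])
    except HttpStatusCodeError as exc:
        print(f"raised for status {exc.status_code}")
```

Centralizing the check in one helper means every concrete client classifies responses the same way, which is the consistency the class description asks for.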