HttpClient
Methods
__aenter__
Initialize the client when entering the context manager.
Returns HttpClient
__aexit__
Deinitialize the client and clean up resources when exiting the context manager.
Parameters
exc_type: BaseException | None
exc_value: BaseException | None
traceback: TracebackType | None
Returns None
__init__
Initialize a new instance.
Parameters
persist_cookies_per_session: bool = True (optional, keyword-only)
Whether to persist cookies per HTTP session.
Returns None
cleanup
Clean up resources used by the client.
This method is called when the client is no longer needed and should be overridden in subclasses to perform any necessary cleanup such as closing connections, releasing file handles, or other resource deallocation.
Returns None
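The lifecycle described by __aenter__, __aexit__, and cleanup can be illustrated with a minimal sketch. SketchClient below is a hypothetical stand-in, not the library's HttpClient; it assumes that cleanup is a coroutine and that __aexit__ calls it, neither of which this page states explicitly.

```python
import asyncio


class SketchClient:
    """Hypothetical stand-in that mimics the documented lifecycle methods."""

    def __init__(self, *, persist_cookies_per_session: bool = True) -> None:
        self.persist_cookies_per_session = persist_cookies_per_session
        self.active = False
        self.open_connections = []  # pretend resources owned by the client

    async def __aenter__(self) -> "SketchClient":
        # Mark the context as active when entering `async with`.
        self.active = True
        return self

    async def __aexit__(self, exc_type, exc_value, traceback) -> None:
        # Assumption: leaving the context triggers cleanup.
        await self.cleanup()
        self.active = False

    async def cleanup(self) -> None:
        # A subclass would close connections or release file handles here.
        self.open_connections.clear()


async def main() -> tuple:
    client = SketchClient()
    async with client as entered:
        entered.open_connections.append("conn-1")
        active_inside = entered.active
    return active_inside, client.active, len(client.open_connections)


result = asyncio.run(main())
```

In this sketch the client is active only inside the `async with` block, and the resources it holds are released on exit.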
crawl
Perform the crawling for a given request.
This method is called from crawler.run().
Parameters
request: Request
The request to be crawled.
session: Session | None = None (optional, keyword-only)
The session associated with the request.
proxy_info: ProxyInfo | None = None (optional, keyword-only)
The information about the proxy to be used.
statistics: Statistics | None = None (optional, keyword-only)
The statistics object to register status codes.
Returns HttpCrawlingResult
send_request
Send an HTTP request via the client.
This method is called from the context.send_request() helper.
Parameters
url: str
The URL to send the request to.
method: Literal['GET', 'HEAD', 'POST', 'PUT', 'DELETE', 'CONNECT', 'OPTIONS', 'TRACE', 'PATCH'] = 'GET' (optional, keyword-only)
The HTTP method to use.
headers: (HttpHeaders | dict[str, str]) | None = None (optional, keyword-only)
The headers to include in the request.
payload: bytes | None = None (optional, keyword-only)
The data to be sent as the request body.
session: Session | None = None (optional, keyword-only)
The session associated with the request.
proxy_info: ProxyInfo | None = None (optional, keyword-only)
The information about the proxy to be used.
Returns HttpResponse
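Because payload is typed as bytes, structured data such as JSON has to be encoded before it is passed in. The sketch below shows that step against a hypothetical fake_send_request coroutine that only mirrors the keyword-only parameters listed above and echoes them back; it is not the library's implementation and performs no network I/O.

```python
import asyncio
import json


async def fake_send_request(url, *, method="GET", headers=None, payload=None):
    """Hypothetical stand-in that echoes its arguments instead of sending them."""
    return {
        "url": url,
        "method": method,
        "headers": dict(headers or {}),
        "payload": payload,
    }


async def main() -> dict:
    # Encode the JSON document to bytes, since payload expects bytes.
    body = json.dumps({"query": "crawlee"}).encode("utf-8")
    return await fake_send_request(
        "https://example.com/api",
        method="POST",
        headers={"Content-Type": "application/json"},
        payload=body,
    )


echo = asyncio.run(main())
```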
stream
Stream an HTTP request via the client.
This method should be used for downloading potentially large data where you need to process the response body in chunks rather than loading it entirely into memory.
Parameters
url: str
The URL to send the request to.
method: Literal['GET', 'HEAD', 'POST', 'PUT', 'DELETE', 'CONNECT', 'OPTIONS', 'TRACE', 'PATCH'] = 'GET' (optional, keyword-only)
The HTTP method to use.
headers: (HttpHeaders | dict[str, str]) | None = None (optional, keyword-only)
The headers to include in the request.
payload: bytes | None = None (optional, keyword-only)
The data to be sent as the request body.
session: Session | None = None (optional, keyword-only)
The session associated with the request.
proxy_info: ProxyInfo | None = None (optional, keyword-only)
The information about the proxy to be used.
timeout: timedelta | None = None (optional, keyword-only)
The maximum time to wait for establishing the connection.
Returns AbstractAsyncContextManager[HttpResponse]
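Since stream returns an async context manager, the response body can be consumed chunk by chunk inside an `async with` block. The following sketch illustrates that pattern with hypothetical stand-ins: fake_stream and FakeStreamedResponse, including its read_stream method, are assumptions made for this example and produce in-memory data rather than a real download.

```python
import asyncio
import hashlib
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager


class FakeStreamedResponse:
    """Hypothetical stand-in for a streamed response."""

    def __init__(self, body: bytes, chunk_size: int) -> None:
        self._body = body
        self._chunk_size = chunk_size

    async def read_stream(self) -> AsyncIterator[bytes]:
        # Yield the body in fixed-size chunks instead of all at once.
        for start in range(0, len(self._body), self._chunk_size):
            yield self._body[start:start + self._chunk_size]


@asynccontextmanager
async def fake_stream(url: str) -> AsyncIterator[FakeStreamedResponse]:
    # A real client would open the connection here and close it on exit.
    yield FakeStreamedResponse(b"x" * 10_000, chunk_size=4096)


async def download_digest(url: str) -> tuple:
    hasher = hashlib.sha256()
    total = 0
    chunks = 0
    async with fake_stream(url) as response:
        async for chunk in response.read_stream():
            # Process each chunk as it arrives; the full body is never held here.
            hasher.update(chunk)
            total += len(chunk)
            chunks += 1
    return total, chunks, hasher.hexdigest()


total_bytes, chunk_count, digest = asyncio.run(
    download_digest("https://example.com/large-file")
)
```

The consumer only ever holds one chunk at a time, which is the property the method description above asks for when downloading large bodies.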
Properties
active
Indicate whether the context is active.
An abstract base class for HTTP clients used in crawlers (BasicCrawler subclasses).