ParsedHttpCrawlingContext

The crawling context used by AbstractHttpCrawler.

It provides access to key objects as well as utility functions for handling crawling tasks.

Methods

from_basic_crawling_context

  • from_basic_crawling_context(*, context, http_response): Self
  • Convenience constructor that creates an HttpCrawlingContext from an existing BasicCrawlingContext.


    Parameters (keyword-only): context, http_response

    Returns Self

from_http_crawling_context

  • from_http_crawling_context(*, context, parsed_content, enqueue_links): Self
  • Convenience constructor that creates a new context from an existing HttpCrawlingContext.


    Parameters (keyword-only): context, parsed_content, enqueue_links

    Returns Self
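The two convenience constructors above form a chain: a basic context is extended with an HTTP response, and the result is then extended with parsed content and a link-enqueuing helper. The following is a minimal sketch of that pattern using plain dataclasses. The classes `BasicContext`, `HttpContext`, and `ParsedContext` are hypothetical stand-ins with simplified field types, not the library's own classes, and the field-copying approach is an assumption about how such a constructor could be written.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Any, Callable


@dataclass(frozen=True)
class BasicContext:
    """Hypothetical stand-in for BasicCrawlingContext."""

    request: str


@dataclass(frozen=True)
class HttpContext(BasicContext):
    """Hypothetical stand-in for HttpCrawlingContext."""

    http_response: bytes

    @classmethod
    def from_basic_crawling_context(
        cls, *, context: BasicContext, http_response: bytes
    ) -> HttpContext:
        # Copy every field of the existing context, then add the new one.
        base = {f.name: getattr(context, f.name) for f in fields(context)}
        return cls(http_response=http_response, **base)


@dataclass(frozen=True)
class ParsedContext(HttpContext):
    """Hypothetical stand-in for ParsedHttpCrawlingContext."""

    parsed_content: Any
    enqueue_links: Callable[..., Any]

    @classmethod
    def from_http_crawling_context(
        cls,
        *,
        context: HttpContext,
        parsed_content: Any,
        enqueue_links: Callable[..., Any],
    ) -> ParsedContext:
        # Same idea: reuse all fields of the HTTP context and add two more.
        base = {f.name: getattr(context, f.name) for f in fields(context)}
        return cls(parsed_content=parsed_content, enqueue_links=enqueue_links, **base)


# Build the chain: basic -> HTTP -> parsed.
basic = BasicContext(request="https://example.com")
http = HttpContext.from_basic_crawling_context(
    context=basic, http_response=b"<h1>Hi</h1>"
)
parsed = ParsedContext.from_http_crawling_context(
    context=http,
    parsed_content=http.http_response.decode(),
    enqueue_links=lambda: None,  # placeholder helper for the sketch
)
```

Both constructors take keyword-only arguments, matching the `*` in the signatures above, so call sites always name what is being added to the existing context.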

Properties

add_requests

add_requests: AddRequestsFunction

Helper function for adding requests to be crawled.

enqueue_links

enqueue_links: EnqueueLinksFunction

get_key_value_store

Helper function for getting a key-value store.

http_response

http_response: HttpResponse

The HTTP response received from the server.

log

log: logging.Logger

Logger instance.

parsed_content

parsed_content: TParseResult

proxy_info

proxy_info: ProxyInfo | None

Proxy information for the current page being processed.

push_data

push_data: PushDataFunction

Helper function for pushing data to a dataset.

request

request: Request

Request object for the current page being processed.

send_request

send_request: SendRequestFunction

Helper function for sending HTTP requests.

session

session: Session | None

Session object for the current page being processed.

use_state

use_state: UseStateFunction

Helper function for using state.
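The properties listed here are usually consumed together inside a request handler. The sketch below shows one way such a handler might read `request`, `log`, `parsed_content`, `session`, and `proxy_info`, and store a result with `push_data`. To keep it runnable without the library, the context is a hypothetical stub built from `SimpleNamespace`; `handle_page`, `fake_push_data`, and `fake_context` are illustrative names, and the stub assumes that `push_data` is awaitable and that `request` exposes a `url` attribute.

```python
import asyncio
import logging
from types import SimpleNamespace


async def handle_page(context) -> None:
    """Handler sketch; `context` is expected to look like a ParsedHttpCrawlingContext."""
    context.log.info("Processing %s", context.request.url)

    # session and proxy_info are typed as optional, so check for None before use.
    if context.session is None:
        context.log.debug("No session attached to this request")
    if context.proxy_info is None:
        context.log.debug("Request was sent without a proxy")

    # Store the parsed result through the push_data helper.
    await context.push_data(
        {"url": context.request.url, "content": context.parsed_content}
    )


# Hypothetical stub standing in for a real context (assumption, for illustration only).
stored: list[dict] = []


async def fake_push_data(data: dict) -> None:
    stored.append(data)


fake_context = SimpleNamespace(
    request=SimpleNamespace(url="https://example.com"),
    parsed_content="<h1>Hi</h1>",
    log=logging.getLogger("sketch"),
    session=None,
    proxy_info=None,
    push_data=fake_push_data,
)

asyncio.run(handle_page(fake_context))
```

In a real crawler the context would be supplied by the crawler itself, and the type of `parsed_content` depends on the parser in use (the `TParseResult` type parameter).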