BeautifulSoupCrawlingContext

The crawling context used by the BeautifulSoupCrawler.

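It provides access to the current request, the HTTP response, and the parsed BeautifulSoup document, as well as helper functions for common crawling tasks such as enqueuing links and pushing data.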

Methods

from_basic_crawling_context

  • from_basic_crawling_context(*, context, http_response): Self
  • Convenience constructor that creates an HttpCrawlingContext from an existing BasicCrawlingContext.


    Parameters

    Returns Self

from_http_crawling_context

  • from_http_crawling_context(*, context, parsed_content, enqueue_links): Self
  • Convenience constructor that creates a new context from an existing HttpCrawlingContext.


    Parameters

    Returns Self

from_parsed_http_crawling_context

  • from_parsed_http_crawling_context(*, context): Self
  • Convenience constructor that creates a new context from an existing ParsedHttpCrawlingContext[BeautifulSoup].


    Parameters

    Returns Self
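
The three convenience constructors above share one idea: take a narrower context, copy its fields, and add whatever the richer context needs. The following is a minimal sketch of that pattern using plain dataclasses. `BasicContext`, `HttpContext`, and `from_basic_context` are hypothetical stand-ins invented for illustration, not the library's classes, and the library's actual implementation may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BasicContext:
    """Hypothetical stand-in for a basic crawling context."""

    url: str
    label: str | None = None


@dataclass(frozen=True)
class HttpContext(BasicContext):
    """Hypothetical stand-in for a context that also carries a response."""

    http_response: bytes = b""

    @classmethod
    def from_basic_context(
        cls, *, context: BasicContext, http_response: bytes
    ) -> HttpContext:
        # Copy every field of the narrower context, then add the new one.
        inherited = {f.name: getattr(context, f.name) for f in fields(context)}
        return cls(http_response=http_response, **inherited)


basic = BasicContext(url="https://example.com", label="DETAIL")
http = HttpContext.from_basic_context(context=basic, http_response=b"<html></html>")
print(http)
```

Keyword-only arguments (the `*` in the signatures above) keep call sites explicit about which object is the source context and which values are newly supplied.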

html_to_text

  • html_to_text(): str
  • Convert the parsed HTML content to newline-separated plain text without tags.


    Returns str
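
To illustrate what "newline-separated plain text without tags" can look like, here is a rough standard-library approximation. It is not the library's implementation: `to_plain_text` and `_TextCollector` are hypothetical names, the sketch uses `html.parser` instead of BeautifulSoup, and the real method may treat whitespace, block elements, and skipped tags differently.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes, skipping the bodies of <script> and <style>."""

    _SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def to_plain_text(html):
    """Return the text fragments of `html`, one per line, without tags."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return "\n".join(collector.chunks)


sample = "<h1>Hello</h1><p>World</p><script>x = 1</script>"
print(to_plain_text(sample))
```

Inside a request handler, the documented call is simply `context.html_to_text()`, which operates on the already parsed page.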

Properties

add_requests

add_requests: AddRequestsFunction

Helper function for adding requests to be crawled.

enqueue_links

enqueue_links: EnqueueLinksFunction

get_key_value_store

Helper function for getting a key-value store.

http_response

http_response: HttpResponse

The HTTP response received from the server.

log

log: logging.Logger

Logger instance.

parsed_content

parsed_content: TParseResult

proxy_info

proxy_info: ProxyInfo | None

Proxy information for the current page being processed.

push_data

push_data: PushDataFunction

Helper function for pushing extracted data to storage.

request

request: Request

Request object for the current page being processed.

send_request

send_request: SendRequestFunction

Helper function for sending an HTTP request.

session

session: Session | None

Session object for the current page being processed.

soup

soup: BeautifulSoup

Convenience alias for parsed_content.

use_state

use_state: UseStateFunction

Helper function for using shared state.
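
The properties above are typically used together inside a request handler. The sketch below shows one plausible handler that touches only attributes documented on this page (`request`, `log`, `soup`, `proxy_info`, `push_data`, `enqueue_links`). It assumes that `request` exposes a `url` attribute and that `push_data` and `enqueue_links` are awaitable. To keep the example runnable without the library or network access, it is exercised with a fake context built from `SimpleNamespace`; `handle_page` and all the `fake_*` names are hypothetical.

```python
import asyncio
import logging
from types import SimpleNamespace


async def handle_page(context):
    """Request-handler sketch relying only on attributes documented above."""
    context.log.info("Processing %s", context.request.url)

    # `soup` is the parsed page; guard against pages without a <title>.
    title_tag = context.soup.title
    title = title_tag.string if title_tag is not None else None

    # `proxy_info` is typed `ProxyInfo | None`, so check before using it.
    via_proxy = context.proxy_info is not None

    await context.push_data(
        {"url": context.request.url, "title": title, "via_proxy": via_proxy}
    )
    await context.enqueue_links()


# Fake context for demonstration only; a real one is supplied by the crawler.
collected = []


async def fake_push_data(item):
    collected.append(item)


async def fake_enqueue_links(**kwargs):
    collected.append("enqueue_links called")


fake_context = SimpleNamespace(
    request=SimpleNamespace(url="https://example.com"),
    log=logging.getLogger("sketch"),
    soup=SimpleNamespace(title=SimpleNamespace(string="Example Domain")),
    proxy_info=None,
    push_data=fake_push_data,
    enqueue_links=fake_enqueue_links,
)

asyncio.run(handle_page(fake_context))
print(collected)
```

In real use, the handler would be registered with a BeautifulSoupCrawler, which constructs the context and passes it in for each page.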