BeautifulSoupCrawlingContext
Hierarchy
- ParsedHttpCrawlingContext
- BeautifulSoupCrawlingContext
Index
Methods
from_basic_crawling_context
Convenience constructor that creates HttpCrawlingContext from existing BasicCrawlingContext.
Parameters
optional keyword-only context: BasicCrawlingContext
optional keyword-only http_response: HttpResponse
Returns Self
from_http_crawling_context
Convenience constructor that creates new context from existing HttpCrawlingContext.
Parameters
optional keyword-only context: HttpCrawlingContext
optional keyword-only parsed_content: TParseResult
optional keyword-only enqueue_links: EnqueueLinksFunction
Returns Self
from_parsed_http_crawling_context
Convenience constructor that creates new context from existing ParsedHttpCrawlingContext[BeautifulSoup].
Parameters
optional keyword-only context: ParsedHttpCrawlingContext[BeautifulSoup]
Returns Self
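The three convenience constructors above share one idea: take an existing, less specific context, keep all of its fields, and add whatever the more specific context needs. The following is a minimal sketch of that pattern using only the standard library. ParentContext, ChildContext, request_url, and from_parent_context are hypothetical stand-ins invented for illustration; they are not the library's classes, and the library's own implementation may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Any


@dataclass(frozen=True)
class ParentContext:
    """Hypothetical stand-in for a less specific context."""

    request_url: str
    http_response: str


@dataclass(frozen=True)
class ChildContext(ParentContext):
    """Hypothetical stand-in for a more specific context with one extra field."""

    parsed_content: Any

    @classmethod
    def from_parent_context(
        cls, context: ParentContext, *, parsed_content: Any
    ) -> ChildContext:
        # Copy every field of the existing context, then add the new one.
        inherited = {f.name: getattr(context, f.name) for f in fields(context)}
        return cls(parsed_content=parsed_content, **inherited)


parent = ParentContext(request_url="https://example.com", http_response="<html></html>")
child = ChildContext.from_parent_context(parent, parsed_content={"title": None})
```

Copying fields by name means the constructor keeps working when the parent context gains a new field, which is one plausible reason to prefer it over listing every argument by hand.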
html_to_text
Convert the parsed HTML content to newline-separated plain text without tags.
Returns str
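To make "newline-separated plain text without tags" concrete, here is a rough standard-library sketch of that kind of conversion. It is not the library's implementation: html_to_text_sketch and _TextCollector are hypothetical names, the sketch starts a new line at every tag boundary, and skipping script and style content is an assumption of this sketch rather than documented behavior.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects visible text chunks and ignores script/style content."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text that is outside skipped elements.
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text_sketch(html: str) -> str:
    """Return the text chunks of `html`, one per line, with all tags removed."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


sample = "<h1>Title</h1><p>First line.</p><script>var x=1;</script><p>Second line.</p>"
text = html_to_text_sketch(sample)
```

Because this sketch breaks lines at every tag, inline markup such as bold text would be split across lines; a real converter would normally treat inline and block elements differently.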
Properties
add_requests
enqueue_links
get_key_value_store
http_response
The HTTP response received from the server.
log
parsed_content
proxy_info
push_data
request
send_request
session
soup
Convenience alias for parsed_content.
The crawling context used by the BeautifulSoupCrawler. It provides access to key objects as well as utility functions for handling crawling tasks.
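As an illustration of how the properties listed above might be combined, the sketch below shows a request handler that reads soup and request, writes through push_data, and then calls enqueue_links. The handler name handle_page and the stub context built with SimpleNamespace are hypothetical; the stub exists only so the example runs without the library or network access, and it assumes push_data accepts a dictionary and enqueue_links can be called without arguments.

```python
import asyncio
import logging
from types import SimpleNamespace


async def handle_page(context) -> None:
    """Handler sketch; `context` is expected to behave like BeautifulSoupCrawlingContext."""
    title_tag = context.soup.title
    title = title_tag.string if title_tag is not None else None
    context.log.info("Processing %s", context.request.url)
    # Store one record per page, then queue links found on the page.
    await context.push_data({"url": context.request.url, "title": title})
    await context.enqueue_links()


# Stub context for demonstration only; the real object is supplied by the crawler.
stored: list[dict] = []


async def _fake_push_data(item: dict) -> None:
    stored.append(item)


async def _fake_enqueue_links() -> None:
    return None


fake_context = SimpleNamespace(
    soup=SimpleNamespace(title=SimpleNamespace(string="Example Domain")),
    request=SimpleNamespace(url="https://example.com"),
    log=logging.getLogger("sketch"),
    push_data=_fake_push_data,
    enqueue_links=_fake_enqueue_links,
)

asyncio.run(handle_page(fake_context))
```

In real use the crawler would construct the context and pass it to the handler; how the handler is registered with BeautifulSoupCrawler is not covered on this page.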