NoParser

Dummy parser kept for backwards compatibility.

It allows HttpCrawler to be used without requiring an additional, specific parser.
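
To make the idea of a "dummy" parser concrete, here is a minimal sketch. It assumes that "dummy" means every method does no real parsing: the body is handed back untouched, no selector ever matches, and no links are found. The class name `DummyParserSketch` and the `parse` signature are hypothetical stand-ins for illustration, not the library's actual NoParser.

```python
from typing import Iterable


class DummyParserSketch:
    """Hypothetical no-op parser, for illustration only."""

    def parse(self, body: bytes) -> bytes:
        # Assumed signature: hand the raw body back untouched, nothing is parsed.
        return body

    def is_matching_selector(self, *, parsed_content: bytes, selector: str) -> bool:
        # Without a parsed document there is nothing a selector could match.
        return False

    def find_links(self, *, parsed_content: bytes, selector: str) -> Iterable[str]:
        # No document structure means no links to extract.
        return []


parser = DummyParserSketch()
content = parser.parse(b"<a href='/next'>next</a>")
links = list(parser.find_links(parsed_content=content, selector="a"))
```

A crawler built around such a parser still receives the raw response, but link discovery and selector-based checks are effectively switched off.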

Methods

find_links

  • find_links(*, parsed_content, selector): Iterable[str]
  • Find all links in the parsed result that match the selector.


    Parameters

    • parsed_content: TParseResult (optional, keyword-only)

      Parsed HTTP response, i.e. the result of the parse method.

    • selector: str (optional, keyword-only)

      String that defines the pattern used to match links.

    Returns Iterable[str]

is_blocked

  • is_blocked(*, parsed_content): BlockedInfo
  • Detect whether the response was blocked and return a BlockedInfo with additional information.

    The default implementation relies on the abstract method is_matching_selector being implemented. Override this method if your parser detects blocking in a different way.


    Parameters

    • parsed_content: TParseResult (optional, keyword-only)

      Parsed HTTP response, i.e. the result of the parse method.

    Returns BlockedInfo
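
The relationship between is_blocked and is_matching_selector described above can be sketched as follows. This is only an illustration under stated assumptions: `BlockedInfoSketch`, `BLOCKED_MARKERS`, and `SubstringParserSketch` are hypothetical names, the "parsed content" is plain text, and a "selector" is treated as a simple substring. The library's real BlockedInfo and its list of blocking selectors may look different.

```python
from dataclasses import dataclass


@dataclass
class BlockedInfoSketch:
    """Hypothetical stand-in for BlockedInfo; the real class may differ."""

    reason: str

    def __bool__(self) -> bool:
        # Truthy only when a blocking reason was recorded.
        return bool(self.reason)


# Hypothetical markers that would signal a blocked page.
BLOCKED_MARKERS = ["captcha", "access denied"]


class SubstringParserSketch:
    """Toy parser: content is plain text, a selector is a substring."""

    def is_matching_selector(self, *, parsed_content: str, selector: str) -> bool:
        # The parser-specific part that a subclass would have to provide.
        return selector in parsed_content

    def is_blocked(self, *, parsed_content: str) -> BlockedInfoSketch:
        # Generic part: reuse is_matching_selector for every known marker.
        matched = [
            marker
            for marker in BLOCKED_MARKERS
            if self.is_matching_selector(parsed_content=parsed_content, selector=marker)
        ]
        reason = f"Matched blocking markers: {', '.join(matched)}" if matched else ""
        return BlockedInfoSketch(reason=reason)


parser = SubstringParserSketch()
blocked = parser.is_blocked(parsed_content="please solve this captcha")
not_blocked = parser.is_blocked(parsed_content="welcome to the shop")
```

With this split, a parser only has to define how a selector is matched. Blocking detection then works unchanged, unless the parser overrides is_blocked with its own logic.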

is_matching_selector

  • is_matching_selector(*, parsed_content, selector): bool
  • Check whether the selector has a match in the parsed content.


    Parameters

    • parsed_content: TParseResult (optional, keyword-only)

      Parsed HTTP response, i.e. the result of the parse method.

    • selector: str (optional, keyword-only)

      String that defines the matching pattern.

    Returns bool

parse