Router

A request dispatching system that routes requests to registered handlers based on their labels.

The Router allows you to define and register request handlers for specific labels. When a request is received, the router invokes the corresponding request_handler based on the request's label. If no matching handler is found, the default handler is used.

Usage

from crawlee.crawlers import HttpCrawler, HttpCrawlingContext
from crawlee.router import Router

router = Router[HttpCrawlingContext]()


# Handler for requests without a matching label handler
@router.default_handler
async def default_handler(context: HttpCrawlingContext) -> None:
    context.log.info(f'Request without label {context.request.url} ...')


# Handler for category requests
@router.handler(label='category')
async def category_handler(context: HttpCrawlingContext) -> None:
    context.log.info(f'Category request {context.request.url} ...')


# Handler for product requests
@router.handler(label='product')
async def product_handler(context: HttpCrawlingContext) -> None:
    context.log.info(f'Product {context.request.url} ...')


async def main() -> None:
    crawler = HttpCrawler(request_handler=router)
    await crawler.run()

Methods

__call__

  • async __call__(context): None
  • Invoke a request handler that matches the request label (or the default).


    Parameters

    • context: TCrawlingContext

    Returns None
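
The dispatch described above can be illustrated with a small, self-contained sketch. This is not Crawlee's implementation: `MiniRouter`, `FakeRequest`, and `FakeContext` are hypothetical stand-ins invented for this example, and raising `RuntimeError` when no default handler is registered is an assumption of the sketch. It only shows the general idea of looking up a handler by the request's label and falling back to the default.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, Optional


@dataclass
class FakeRequest:
    # Hypothetical stand-in for a crawler request.
    url: str
    label: Optional[str] = None


@dataclass
class FakeContext:
    # Hypothetical stand-in for a crawling context.
    request: FakeRequest


Handler = Callable[[FakeContext], Awaitable[None]]


class MiniRouter:
    """Toy router: maps labels to handlers, with a default fallback."""

    def __init__(self) -> None:
        self._default: Optional[Handler] = None
        self._by_label: Dict[str, Handler] = {}

    def default_handler(self, fn: Handler) -> Handler:
        self._default = fn
        return fn

    def handler(self, label: str) -> Callable[[Handler], Handler]:
        def register(fn: Handler) -> Handler:
            self._by_label[label] = fn
            return fn

        return register

    async def __call__(self, context: FakeContext) -> None:
        # Look up the handler registered for the request's label, if any.
        label = context.request.label
        fn = self._by_label.get(label) if label is not None else None
        # Fall back to the default handler when nothing matches.
        if fn is None:
            fn = self._default
        if fn is None:
            # Assumption of this sketch: fail loudly rather than skip silently.
            raise RuntimeError('No matching handler and no default handler')
        await fn(context)


visited = []
router = MiniRouter()


@router.default_handler
async def on_default(context: FakeContext) -> None:
    visited.append(f'default:{context.request.url}')


@router.handler(label='category')
async def on_category(context: FakeContext) -> None:
    visited.append(f'category:{context.request.url}')


async def demo() -> None:
    await router(FakeContext(FakeRequest('https://example.com/')))
    await router(FakeContext(FakeRequest('https://example.com/c/shoes', 'category')))
    await router(FakeContext(FakeRequest('https://example.com/p/42', 'unknown')))


asyncio.run(demo())
```

In this sketch, the unlabeled request and the request labeled 'unknown' both reach `on_default`, while only the request labeled 'category' reaches `on_category`.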

__init__

  • __init__(): None
  • Returns None

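default_handler

  • Register the default request handler. It is invoked for requests whose label has no matching registered handler.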

handler

  • handler(label): Callable[[RequestHandler[TCrawlingContext]], Callable[[TCrawlingContext], Awaitable]]
  • Register a request handler based on a label.

    This decorator registers a request handler for a specific label. The handler will be invoked only for requests that have the exact same label.


    Parameters

    • label: str

    Returns Callable[[RequestHandler[TCrawlingContext]], Callable[[TCrawlingContext], Awaitable]]
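
The return type above describes a decorator factory: calling `handler(label)` yields a decorator, which in turn receives the handler function. The sketch below shows that shape in plain Python, under stated assumptions. The module-level `registry`, the `resolve` helper, and the string-based handler signature are all hypothetical simplifications, and the exact-match rule is modeled as an ordinary dictionary lookup, which makes it case-sensitive here.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

# Simplified handler type for this sketch: takes a URL, returns a string.
Handler = Callable[[str], Awaitable[str]]

# Hypothetical registry mapping labels to handlers.
registry: Dict[str, Handler] = {}


def handler(label: str) -> Callable[[Handler], Handler]:
    """Return a decorator that stores the decorated coroutine under `label`."""

    def decorator(fn: Handler) -> Handler:
        registry[label] = fn
        # Returned unchanged, so the decorated name stays directly callable.
        return fn

    return decorator


def resolve(label: Optional[str]) -> Optional[Handler]:
    """Exact lookup: no prefix, case-insensitive, or fuzzy matching."""
    if label is None:
        return None
    return registry.get(label)


@handler(label='product')
async def product_handler(url: str) -> str:
    return f'product page: {url}'


# Only the exact label resolves to the registered handler.
exact = resolve('product')
wrong_case = resolve('Product')
plural = resolve('products')

result = asyncio.run(product_handler('https://example.com/p/1'))
```

Here `exact` is `product_handler`, while `wrong_case` and `plural` are `None`, so a router built this way would hand those requests to its default handler.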