Version: 3.4

RouterHandler <Context>

A simple router that dispatches requests based on their labels. A router instance can be passed as the requestHandler of your crawler.

import { Router, CheerioCrawler, CheerioCrawlingContext } from 'crawlee';

const router = Router.create<CheerioCrawlingContext>();

// we can also use factory methods for specific crawling contexts; the above is equivalent to:
// import { createCheerioRouter } from 'crawlee';
// const router = createCheerioRouter();

router.addHandler('label-a', async (ctx) => {
    ctx.log.info('...');
});
router.addDefaultHandler(async (ctx) => {
    ctx.log.info('...');
});

const crawler = new CheerioCrawler({
    requestHandler: router,
});
await crawler.run();

Alternatively, we can use the default router instance from the crawler object:

import { CheerioCrawler } from 'crawlee';

const crawler = new CheerioCrawler();

crawler.router.addHandler('label-a', async (ctx) => {
    ctx.log.info('...');
});
crawler.router.addDefaultHandler(async (ctx) => {
    ctx.log.info('...');
});

await crawler.run();

For convenience, we can also define the routes right when creating the router:

import { CheerioCrawler, createCheerioRouter } from 'crawlee';

const crawler = new CheerioCrawler({
    requestHandler: createCheerioRouter({
        'label-a': async (ctx) => { ... },
        'label-b': async (ctx) => { ... },
    }),
});
await crawler.run();

Middlewares are also supported via the router.use method. A single router can have multiple middlewares; they are executed sequentially, in the order in which they were registered.

crawler.router.use(async (ctx) => {
    ctx.log.info('...');
});
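The ordering described above can be modelled without any crawler at all. The following is a minimal sketch, not Crawlee's implementation: `MiniContext`, `runRoute`, and `demo` are hypothetical names introduced only for illustration, and the fallback to a default handler for unmatched labels is an assumption of the sketch.

```typescript
// Illustrative model only: none of these names are Crawlee APIs.
type Awaitable<T> = T | PromiseLike<T>;

interface MiniContext {
    label?: string;
    trace: string[];
}

type Handler = (ctx: MiniContext) => Awaitable<void>;

// Runs every middleware in registration order, then the handler for the label.
async function runRoute(
    middlewares: Handler[],
    routes: Map<string, Handler>,
    defaultHandler: Handler,
    ctx: MiniContext,
): Promise<void> {
    for (const middleware of middlewares) {
        // Sequential: the next middleware starts only after this one settles.
        await middleware(ctx);
    }
    // Assumption of this sketch: unmatched or missing labels use the default handler.
    const matched = ctx.label === undefined ? undefined : routes.get(ctx.label);
    await (matched ?? defaultHandler)(ctx);
}

const middlewares: Handler[] = [
    async (ctx) => { ctx.trace.push('middleware-1'); },
    async (ctx) => { ctx.trace.push('middleware-2'); },
];
const routes = new Map<string, Handler>([
    ['label-a', async (ctx) => { ctx.trace.push('handler label-a'); }],
]);
const defaultHandler: Handler = async (ctx) => { ctx.trace.push('default handler'); };

async function demo(): Promise<string[]> {
    const ctx: MiniContext = { label: 'label-a', trace: [] };
    await runRoute(middlewares, routes, defaultHandler, ctx);
    return ctx.trace;
}
```

Calling `demo()` resolves to a trace in which both middlewares appear, in registration order, before the `label-a` handler.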

Hierarchy

  • Router<Context>
    • RouterHandler

Callable

  • RouterHandler(ctx: Context): Awaitable<void>

  • Parameters

    • ctx: Context

    Returns Awaitable<void>
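
Because the type has a call signature, a router value is itself a function of `(ctx: Context) => Awaitable<void>`, which is why it can be passed as a requestHandler. One way to picture such a callable object in TypeScript is sketched below; `CallableRouter` and `createCallableRouter` are hypothetical names, and this is not necessarily how Crawlee constructs its router.

```typescript
// Hypothetical sketch of a "function with methods"; not Crawlee's implementation.
type Awaitable<T> = T | PromiseLike<T>;

interface CallableRouter<Context> {
    (ctx: Context): Awaitable<void>; // call signature
    addDefaultHandler(handler: (ctx: Context) => Awaitable<void>): void;
}

function createCallableRouter<Context>(): CallableRouter<Context> {
    let defaultHandler: ((ctx: Context) => Awaitable<void>) | undefined;

    // The function part: forwards the context to the registered handler, if any.
    const call = (ctx: Context): Awaitable<void> => defaultHandler?.(ctx);

    // Attach the method to the function object itself.
    return Object.assign(call, {
        addDefaultHandler(handler: (ctx: Context) => Awaitable<void>): void {
            defaultHandler = handler;
        },
    });
}

const seen: string[] = [];
const callableRouter = createCallableRouter<{ url: string }>();
callableRouter.addDefaultHandler((ctx) => { seen.push(ctx.url); });

// The router fits anywhere a plain handler function is expected.
const requestHandler: (ctx: { url: string }) => Awaitable<void> = callableRouter;
requestHandler({ url: 'https://example.com' });
```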

Methods

addDefaultHandler

  • addDefaultHandler<UserData>(handler: (ctx: Omit<Context, 'request'> & { request: Request<UserData> }) => Awaitable<void>): void
  • Registers the default route handler.

    Type parameters

    • UserData: Dictionary = GetUserDataFromRequest<Context['request']>

    Parameters

    • handler: (ctx: Omit<Context, 'request'> & { request: Request<UserData> }) => Awaitable<void>

    Returns void

addHandler

  • addHandler<UserData>(label: string | symbol, handler: (ctx: Omit<Context, 'request'> & { request: Request<UserData> }) => Awaitable<void>): void
  • Registers a new route handler for the given label.

    Type parameters

    • UserData: Dictionary = GetUserDataFromRequest<Context['request']>

    Parameters

    • label: string | symbol
    • handler: (ctx: Omit<Context, 'request'> & { request: Request<UserData> }) => Awaitable<void>

    Returns void
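
The handler parameter type replaces the context's generic `request` with one whose `userData` has the shape given by `UserData`. The sketch below reproduces that `Omit<...> & { request: ... }` pattern with local stand-in types so it runs on its own; `MiniRequest`, `MiniContext`, `HandlerContext`, and `ProductData` are hypothetical, and Crawlee's real `Request` and context types have many more members.

```typescript
// Local stand-ins for illustration only; these are not Crawlee's types.
type Awaitable<T> = T | PromiseLike<T>;
type Dictionary = Record<string, unknown>;

interface MiniRequest<UserData extends Dictionary = Dictionary> {
    url: string;
    userData: UserData;
}

interface MiniContext {
    request: MiniRequest;
    log: { info(message: string): void };
}

// Same pattern as the handler parameter in the signature above:
// drop the generic `request`, then add it back with a specific UserData.
type HandlerContext<Context extends { request: unknown }, UserData extends Dictionary> =
    Omit<Context, 'request'> & { request: MiniRequest<UserData> };

type ProductData = { sku: string; page: number };

const messages: string[] = [];

const handler = (ctx: HandlerContext<MiniContext, ProductData>): Awaitable<void> => {
    // `sku` is a string and `page` is a number here, so no casts are needed.
    const { sku, page } = ctx.request.userData;
    ctx.log.info(`${sku.toUpperCase()} page ${page + 1}`);
};

handler({
    request: { url: 'https://example.com/p/1', userData: { sku: 'abc-1', page: 0 } },
    log: { info: (message) => { messages.push(message); } },
});
```

With a real router, the same effect would come from supplying the type argument explicitly, as in `router.addHandler<ProductData>('label-a', handler)`, following the signature shown above.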

getHandler

  • getHandler(label?: string | symbol): (ctx: Context) => Awaitable<void>
  • Returns the route handler for the given label. If no label is provided, the default request handler is returned.

    Parameters

    • optional label: string | symbol

    Returns (ctx: Context) => Awaitable<void>

      • (ctx: Context): Awaitable<void>

        Parameters

        • ctx: Context

        Returns Awaitable<void>
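
The lookup described for getHandler can be pictured as a map keyed by string or symbol labels plus a separately stored default handler. The following is a minimal sketch under stated assumptions: `MiniRouter` is a hypothetical class, and both the fallback to the default handler for an unknown label and the error thrown when nothing matches are assumptions of the sketch rather than documented behaviour.

```typescript
// Hypothetical model of label lookup; names and error text are illustrative.
type Awaitable<T> = T | PromiseLike<T>;
type Handler<Context> = (ctx: Context) => Awaitable<void>;

class MiniRouter<Context> {
    private readonly routes = new Map<string | symbol, Handler<Context>>();
    private defaultHandler?: Handler<Context>;

    addHandler(label: string | symbol, handler: Handler<Context>): void {
        this.routes.set(label, handler);
    }

    addDefaultHandler(handler: Handler<Context>): void {
        this.defaultHandler = handler;
    }

    getHandler(label?: string | symbol): Handler<Context> {
        const matched = label === undefined ? undefined : this.routes.get(label);
        // Assumption: labels without a registered handler use the default handler.
        const handler = matched ?? this.defaultHandler;
        if (!handler) {
            // Assumption: a lookup that finds nothing is treated as an error.
            throw new Error(`No handler for label ${String(label)} and no default handler`);
        }
        return handler;
    }
}

type Ctx = { url: string };

const DETAIL = Symbol('detail'); // symbols work as labels as well as strings
const onDetail: Handler<Ctx> = () => {};
const onDefault: Handler<Ctx> = () => {};

const miniRouter = new MiniRouter<Ctx>();
miniRouter.addHandler(DETAIL, onDetail);
miniRouter.addDefaultHandler(onDefault);

const byLabel = miniRouter.getHandler(DETAIL);         // the handler registered for DETAIL
const noLabel = miniRouter.getHandler();               // the default handler
const unknownLabel = miniRouter.getHandler('missing'); // the default handler in this sketch
```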

use

  • use(middleware: (ctx: Context) => Awaitable<void>): void
  • Registers a middleware that will be fired before the matching route handler. Multiple middlewares can be registered; they will be fired in the order in which they were registered.

    Parameters

    • middleware: (ctx: Context) => Awaitable<void>

    Returns void