Version: 3.8

RestrictedCrawlingContext <UserData>

Properties

addRequests

addRequests: (requestsLike: readonly (string | ReadonlyObjectDeep<Partial<RequestOptions<Dictionary>> & { regex?: RegExp; requestsFromUrl?: string }> | ReadonlyObjectDeep<Request<Dictionary>>)[], options?: ReadonlyObjectDeep<RequestQueueOperationOptions>) => Promise<void>

Type declaration

    • (requestsLike: readonly (string | ReadonlyObjectDeep<Partial<RequestOptions<Dictionary>> & { regex?: RegExp; requestsFromUrl?: string }> | ReadonlyObjectDeep<Request<Dictionary>>)[], options?: ReadonlyObjectDeep<RequestQueueOperationOptions>): Promise<void>
    • Add requests directly to the request queue.


      Parameters

      • requestsLike: readonly (string | ReadonlyObjectDeep<Partial<RequestOptions<Dictionary>> & { regex?: RegExp; requestsFromUrl?: string }> | ReadonlyObjectDeep<Request<Dictionary>>)[]
      • optional options: ReadonlyObjectDeep<RequestQueueOperationOptions>

        Options for the request queue.

      Returns Promise<void>
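A minimal sketch of calling addRequests with a mix of plain URL strings and request option objects, as the signature above allows. To stay self-contained it declares a simplified local stand-in (`RequestLike`, `AddRequestsContext`) rather than importing the library types; the helper names, the `DETAIL` label, and the example.com URLs are hypothetical.

```typescript
// Simplified local stand-in for the addRequests part of the context
// (an assumption, so the sketch compiles without the crawlee package).
type RequestLike = string | { url: string; label?: string; userData?: Record<string, unknown> };

interface AddRequestsContext {
    addRequests: (requestsLike: readonly RequestLike[]) => Promise<void>;
}

// Hypothetical helper: turn item IDs into request options carrying a label.
function buildDetailRequests(ids: readonly string[]): RequestLike[] {
    return ids.map((id) => ({
        url: `https://www.example.com/items/${encodeURIComponent(id)}`,
        label: 'DETAIL',
        userData: { id },
    }));
}

// Hypothetical handler body: URL strings and option objects share one array.
async function enqueueDetailPages(ctx: AddRequestsContext, ids: readonly string[]): Promise<void> {
    await ctx.addRequests(['https://www.example.com/items', ...buildDetailRequests(ids)]);
}
```

In a real crawler the handler would normally destructure addRequests from its context argument instead of receiving a hand-written `ctx`.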

enqueueLinks

enqueueLinks: (options?: ReadonlyObjectDeep<Omit<EnqueueLinksOptions, "requestQueue">>) => Promise<unknown>

Type declaration

    • (options?: ReadonlyObjectDeep<Omit<EnqueueLinksOptions, "requestQueue">>): Promise<unknown>
    • This function automatically finds and enqueues links from the current page, adding them to the RequestQueue currently used by the crawler.

      Optionally, the function allows you to filter the target links' URLs using an array of globs or regular expressions and override settings of the enqueued Request objects.

      Check out the "Crawl a website with relative links" example for more details regarding its usage.

      Example usage

      async requestHandler({ enqueueLinks }) {
          // Enqueue only the links on the current page that match the glob.
          await enqueueLinks({
              globs: [
                  'https://www.example.com/handbags/*',
              ],
          });
      },

      Parameters

      • optional options: ReadonlyObjectDeep<Omit<EnqueueLinksOptions, "requestQueue">>

        All enqueueLinks() parameters are passed via an options object.

      Returns Promise<unknown>
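The description above mentions filtering by regular expressions and overriding settings of the enqueued requests. The following sketch shows one way that might look, using the `regexps` and `label` options. The local `EnqueueLinksSketchOptions` type is a reduced stand-in for EnqueueLinksOptions, and the pattern, the `PRODUCT` label, and the function name are hypothetical.

```typescript
// Reduced local stand-in for the options used here (an assumption;
// the real EnqueueLinksOptions type has more fields).
interface EnqueueLinksSketchOptions {
    globs?: readonly string[];
    regexps?: readonly RegExp[];
    label?: string;
}

interface EnqueueLinksContext {
    enqueueLinks: (options?: EnqueueLinksSketchOptions) => Promise<unknown>;
}

// Hypothetical filter: handbag pages whose last path segment is numeric.
const PRODUCT_URL = /^https:\/\/www\.example\.com\/handbags\/\d+$/;

// Hypothetical handler body: enqueue matching links and tag them with a label.
async function enqueueProductLinks(ctx: EnqueueLinksContext): Promise<unknown> {
    return ctx.enqueueLinks({
        regexps: [PRODUCT_URL],
        label: 'PRODUCT',
    });
}
```

A label set this way can later be used to route the enqueued requests to a dedicated handler.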

getKeyValueStore

getKeyValueStore: (idOrName?: string) => Promise<Pick<KeyValueStore, "id" | "name" | "getValue" | "getAutoSavedValue" | "setValue">>

Type declaration

    • (idOrName?: string): Promise<Pick<KeyValueStore, "id" | "name" | "getValue" | "getAutoSavedValue" | "setValue">>
    • Gets a key-value store with the given name or ID, or the crawler's default store if neither is given.


      Parameters

      • optional idOrName: string

      Returns Promise<Pick<KeyValueStore, "id" | "name" | "getValue" | "getAutoSavedValue" | "setValue">>
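As an illustration of the returned store subset, the sketch below reads a counter with getValue, increments it, and writes it back with setValue. `StoreSubset` is a local stand-in limited to the two methods used; the `pages-handled` key and the `bumpCounter` helper are hypothetical.

```typescript
// Local stand-in for the picked KeyValueStore members, reduced to the two
// methods this sketch calls (an assumption for self-containment).
interface StoreSubset {
    getValue: <T>(key: string) => Promise<T | null>;
    setValue: <T>(key: string, value: T | null) => Promise<void>;
}

interface KeyValueStoreContext {
    getKeyValueStore: (idOrName?: string) => Promise<StoreSubset>;
}

// Hypothetical helper: increment a counter kept under a fixed key.
// Omitting storeName asks for the crawler's default store.
async function bumpCounter(ctx: KeyValueStoreContext, storeName?: string): Promise<number> {
    const store = await ctx.getKeyValueStore(storeName);
    // A missing key is treated as zero.
    const previous = (await store.getValue<number>('pages-handled')) ?? 0;
    const next = previous + 1;
    await store.setValue('pages-handled', next);
    return next;
}
```

The read-then-write sequence is not atomic, so with concurrent handlers a counter like this is only approximate.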

log

log: Log

A preconfigured logger for the request handler.

request

request: Request<UserData>

The original Request object.
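A small sketch of using log and request together, assuming the logger exposes an `info(message, data)` method and the request carries `url`, `label`, and `retryCount`. The `LoggingContext` interface is a local stand-in and `logRequestStart` is a hypothetical helper.

```typescript
// Local stand-in for the log and request members used below
// (an assumption; the real Log and Request types are richer).
interface LoggingContext {
    log: { info: (message: string, data?: Record<string, unknown>) => void };
    request: { url: string; retryCount: number; label?: string };
}

// Hypothetical helper: emit one structured line per handled request.
function logRequestStart(ctx: LoggingContext): void {
    const { log, request } = ctx;
    log.info(`Handling ${request.url}`, {
        label: request.label ?? 'none',
        retryCount: request.retryCount,
    });
}
```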

useState

useState: <State>(defaultValue?: State) => Promise<State>

Type declaration

    • <State>(defaultValue?: State): Promise<State>
    • Returns the state: a piece of mutable, persistent data shared across all request handler runs.


      Type parameters

      • State: Dictionary = Dictionary

      Parameters

      • optional defaultValue: State

      Returns Promise<State>
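The sketch below shows one way to use useState as a shared counter: request the state with a default value, then mutate the returned object. It relies on the description above that the state is mutable and shared across handler runs. `UseStateContext` is a local stand-in; `CrawlStats` and `countPage` are hypothetical names.

```typescript
// Local stand-in for the useState member (an assumption for self-containment).
interface UseStateContext {
    useState: <State extends Record<string, unknown>>(defaultValue?: State) => Promise<State>;
}

// Hypothetical state shape for this sketch.
type CrawlStats = { pagesSeen: number };

// Hypothetical handler body: count how many pages have been handled so far.
async function countPage(ctx: UseStateContext): Promise<number> {
    // The default value is only expected to matter when no state exists yet.
    const state = await ctx.useState<CrawlStats>({ pagesSeen: 0 });
    // Mutate in place; later calls are expected to receive the same data.
    state.pagesSeen += 1;
    return state.pagesSeen;
}
```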

Methods

pushData

  • pushData(data?: ReadonlyDeep<Dictionary | Dictionary[]>, datasetIdOrName?: string): Promise<void>
  • This function allows you to push data to a Dataset specified by ID or name, or to the one currently used by the crawler.

    Shortcut for crawler.pushData().


    Parameters

    • optional data: ReadonlyDeep<Dictionary | Dictionary[]>

      Data to be pushed to the dataset.

    • optional datasetIdOrName: string

    Returns Promise<void>
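A minimal sketch of both call shapes: pushing a single item to the dataset currently used by the crawler, and pushing an array to a dataset selected by name. `PushDataContext` is a local stand-in for the method above; the item shape, the `products-archive` name, and the helper names are hypothetical.

```typescript
type Item = Record<string, unknown>;

// Local stand-in for the pushData method (an assumption for self-containment).
interface PushDataContext {
    pushData: (data?: Item | Item[], datasetIdOrName?: string) => Promise<void>;
}

// Hypothetical helper: normalise scraped values into one dataset item.
function toItem(url: string, rawTitle: string): Item {
    return { url, title: rawTitle.trim() };
}

// Hypothetical handler body showing both call shapes.
async function saveProduct(ctx: PushDataContext, url: string, rawTitle: string): Promise<void> {
    const item = toItem(url, rawTitle);
    // No second argument: the dataset currently used by the crawler.
    await ctx.pushData(item);
    // Array plus a dataset name: a separate, named dataset.
    await ctx.pushData([item], 'products-archive');
}
```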