Version: 3.8

RequestHandlerResult

experimental

A partial implementation of RestrictedCrawlingContext that stores parameters of calls to context methods for later inspection.
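As a rough illustration of what "stores parameters of calls to context methods for later inspection" can look like, the sketch below records the arguments of every `pushData` call and exposes them through a `calls` getter. `RecordingContext`, `SketchItem` and `PushDataArgs` are hypothetical names made up for this example; the real class covers more methods and its internals may differ.

```typescript
// Hypothetical stand-in for illustration only; not the Crawlee implementation.
type SketchItem = Record<string, unknown>;
type PushDataArgs = [data: SketchItem | SketchItem[], datasetIdOrName?: string];

class RecordingContext {
    private readonly pushDataCalls: PushDataArgs[] = [];

    // Defined as an arrow-function property so it keeps working when a
    // request handler destructures it from the context object.
    pushData = async (data: SketchItem | SketchItem[], datasetIdOrName?: string): Promise<void> => {
        // Nothing is written to a dataset here; the arguments are only stored.
        this.pushDataCalls.push([data, datasetIdOrName]);
    };

    // Read-only view of everything recorded so far.
    get calls(): { pushData: readonly PushDataArgs[] } {
        return { pushData: this.pushDataCalls };
    }
}

// A handler written against the context shape; it cannot tell that its
// calls are only being recorded.
async function sketchHandler({ pushData }: Pick<RecordingContext, 'pushData'>): Promise<void> {
    await pushData({ title: 'Example' });
    await pushData([{ n: 1 }, { n: 2 }], 'numbers');
}
```

After `await sketchHandler(context)` has run against a `RecordingContext` instance, `context.calls.pushData` holds one argument tuple per call, which a caller could then inspect or replay.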

Index

Constructors

constructor

Properties

addRequests

addRequests: (requestsLike: readonly (string | ReadonlyObjectDeep<Partial<RequestOptions<Dictionary>> & { regex?: RegExp; requestsFromUrl?: string }> | ReadonlyObjectDeep<Request<Dictionary>>)[], options?: ReadonlyObjectDeep<RequestQueueOperationOptions>) => Promise<void> = ...

Type declaration

    • (requestsLike: readonly (string | ReadonlyObjectDeep<Partial<RequestOptions<Dictionary>> & { regex?: RegExp; requestsFromUrl?: string }> | ReadonlyObjectDeep<Request<Dictionary>>)[], options?: ReadonlyObjectDeep<RequestQueueOperationOptions>): Promise<void>
    • Add requests directly to the request queue.


      Parameters

      • requestsLike: readonly (string | ReadonlyObjectDeep<Partial<RequestOptions<Dictionary>> & { regex?: RegExp; requestsFromUrl?: string }> | ReadonlyObjectDeep<Request<Dictionary>>)[]
      • optional options: ReadonlyObjectDeep<RequestQueueOperationOptions>

        Options for the request queue

      Returns Promise<void>

enqueueLinks

enqueueLinks: (options?: ReadonlyObjectDeep<Omit<EnqueueLinksOptions, 'requestQueue'>>) => Promise<unknown> = ...

Type declaration

    • This function automatically finds and enqueues links from the current page, adding them to the RequestQueue currently used by the crawler.

      Optionally, the function allows you to filter the target links' URLs using an array of globs or regular expressions and override settings of the enqueued Request objects.

      Check out the Crawl a website with relative links example for more details regarding its usage.

      Example usage

      async requestHandler({ enqueueLinks }) {
          await enqueueLinks({
              globs: [
                  'https://www.example.com/handbags/*',
              ],
          });
      },

      Parameters

      • optional options: ReadonlyObjectDeep<Omit<EnqueueLinksOptions, 'requestQueue'>>

        All enqueueLinks() parameters are passed via an options object.

      Returns Promise<unknown>

getKeyValueStore

getKeyValueStore: (idOrName?: string) => Promise<Pick<KeyValueStore, 'id' | 'name' | 'getValue' | 'getAutoSavedValue' | 'setValue'>> = ...

Type declaration

    • (idOrName?: string): Promise<Pick<KeyValueStore, 'id' | 'name' | 'getValue' | 'getAutoSavedValue' | 'setValue'>>
    • Get a key-value store with the given name or ID, or the crawler's default store if none is given.


      Parameters

      • optional idOrName: string

      Returns Promise<Pick<KeyValueStore, 'id' | 'name' | 'getValue' | 'getAutoSavedValue' | 'setValue'>>

pushData

pushData: (data?: ReadonlyDeep<Dictionary | Dictionary[]>, datasetIdOrName?: string) => Promise<void> = ...

Type declaration

    • (data?: ReadonlyDeep<Dictionary | Dictionary[]>, datasetIdOrName?: string): Promise<void>
    • This function allows you to push data to a Dataset specified by name, or the one currently used by the crawler.

      Shortcut for crawler.pushData().


      Parameters

      • optional data: ReadonlyDeep<Dictionary | Dictionary[]>

        Data to be pushed to the dataset given by datasetIdOrName, or to the dataset currently used by the crawler if it is omitted.

      • optional datasetIdOrName: string

      Returns Promise<void>

useState

useState: <State>(defaultValue?: State) => Promise<State> = ...

Type declaration

    • <State>(defaultValue?: State): Promise<State>
    • Returns the state - a piece of mutable persistent data shared across all the request handler runs.


      Type parameters

      • State: Dictionary = Dictionary

      Parameters

      • optional defaultValue: State

      Returns Promise<State>
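The idea of a single mutable object shared across handler runs can be sketched in memory as follows. `InMemoryState` and `getOrInit` are hypothetical names for this example, and the sketch does not persist anything, so it only models the "shared and mutable" part of the description above.

```typescript
// Hypothetical in-memory sketch; it shares one object between calls but
// does not persist it anywhere.
type StateDict = Record<string, unknown>;

class InMemoryState {
    private state: StateDict | undefined;

    // Synchronous core: the default value is used only by the first call,
    // every later call gets the same object back.
    getOrInit<State extends StateDict>(defaultValue: State): State {
        if (this.state === undefined) this.state = defaultValue;
        return this.state as State;
    }

    // Promise-returning wrapper shaped like the documented signature.
    useState = async <State extends StateDict = StateDict>(defaultValue = {} as State): Promise<State> =>
        this.getOrInit(defaultValue);
}

// Two "handler runs" mutating the same state object.
async function countPages(shared: InMemoryState): Promise<number> {
    const first = await shared.useState({ pages: 0 });
    first.pages += 1;
    const second = await shared.useState({ pages: 0 }); // default ignored here
    second.pages += 1;
    return second.pages;
}
```

Because both calls return the same object, a mutation made in one handler run is visible in the next without any explicit save step in this sketch.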

Accessors

calls

  • get calls(): ReadonlyObjectDeep<{ addRequests: [requestsLike: readonly (string | ReadonlyObjectDeep<Partial<RequestOptions<Dictionary>> & { regex?: RegExp; requestsFromUrl?: string }> | ReadonlyObjectDeep<Request<Dictionary>>)[], options?: ReadonlyObjectDeep<RequestQueueOperationOptions>][]; enqueueLinks: [options?: ReadonlyObjectDeep<Omit<EnqueueLinksOptions, 'requestQueue'>>][]; pushData: [data: ReadonlyDeep<Dictionary | Dictionary[]>, datasetIdOrName?: string][] }>

datasetItems

  • get datasetItems(): readonly ReadonlyObjectDeep<{ datasetIdOrName?: string; item: Dictionary }>[]
  • Items added to datasets by a request handler.


    Returns readonly ReadonlyObjectDeep<{ datasetIdOrName?: string; item: Dictionary }>[]
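One way to read the return type is as a flattened view of the recorded `pushData` calls: a call may carry a single item or an array, while `datasetItems` lists one entry per item. The helper below is a hypothetical sketch of that reading, using simplified local types, and is not taken from the library.

```typescript
// Simplified local types; `toDatasetItems` is a hypothetical helper.
type DataItem = Record<string, unknown>;
type PushDataCall = readonly [data: DataItem | DataItem[], datasetIdOrName?: string];

interface DatasetItemEntry {
    item: DataItem;
    datasetIdOrName?: string;
}

function toDatasetItems(calls: readonly PushDataCall[]): DatasetItemEntry[] {
    return calls.flatMap(([data, datasetIdOrName]) => {
        // Normalize "one item" and "array of items" to an array.
        const items: DataItem[] = Array.isArray(data) ? data : [data];
        // Every item keeps the dataset it was pushed to (undefined = default).
        return items.map((item) => ({ item, datasetIdOrName }));
    });
}

// Usage: two recorded calls, three resulting entries.
const flattened = toDatasetItems([
    [{ title: 'Home' }],
    [[{ sku: 'A1' }, { sku: 'B2' }], 'products'],
]);
console.log(flattened.length); // 3
```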

enqueuedUrlLists

  • get enqueuedUrlLists(): readonly ReadonlyObjectDeep<{ label?: string; listUrl: string }>[]
  • URL lists enqueued to the request queue by a request handler via RestrictedCrawlingContext.addRequests using the requestsFromUrl option.


    Returns readonly ReadonlyObjectDeep<{ label?: string; listUrl: string }>[]
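The distinction between directly enqueued URLs and URL lists can be sketched as a split over the entries passed to `addRequests`. The `SketchRequestLike` type and `splitRequests` helper are hypothetical simplifications, and giving `url` priority over `requestsFromUrl` when both are present is an assumption of this sketch.

```typescript
// Simplified, hypothetical stand-in for one `requestsLike` element.
type SketchRequestLike = string | { url?: string; label?: string; requestsFromUrl?: string };

interface SplitResult {
    urls: { url: string; label?: string }[];
    urlLists: { listUrl: string; label?: string }[];
}

function splitRequests(requestsLike: readonly SketchRequestLike[]): SplitResult {
    const result: SplitResult = { urls: [], urlLists: [] };
    for (const request of requestsLike) {
        if (typeof request === 'string') {
            // A bare string is treated as a plain URL without a label.
            result.urls.push({ url: request });
        } else if (request.url !== undefined) {
            result.urls.push({ url: request.url, label: request.label });
        } else if (request.requestsFromUrl !== undefined) {
            // No direct URL: the entry points at a remote list of URLs.
            result.urlLists.push({ listUrl: request.requestsFromUrl, label: request.label });
        }
    }
    return result;
}

// Usage: two direct URLs and one URL list.
const split = splitRequests([
    'https://www.example.com/',
    { url: 'https://www.example.com/handbags/1', label: 'DETAIL' },
    { requestsFromUrl: 'https://www.example.com/urls.txt', label: 'LIST' },
]);
console.log(split.urls.length, split.urlLists.length); // 2 1
```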

enqueuedUrls

  • get enqueuedUrls(): readonly ReadonlyObjectDeep<{ label?: string; url: string }>[]

keyValueStoreChanges

  • get keyValueStoreChanges(): ReadonlyObjectDeep<Record<string, Record<string, { changedValue: unknown; options?: RecordOptions }>>>
  • A record of changes made to key-value stores by a request handler.


    Returns ReadonlyObjectDeep<Record<string, Record<string, { changedValue: unknown; options?: RecordOptions }>>>
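The nested return type reads as: outer key identifies the store, inner key is the record key, and the value holds what was written. The sketch below builds such a structure by hand. `recordChange` is a hypothetical helper, `SketchRecordOptions` is a simplified placeholder for `RecordOptions`, and the store name `my-store` is made up for the example.

```typescript
// Simplified placeholder for RecordOptions; only one field is modeled here.
interface SketchRecordOptions {
    contentType?: string;
}

interface KeyValueChange {
    changedValue: unknown;
    options?: SketchRecordOptions;
}

// Store id or name -> record key -> latest change.
type KeyValueChanges = Record<string, Record<string, KeyValueChange>>;

function recordChange(
    changes: KeyValueChanges,
    store: string,
    key: string,
    value: unknown,
    options?: SketchRecordOptions,
): void {
    // Create the per-store record on first use.
    const storeChanges = (changes[store] ??= {});
    // A later write to the same key replaces the earlier one.
    storeChanges[key] = { changedValue: value, options };
}

// Usage: two writes to the same key, one write with options.
const kvChanges: KeyValueChanges = {};
recordChange(kvChanges, 'my-store', 'page-count', 1);
recordChange(kvChanges, 'my-store', 'page-count', 2);
recordChange(kvChanges, 'my-store', 'snapshot', '<p>hi</p>', { contentType: 'text/html' });
```

In this sketch only the last value per key is kept, which matches a "record of changes" keyed by store and key rather than a full write history.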