Version: Next

discoverValidSitemaps

Callable

  • discoverValidSitemaps(urls, options): AsyncIterable<string>

  • Given a list of URLs, discovers sitemap files for their domains by checking each domain's robots.txt file, the default sitemap.xml and sitemap.txt locations, and the given URLs themselves.


    Parameters

    • urls: string[]
    • options: { proxyUrl?: string; requestTimeoutMillis?: number; signal?: AbortSignal; timeoutMillis?: number } = {}
      • optional proxyUrl: string

        Proxy URL to be used for network requests.

      • optional requestTimeoutMillis: number

        Timeout in milliseconds for each individual HTTP request during discovery. Defaults to 20000 ms (20 seconds).

      • optional signal: AbortSignal

        An external AbortSignal to cancel the entire discovery operation. If both signal and timeoutMillis are provided, the operation is cancelled as soon as either the signal is aborted or the timeout elapses, whichever comes first.

      • optional timeoutMillis: number

        Timeout in milliseconds for the entire discoverValidSitemaps call. An AbortController is created internally and its signal is passed to every HTTP request, so the whole discovery operation is cancelled once the timeout elapses. Defaults to 60000 ms (60 seconds) to prevent indefinite hangs.
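
The interaction between signal and timeoutMillis can be pictured with a small sketch. This is not the library's internal code: linkSignalWithTimeout and LinkedSignal are hypothetical names, and the sketch only assumes the behaviour described above, namely one internal AbortController that aborts when the timeout elapses or when the external signal aborts, whichever happens first.

```typescript
// Hypothetical helper, NOT part of the documented API. It illustrates the
// "whichever comes first" semantics of `signal` and `timeoutMillis`.
interface LinkedSignal {
    signal: AbortSignal;
    dispose: () => void;
}

function linkSignalWithTimeout(timeoutMillis: number, external?: AbortSignal): LinkedSignal {
    // One internal controller whose signal would be handed to every request.
    const controller = new AbortController();

    // Abort when the overall timeout elapses.
    const timer = setTimeout(() => controller.abort(), timeoutMillis);

    // Abort when the caller's own signal aborts.
    const onExternalAbort = () => controller.abort();

    // Release the timer and the listener so nothing keeps the process alive.
    const dispose = () => {
        clearTimeout(timer);
        external?.removeEventListener('abort', onExternalAbort);
    };

    // Once the internal signal fires (for either reason), clean up.
    controller.signal.addEventListener('abort', dispose, { once: true });

    if (external?.aborted) {
        // The caller cancelled before we even started.
        controller.abort();
    } else {
        external?.addEventListener('abort', onExternalAbort, { once: true });
    }

    return { signal: controller.signal, dispose };
}
```

A caller would pass the returned signal to each request and call dispose() once discovery finishes normally, so the pending timer does not outlive the operation.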

    Returns AsyncIterable<string>

    An async iterable with the discovered sitemap URLs.
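
Because the return value is an AsyncIterable<string>, results are consumed with for await...of. The following is a minimal sketch under stated assumptions: DiscoverOptions and DiscoverFn simply restate the signature documented above, collectSitemaps is a hypothetical helper, and fakeDiscover is a stand-in that makes the example runnable without network access. None of these three names are part of the library.

```typescript
// Restates the documented option shape; the type names are illustrative.
type DiscoverOptions = {
    proxyUrl?: string;
    requestTimeoutMillis?: number;
    signal?: AbortSignal;
    timeoutMillis?: number;
};

// Matches the documented call shape: (urls, options) => AsyncIterable<string>.
type DiscoverFn = (urls: string[], options?: DiscoverOptions) => AsyncIterable<string>;

// Hypothetical helper: drains the async iterable into a de-duplicated array.
// The de-duplication is done by this helper, not assumed of the library.
async function collectSitemaps(
    discover: DiscoverFn,
    urls: string[],
    options: DiscoverOptions = {},
): Promise<string[]> {
    const found = new Set<string>();
    for await (const sitemapUrl of discover(urls, options)) {
        found.add(sitemapUrl);
    }
    return Array.from(found);
}

// Stand-in for the real function (assumption: for demonstration only).
// It yields the default sitemap.xml location for each input URL's origin.
async function* fakeDiscover(urls: string[]): AsyncGenerator<string> {
    for (const url of urls) {
        yield new URL('/sitemap.xml', url).href;
    }
}
```

In real use, the imported discoverValidSitemaps would be passed in place of fakeDiscover, for example collectSitemaps(discoverValidSitemaps, ['https://example.com'], { timeoutMillis: 30000 }). Iterating directly with for await is preferable when sitemap URLs should be processed as soon as they are found rather than after discovery completes.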