RobotsFile
Methods
getSitemaps
Get URLs of sitemaps referenced in the robots file.
Returns string[]
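A minimal sketch of listing the referenced sitemaps, assuming RobotsFile is exported from the crawlee package and using crawlee.dev as a placeholder target:

```ts
import { RobotsFile } from 'crawlee';

// Fetch and parse robots.txt for the site, then list the sitemap URLs it references.
const robots = await RobotsFile.find('https://crawlee.dev');
const sitemapUrls: string[] = robots.getSitemaps();
console.log(sitemapUrls);
```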
isAllowed
Check whether a URL is allowed to be crawled according to the rules in robots.txt.
Parameters
url: string
the URL to check against the rules in robots.txt
optional userAgent: string = '*'
relevant user agent; defaults to '*'
Returns boolean
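A sketch of checking a single URL before enqueueing it (the URLs and user agent string are placeholders; the import path is assumed to be the crawlee package):

```ts
import { RobotsFile } from 'crawlee';

const robots = await RobotsFile.find('https://crawlee.dev');

// Ask whether the given user agent may crawl this URL according to robots.txt.
if (robots.isAllowed('https://crawlee.dev/docs/introduction', 'MyCrawler')) {
    // safe to enqueue the URL
}
```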
parseSitemaps
Parse all the sitemaps referenced in the robots file.
Returns Promise<Sitemap>
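A sketch of downloading and parsing the referenced sitemaps (placeholder URL, assumed crawlee import):

```ts
import { RobotsFile } from 'crawlee';

const robots = await RobotsFile.find('https://crawlee.dev');

// Download and parse every sitemap listed in robots.txt into a single Sitemap object.
const sitemap = await robots.parseSitemaps();
console.log(sitemap.urls.length);
```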
parseUrlsFromSitemaps
Get all URLs from all the sitemaps referenced in the robots file. A shorthand for (await robots.parseSitemaps()).urls.
Returns Promise<string[]>
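A sketch of collecting the sitemap URLs directly (placeholder URL, assumed crawlee import):

```ts
import { RobotsFile } from 'crawlee';

const robots = await RobotsFile.find('https://crawlee.dev');

// Equivalent to (await robots.parseSitemaps()).urls.
const urls: string[] = await robots.parseUrlsFromSitemaps();
console.log(urls.length);
```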
static find
Determine the location of a robots.txt file for a URL and fetch it.
Parameters
url: string
the URL to fetch robots.txt for
optional proxyUrl: string
a proxy to be used for fetching the robots.txt file
Returns Promise<RobotsFile>
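A sketch of fetching robots.txt for a page, optionally through a proxy (both URLs are placeholders; the import path is assumed):

```ts
import { RobotsFile } from 'crawlee';

// Resolves the robots.txt location for the given page (here https://crawlee.dev/robots.txt)
// and downloads it, optionally through the provided proxy.
const robots = await RobotsFile.find(
    'https://crawlee.dev/docs/introduction',
    'http://user:pass@proxy.example.com:8000', // hypothetical proxy URL
);
```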
static from
Allows providing the URL and robots.txt content explicitly instead of loading it from the target site.
Parameters
url: string
the URL of the robots.txt file
content: string
contents of robots.txt
optional proxyUrl: string
a proxy to be used for fetching the robots.txt file
Returns RobotsFile
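A sketch of constructing the object from content fetched elsewhere, so no request is made to the target site (the URL and rules are placeholders; the import path is assumed):

```ts
import { RobotsFile } from 'crawlee';

const content = [
    'User-agent: *',
    'Disallow: /admin/',
    'Sitemap: https://example.com/sitemap.xml',
].join('\n');

// Build a RobotsFile from already-downloaded robots.txt content.
const robots = RobotsFile.from('https://example.com/robots.txt', content);
console.log(robots.isAllowed('https://example.com/admin/login')); // expected: false
```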
Loads and queries information from a robots.txt file.
Example usage:
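(A sketch, assuming RobotsFile is imported from the crawlee package; the URLs are placeholders.)

```ts
import { RobotsFile } from 'crawlee';

const robots = await RobotsFile.find('https://crawlee.dev/docs/introduction');

// Keep only the URLs that robots.txt allows to be crawled.
const candidates = [
    'https://crawlee.dev/docs/introduction',
    'https://crawlee.dev/api/core',
];
const allowed = candidates.filter((url) => robots.isAllowed(url));
console.log(allowed);
```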