Version: 3.1

@crawlee/utils

Index

Type Aliases

CheerioRoot

CheerioRoot: ReturnType<typeof load>

Variables

const URL_NO_COMMAS_REGEX

URL_NO_COMMAS_REGEX: RegExp = ...

Default regular expression for matching URLs in a string, which may be plain text, JSON, CSV, or another format. It supports common URL characters but does not match URLs that contain commas or spaces. URLs may also contain Unicode letters, but not Unicode symbols.
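The following is a minimal sketch of how a no-commas pattern of this kind can be used to pull URLs out of delimited text. The pattern `SKETCH_URL_NO_COMMAS` and the helper `extractUrlsSketch` are hypothetical stand-ins written for illustration only; they are much simpler than the actual `URL_NO_COMMAS_REGEX` and are not part of the library.

```typescript
// Hypothetical stand-in pattern, NOT the actual URL_NO_COMMAS_REGEX from @crawlee/utils.
// It matches http(s) URLs and stops at whitespace, commas, quotes, and angle brackets.
const SKETCH_URL_NO_COMMAS = /https?:\/\/[^\s,"'<>]+/giu;

// Hypothetical helper: returns every match of the stand-in pattern in the given text.
function extractUrlsSketch(text: string): string[] {
    // String.prototype.match with a global regex returns all matches, or null if there are none.
    return text.match(SKETCH_URL_NO_COMMAS) ?? [];
}

// A comma-delimited line; the last URL contains a Unicode letter in its path.
const csvLine = "https://example.com/a,https://example.com/b?x=1,https://example.com/ü";
const foundUrls = extractUrlsSketch(csvLine);

console.log(foundUrls);
```

Because the stand-in pattern treats a comma as a URL boundary, each entry of the comma-delimited line is returned as a separate match. In a project that depends on `@crawlee/utils`, the exported `URL_NO_COMMAS_REGEX` could presumably be passed to `match` in the same way.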

const URL_WITH_COMMAS_REGEX

URL_WITH_COMMAS_REGEX: RegExp = ...

Regular expression that extends the default URL_NO_COMMAS_REGEX to also match commas in the URL path and query. Note, however, that it may fail to separate URLs in comma-delimited lists, or may return malformed URLs that include trailing list content.
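The trade-off described above can be illustrated with a small sketch. Both patterns below are hypothetical, simplified stand-ins chosen for illustration; they are not the patterns exported by `@crawlee/utils`, and they differ only in whether a comma is allowed inside a match.

```typescript
// Hypothetical stand-ins, NOT the actual patterns exported by @crawlee/utils.
// The only difference between them: the second one allows commas inside a URL.
const SKETCH_NO_COMMAS = /https?:\/\/[^\s,"'<>]+/giu;
const SKETCH_WITH_COMMAS = /https?:\/\/[^\s"'<>]+/giu;

// Case 1: a comma-delimited list of two URLs.
const commaList = "https://example.com/a,https://example.com/b";
// Case 2: a single URL with a comma in its path.
const commaInPath = "See https://example.com/maps/48.85,2.35 for details";

const listStrict = commaList.match(SKETCH_NO_COMMAS) ?? [];
const listLoose = commaList.match(SKETCH_WITH_COMMAS) ?? [];
const pathStrict = commaInPath.match(SKETCH_NO_COMMAS) ?? [];
const pathLoose = commaInPath.match(SKETCH_WITH_COMMAS) ?? [];

console.log({ listStrict, listLoose, pathStrict, pathLoose });
```

In this sketch the comma-tolerant pattern keeps the URL with a comma in its path intact, but merges the two list entries into one malformed match. The comma-free pattern does the opposite: it splits the list correctly but truncates the path at the comma. Which behavior is preferable depends on the shape of the input.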