Quick Start
With this short tutorial you can start scraping with Crawlee in a minute or two. For an in-depth look at how Crawlee works, read the Introduction, a comprehensive step-by-step guide to creating your first scraper.
Choose your crawler
Crawlee comes with three main crawler classes: CheerioCrawler, PuppeteerCrawler and PlaywrightCrawler. All classes share the same interface for maximum flexibility when switching between them.
CheerioCrawler
This is a plain HTTP crawler. It parses HTML using the Cheerio library and crawls the web using the specialized got-scraping HTTP client, which masquerades as a browser. It's very fast and efficient, but it does not execute client-side JavaScript, so it can't handle pages that rely on JavaScript rendering.
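As a rough illustration of the plain HTTP approach, the sketch below logs the title of each page it visits, reading it through the Cheerio `$` handle. It assumes the `crawlee` package is installed. The `formatTitle` and `runCheerioCrawl` helpers, the request cap of 20, and the start URL in the usage note are placeholders chosen for this example, not part of Crawlee.

```javascript
// Hypothetical helper (not part of Crawlee): collapse whitespace in a page title.
function formatTitle(rawTitle) {
    return rawTitle.replace(/\s+/g, ' ').trim();
}

// Hypothetical wrapper: crawl from the given start URLs and log each page title.
async function runCheerioCrawl(startUrls) {
    // Loaded lazily so that defining this function does not require the package up front.
    const { CheerioCrawler } = await import('crawlee');

    const crawler = new CheerioCrawler({
        // Assumed safety cap for the example; adjust or remove as needed.
        maxRequestsPerCrawl: 20,
        // Called once per downloaded page; `$` is the Cheerio handle for its HTML.
        async requestHandler({ request, $, enqueueLinks, log }) {
            const title = formatTitle($('title').text());
            log.info(`${request.url}: ${title}`);
            // Queue the links found on this page for crawling.
            await enqueueLinks();
        },
    });

    await crawler.run(startUrls);
}
```

Calling `runCheerioCrawl(['https://crawlee.dev'])` would start the crawl. Because no browser is launched, only content present in the raw HTML response is visible to `$`.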
PuppeteerCrawler
This crawler uses a headless browser to crawl, controlled by the Puppeteer library. It can control Chromium or Chrome. Puppeteer is the de facto standard in headless browser automation.
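A browser-based crawl might look like the following sketch, in which the request handler receives a Puppeteer `page` object rather than a parsed HTML document. It assumes both the `crawlee` and `puppeteer` packages are installed. The `buildPuppeteerOptions` and `runPuppeteerCrawl` helpers and the request cap of 10 are hypothetical names and values used only for illustration.

```javascript
// Hypothetical helper (not part of Crawlee): build the options passed to the crawler.
function buildPuppeteerOptions(maxRequests) {
    return {
        // Stop after this many requests; the value is chosen by the caller.
        maxRequestsPerCrawl: maxRequests,
        // Called once per page after the browser has loaded it.
        async requestHandler({ request, page, enqueueLinks, log }) {
            // `page` is a Puppeteer Page, so the title reflects the rendered document.
            const title = await page.title();
            log.info(`${request.url}: ${title}`);
            // Queue the links found on this page for crawling.
            await enqueueLinks();
        },
    };
}

// Hypothetical wrapper: crawl from the given start URLs with a headless browser.
async function runPuppeteerCrawl(startUrls) {
    // Loaded lazily so that defining this function does not require the package up front.
    const { PuppeteerCrawler } = await import('crawlee');

    const crawler = new PuppeteerCrawler(buildPuppeteerOptions(10));
    await crawler.run(startUrls);
}
```

Calling `runPuppeteerCrawl(['https://crawlee.dev'])` would start the crawl. Launching a browser for every page costs more time and memory than plain HTTP requests, which is the trade-off for being able to read JavaScript-rendered content.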