Build reliable web scrapers. Fast.

Crawlee is a web scraping library for JavaScript and Python. It handles blocking, crawling, proxies, and browsers for you.

import { PlaywrightCrawler } from 'crawlee';

const crawler = new PlaywrightCrawler({
    async requestHandler({ request, page, enqueueLinks, pushData, log }) {
        const title = await page.title();
        log.info(`Title of ${request.loadedUrl} is '${title}'`);

        await pushData({ title, url: request.loadedUrl });
        await enqueueLinks();
    },

    // Uncomment this option to see the browser window.
    // headless: false,
});

await crawler.run(['https://crawlee.dev']);

Or start with a template from our CLI

$ npx crawlee create my-crawler

Built with 🤍 by Apify. Forever free and open-source.

What are the benefits?

Unblock websites by default

Crawlee crawls stealthily with zero configuration, but you can customize its behavior to overcome any protection. Real-world fingerprints included.

Learn more

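const crawler = new PlaywrightCrawler({
    browserPoolOptions: {
        // Limit generated fingerprints to these browsers, devices, and locales.
        fingerprintOptions: {
            fingerprintGeneratorOptions: {
                browsers: ['chrome', 'firefox'],
                devices: ['mobile'],
                locales: ['en-US'],
            },
        },
    },
});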

Work with your favorite tools

Crawlee integrates BeautifulSoup, Cheerio, Puppeteer, Playwright, and other popular open-source tools. No need to learn new syntax.

Learn more

One API for headless and HTTP

Switch between HTTP and headless crawling without big rewrites thanks to a shared API. Or let the adaptive crawler decide whether JS rendering is needed.

Learn more

import { AdaptivePlaywrightCrawler } from 'crawlee';

const crawler = new AdaptivePlaywrightCrawler({
    renderingTypeDetectionRatio: 0.1,
    async requestHandler({ querySelector, enqueueLinks }) {
        // The crawler detects if JS rendering is needed
        // to extract this data. If not, it will use HTTP
        // for follow-up requests to save time and costs.
        const $prices = await querySelector('span.price');
        await enqueueLinks();
    },
});

What else is in Crawlee?

Auto scaling

Crawlers automatically adjust concurrency based on available system resources. Avoid memory errors in small containers and run faster in large ones.
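
The idea behind resource-based scaling can be sketched in a few lines. This is not Crawlee's implementation: the function name `nextConcurrency`, the 5% step, and the 0.7 memory threshold are assumptions picked for illustration.

```javascript
// Illustrative only: a hypothetical scaling rule, not Crawlee's internal logic.
// `memoryUsedRatio` is the fraction of available memory in use (0 to 1).
function nextConcurrency(current, memoryUsedRatio, options = {}) {
    const { min = 1, max = 200, step = 0.05, maxMemoryRatio = 0.7 } = options;
    // Move by at least one task so that small pools can still change size.
    const delta = Math.max(1, Math.ceil(current * step));
    const target = memoryUsedRatio > maxMemoryRatio
        ? current - delta // overloaded: back off
        : current + delta; // headroom left: speed up
    // Never leave the configured bounds.
    return Math.min(max, Math.max(min, target));
}
```

Calling such a function on a timer with a fresh memory reading would shrink the pool in a small container and grow it on a larger machine.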

Smart proxy rotation

Crawlee maintains a pool of sessions, each tied to a different proxy, to keep proxy performance up and IPs healthy. Blocked proxies are removed from the pool automatically.
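
As a rough mental model, a session pool hands out proxies in turn and drops the ones that keep getting blocked. The sketch below is a simplified stand-in, not Crawlee's own session pool: the class name `SimpleSessionPool`, the `maxErrors` threshold, and the example proxy URLs are all hypothetical.

```javascript
// Illustrative only: a hypothetical pool, not Crawlee's session pool.
class SimpleSessionPool {
    constructor(proxyUrls, maxErrors = 3) {
        // One session per proxy, each with its own error counter.
        this.sessions = proxyUrls.map((proxyUrl) => ({ proxyUrl, errors: 0 }));
        this.maxErrors = maxErrors;
        this.cursor = 0;
    }

    // Hand out sessions round-robin so the load spreads across proxies.
    getSession() {
        if (this.sessions.length === 0) throw new Error('No healthy sessions left');
        const session = this.sessions[this.cursor % this.sessions.length];
        this.cursor += 1;
        return session;
    }

    // Record a blocked response; drop the session once it has failed too often.
    markBad(session) {
        session.errors += 1;
        if (session.errors >= this.maxErrors) {
            this.sessions = this.sessions.filter((s) => s !== session);
        }
    }
}
```

A crawler built on this would call `getSession()` before each request and `markBad()` whenever the response looks like a block page.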

Queue and storage

Pause and resume crawlers thanks to a persistent queue of URLs and storage for structured data.
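
Pausing and resuming comes down to two things: remembering which URLs were already seen and being able to save that state. Here is a minimal sketch of the concept, assuming an in-memory queue that serializes to JSON. `ResumableQueue` and its methods are hypothetical names, not Crawlee's request queue API.

```javascript
// Illustrative only: a hypothetical in-memory queue, not Crawlee's request queue.
class ResumableQueue {
    constructor(state = { pending: [], seen: [] }) {
        this.pending = [...state.pending];
        this.seen = new Set(state.seen);
    }

    // Enqueue a URL once; duplicates are ignored.
    add(url) {
        if (this.seen.has(url)) return false;
        this.seen.add(url);
        this.pending.push(url);
        return true;
    }

    // Take the next URL to crawl, or undefined when the queue is empty.
    next() {
        return this.pending.shift();
    }

    // Snapshot the state so it can be written to disk and restored later.
    serialize() {
        return JSON.stringify({ pending: this.pending, seen: [...this.seen] });
    }

    // Rebuild a queue from a snapshot produced by serialize().
    static restore(json) {
        return new ResumableQueue(JSON.parse(json));
    }
}
```

Saving the output of `serialize()` before shutdown and passing it to `restore()` on the next start lets a crawl continue where it stopped without revisiting URLs.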

Handy scraping utils

Sitemaps, infinite scroll, contact extraction, large-asset blocking, and many more utilities are included.

Routing & middleware

Keep your code clean and organized while managing complex crawls with a built-in router that streamlines the process.
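
Routing by label is easy to picture with a small stand-in. The sketch below dispatches each request to a handler based on `request.label` and falls back to a default handler. `LabelRouter` is a hypothetical class written for this example, not Crawlee's router, and the `DETAIL` label and URLs are made up.

```javascript
// Illustrative only: a hypothetical stand-in, not Crawlee's router.
class LabelRouter {
    constructor() {
        this.handlers = new Map();
        this.defaultHandler = null;
    }

    // Register a handler for requests carrying the given label.
    addHandler(label, handler) {
        this.handlers.set(label, handler);
    }

    // Register the handler used when no label matches.
    addDefaultHandler(handler) {
        this.defaultHandler = handler;
    }

    // Dispatch on request.label, falling back to the default handler.
    handle(context) {
        const handler = this.handlers.get(context.request.label) ?? this.defaultHandler;
        if (!handler) throw new Error(`No handler for label '${context.request.label}'`);
        return handler(context);
    }
}

const router = new LabelRouter();
router.addDefaultHandler(({ request }) => `listing:${request.url}`);
router.addHandler('DETAIL', ({ request }) => `detail:${request.url}`);
```

Each page type gets its own small handler instead of one large function full of `if` branches. Real handlers would usually be async and do the scraping work where this demo returns a string.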

Deploy to cloud

Crawlee, by Apify, works anywhere, but Apify offers the best experience. Easily turn your project into an Actor—a serverless micro-app with built-in infra, proxies, and storage.

Deploy to Apify

Install Apify SDK and Apify CLI.

Add Actor.init() to the beginning and Actor.exit() to the end of your code.

Use the Apify CLI to push the code to the Apify platform.
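
The second step above can be sketched as a small wrapper. `runAsActor` is a hypothetical helper name, and `Actor` is passed in as a parameter so the snippet stays self-contained. In a real project it would be the `Actor` object imported from the Apify SDK, and `crawl` would be something like `() => crawler.run(['https://crawlee.dev'])`.

```javascript
// Illustrative only: `runAsActor` is a hypothetical helper, not part of any SDK.
// `Actor` is expected to expose async init() and exit() methods.
async function runAsActor(Actor, crawl) {
    await Actor.init(); // beginning of your code
    await crawl(); // your existing crawler logic, unchanged
    await Actor.exit(); // end of your code
}
```

The crawler code itself does not change; it only gets bracketed by the two calls.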

Crawlee helps you build scrapers faster

Zero setup required

Copy a code example, install Crawlee, and go. No CLI required, no complex file structure, no boilerplate.

Get started

Reasonable defaults

Unblocking, proxy rotation, and other core features are turned on by default, and all of them remain configurable.

Learn more

Helpful community

Join our Discord community of over 10k developers and get fast answers to your web scraping questions.

Join Discord

Get started now!

Crawlee won’t fix broken selectors for you (yet), but it makes building and maintaining reliable crawlers faster and easier—so you can focus on what matters most.

Get started