Setting up

This guide will help you get started with Crawlee by setting it up on your computer. Follow the steps below to ensure a smooth installation process.

Prerequisites

Before installing Crawlee itself, make sure that your system meets the following requirements:

Python 3.10 or higher: Crawlee requires Python 3.10 or a newer version. You can download Python from the official website.
Python package manager: This guide uses pip, the most common package manager, but any other Python package manager works as well. You can download pip from its official website.

Verifying prerequisites

To check if Python and pip are installed, run the following commands:

python --version

python -m pip --version

If these commands return the respective versions, you're ready to continue.
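The same check can be sketched in Python itself, which is handy when you want to verify the exact interpreter your project will run on. This is a minimal sketch, not part of Crawlee: the function name check_prerequisites and the MIN_PYTHON constant are hypothetical, and the minimum version is taken from the prerequisites above.

```python
import importlib.util
import sys

# Minimum version stated in the prerequisites above.
MIN_PYTHON = (3, 10)


def check_prerequisites(min_python=MIN_PYTHON):
    """Report whether this interpreter and its pip meet the requirements.

    Hypothetical helper for illustration; it is not part of Crawlee.
    """
    return {
        "python_version": ".".join(map(str, sys.version_info[:3])),
        # Tuples compare element by element, so (3, 12) >= (3, 10) is True.
        "python_ok": sys.version_info[:2] >= min_python,
        # find_spec returns None when the pip module cannot be imported.
        "pip_ok": importlib.util.find_spec("pip") is not None,
    }


if __name__ == "__main__":
    for key, value in check_prerequisites().items():
        print(f"{key}: {value}")
```

If python_ok or pip_ok is reported as False, fix the prerequisite before continuing with the installation.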

Installing Crawlee

Crawlee is available as the crawlee package on PyPI. This package includes the core functionality; additional features are provided as optional extras, which keeps dependencies and package size minimal.

Basic installation

To install the core package, run:

python -m pip install crawlee

After installation, verify that Crawlee is installed correctly by checking its version:

python -c 'import crawlee; print(crawlee.__version__)'
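If you prefer a check that does not import the package, the installed version can also be read from the package metadata using the standard library. This is a minimal sketch; the helper name installed_version is hypothetical and chosen only for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing.

    Hypothetical helper for illustration; it reads distribution metadata
    and does not import the package itself.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("crawlee")
    print(f"crawlee {found}" if found else "crawlee is not installed")
```

This variant reports a missing installation as a plain message instead of an ImportError traceback.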

Full installation

If you do not mind the package size, you can run the following command to install Crawlee with all optional features:

python -m pip install 'crawlee[all]'

Installing specific extras

Depending on your use case, you may want to install specific extras to enable additional functionality:

For using the BeautifulSoupCrawler, install the beautifulsoup extra:

python -m pip install 'crawlee[beautifulsoup]'

For using the ParselCrawler, install the parsel extra:

python -m pip install 'crawlee[parsel]'

For using the CurlImpersonateHttpClient, install the curl-impersonate extra:

python -m pip install 'crawlee[curl-impersonate]'

If you plan to use a (headless) browser with PlaywrightCrawler, install Crawlee with the playwright extra:

python -m pip install 'crawlee[playwright]'

After installing the playwright extra, install the browser binaries that Playwright needs:

playwright install

Installing multiple extras

You can install multiple extras at once by using a comma as a separator:

python -m pip install 'crawlee[beautifulsoup,curl-impersonate]'
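To see which extras are usable in the current environment, you can check whether their underlying modules are importable. The sketch below is an illustration only: the mapping from extra name to module name (bs4, parsel, curl_cffi, playwright) is an assumption about what each extra installs, and the names EXTRA_MODULES and available_extras are hypothetical.

```python
import importlib.util

# ASSUMPTION: top-level module that each extra is expected to provide.
# Adjust this mapping if your Crawlee version packages the extras differently.
EXTRA_MODULES = {
    "beautifulsoup": "bs4",
    "parsel": "parsel",
    "curl-impersonate": "curl_cffi",
    "playwright": "playwright",
}


def available_extras(extra_modules=EXTRA_MODULES):
    """Map each extra name to True if its module can be imported."""
    return {
        extra: importlib.util.find_spec(module) is not None
        for extra, module in extra_modules.items()
    }


if __name__ == "__main__":
    for extra, present in available_extras().items():
        status = "available" if present else "missing"
        print(f"{extra}: {status}")
```

An extra reported as missing can be added later by running the matching install command above.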

Start a new project

The quickest way to get started with Crawlee is by using the Crawlee CLI and selecting one of the prepared templates. The CLI helps you set up a new project in seconds.

Using Crawlee CLI with Pipx

First, ensure you have Pipx installed. You can check if Pipx is installed by running:

pipx --version

If Pipx is not installed, follow the official installation guide.

Then, run the Crawlee CLI using Pipx and choose from the available templates:

pipx run 'crawlee[cli]' create my_crawler

Using Crawlee CLI directly

If you already have crawlee installed (with the cli extra, as in the Pipx command above), you can create a new project by running:

crawlee create my_crawler

Follow the interactive prompts in the CLI to choose a crawler type and set up your new project.

Running your project

To run your newly created project, navigate to the project directory, activate the virtual environment, and execute the Python interpreter with the project module:

On Linux:

cd my_crawler/

source .venv/bin/activate

python -m my_crawler

On Windows:

cd my_crawler/

.venv\Scripts\activate

python -m my_crawler
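Activation mainly puts the virtual environment's interpreter first on your PATH, so the project can also be started by calling that interpreter directly. The sketch below builds the equivalent command for either platform. It assumes the virtual environment lives in a .venv directory inside the project, as in the Linux commands above; the helper names venv_python and run_command are hypothetical.

```python
import os
from pathlib import Path


def venv_python(project_dir, os_name=os.name):
    """Return the interpreter path inside the project's virtual environment.

    ASSUMPTION: the environment is located at <project_dir>/.venv.
    """
    venv = Path(project_dir).resolve() / ".venv"
    if os_name == "nt":
        # Windows virtual environments keep executables under Scripts.
        return venv / "Scripts" / "python.exe"
    # POSIX virtual environments keep executables under bin.
    return venv / "bin" / "python"


def run_command(project_dir, module, os_name=os.name):
    """Build the argument list equivalent to 'python -m <module>'."""
    return [str(venv_python(project_dir, os_name)), "-m", module]


if __name__ == "__main__":
    # Only prints the command; pass it to subprocess.run(..., cwd="my_crawler")
    # to actually start the project.
    print(run_command("my_crawler", "my_crawler"))
```

Calling the interpreter by path is useful in scripts and scheduled jobs, where sourcing an activation script is inconvenient.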

Congratulations! You have successfully set up and executed your first Crawlee project.

Next steps

Next, you will learn how to create a very simple crawler and get to know the Crawlee components while building it.
