
Setting up

To run Crawlee on your own computer, you need to meet the following prerequisites:

  1. Have Node.js version 16.0 or higher installed.
  2. Have NPM installed, or use another package manager of your choice.
    • NPM comes bundled with Node.js, so you should already have it. If not, reinstall Node.js.

If you're not sure, confirm the prerequisites by running:

node -v
npm -v
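If you'd rather have the version compared for you than read it off the terminal, the check against the Node.js 16.0 minimum can be sketched in a few lines of JavaScript. This is only an illustration: the names `MIN_NODE_MAJOR`, `parseMajor` and `meetsMinimum` are made up for this example and are not part of Crawlee or Node.js.

```javascript
// Minimum major version, taken from the prerequisites above.
const MIN_NODE_MAJOR = 16;

// Extract the major version from strings such as "v18.17.1" or "16.0.0".
function parseMajor(version) {
    const match = /^v?(\d+)\./.exec(version.trim());
    if (!match) {
        throw new Error(`Unrecognized Node.js version string: ${version}`);
    }
    return Number(match[1]);
}

// Return true when the given version satisfies the minimum major version.
function meetsMinimum(version, minMajor = MIN_NODE_MAJOR) {
    return parseMajor(version) >= minMajor;
}

// process.version holds the same string that `node -v` prints.
const current = process.version;
console.log(
    meetsMinimum(current)
        ? `Node.js ${current} is new enough for Crawlee.`
        : `Node.js ${current} is too old; version ${MIN_NODE_MAJOR}.0 or higher is required.`,
);
```

Save it under any file name (for example check-node.js, a hypothetical name) and run it with node.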

Creating a new project

The fastest and best way to create a new Crawlee project is to use the Crawlee CLI. You can download and run the CLI with the npx utility; it is also included in the crawlee package:

npx crawlee create my-new-project

A prompt will ask you to choose a template. Crawlee is written in TypeScript, so if you're familiar with it, choosing a TypeScript template will give you better code completion and static type checking, but feel free to use JavaScript as well. Functionally, the templates are identical.

Let's choose the first template called Crawlee playwright template. The command will create a new directory in your current working directory, called my-new-project, add a package.json to this folder and install all the necessary dependencies. It will also add example source code that you can immediately run.

Let's try that!

cd my-new-project
npm start

You will see log messages in the terminal as Crawlee boots up, and after a second a Chromium browser window will open. In the window, you'll see pages changing quickly, and back in the terminal, you will see the printed titles (the contents of the <title> HTML tags) of those pages.
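The title printing described above can be sketched as a request handler. This is a rough illustration, not the template's actual source: `logTitle` is a hypothetical name, and `fakeContext` is a stand-in object shaped like the small part of the Playwright crawling context the handler uses (`request.loadedUrl`, `page.title()` and `log.info()`), so the sketch runs without Crawlee or a browser installed.

```javascript
// Hypothetical handler: read the title of the opened page and log it.
async function logTitle({ request, page, log }) {
    const title = await page.title();
    log.info(`Title of ${request.loadedUrl}: ${title}`);
    return title;
}

// Stand-in context (an assumption made for this example only).
// It collects log messages in an array instead of printing them directly.
const messages = [];
const fakeContext = {
    request: { loadedUrl: 'https://example.com/' },
    page: { title: async () => 'Example Domain' },
    log: { info: (message) => messages.push(message) },
};

// Run the handler once against the stand-in context and print what it logged.
const done = logTitle(fakeContext).then(() => {
    console.log(messages.join('\n'));
});

// In a real project the handler would be given to the crawler instead,
// for example: new PlaywrightCrawler({ requestHandler: logTitle })
```

In the generated project, Crawlee supplies the real context and calls the handler once for every page it opens, which is why the titles keep appearing in the terminal as the browser moves from page to page.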


info

We picked the Playwright template, which uses Chromium to open pages. If you pick the Cheerio template instead, there won't be any browser window, because requests to the target site are made with got-scraping, a specialized HTTP client, rather than with a browser.

You can always terminate the crawl with a keypress in the terminal:

CTRL+C

Next lesson

The next lesson will teach you how to create a very simple crawler and explain Crawlee components while building it.