Impit HTTP Client

Introduction

The ImpitHttpClient is an HTTP client implementation based on the Impit library. It enables browser impersonation for HTTP requests, helping you bypass bot detection systems without running an actual browser.

Successor to got-scraping

Impit is the successor to got-scraping, which is no longer actively maintained. We recommend using ImpitHttpClient for all new projects. Impit provides better anti-bot evasion through TLS fingerprinting and HTTP/3 support, while maintaining a smaller package size.

Impit will become the default HTTP client in the next major version of Crawlee.
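
Migrating an existing crawler from the got-scraping based default is typically a one-line change: pass an ImpitHttpClient through the httpClient option and keep your request handler as it is. A minimal sketch (the handler body is just a placeholder):

import { CheerioCrawler } from 'crawlee';
import { ImpitHttpClient, Browser } from '@crawlee/impit-client';

const crawler = new CheerioCrawler({
    // Previously the crawler used the default, got-scraping based client;
    // adding this option is the only change needed to switch to Impit.
    httpClient: new ImpitHttpClient({ browser: Browser.Firefox }),
    async requestHandler({ $, log }) {
        // Existing scraping logic stays the same (placeholder handler body).
        log.info(`Title: ${$('title').text()}`);
    },
});

await crawler.run(['https://example.com']);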

Why use Impit?

Websites increasingly use sophisticated bot detection that analyzes:

  • HTTP fingerprints: User-Agent strings, header ordering, HTTP/2 pseudo-header sequences
  • TLS fingerprints: Cipher suites, TLS extensions, and cryptographic details in the ClientHello message

Standard HTTP clients such as fetch or axios are easily detected because their fingerprints don't match those of real browsers. Unlike got-scraping, which only handles HTTP-level fingerprinting, Impit also mimics TLS fingerprints, so requests appear to come from a real browser.

Installation

Install the @crawlee/impit-client package:

npm install @crawlee/impit-client

note

The impit package includes native binaries and supports Windows, macOS (including ARM), and Linux out of the box.

Basic usage

Pass an ImpitHttpClient instance to the httpClient option of any Crawlee crawler:

import { BasicCrawler } from 'crawlee';
import { ImpitHttpClient, Browser } from '@crawlee/impit-client';

const crawler = new BasicCrawler({
    httpClient: new ImpitHttpClient({
        browser: Browser.Firefox,
    }),
    async requestHandler({ sendRequest, log }) {
        const response = await sendRequest();
        log.info('Received response', { statusCode: response.statusCode });
    },
});

await crawler.run(['https://example.com']);

Usage with different crawlers

CheerioCrawler

import { CheerioCrawler } from 'crawlee';
import { ImpitHttpClient, Browser } from '@crawlee/impit-client';

const crawler = new CheerioCrawler({
    httpClient: new ImpitHttpClient({
        browser: Browser.Chrome,
    }),
    async requestHandler({ $, request, enqueueLinks, pushData }) {
        const title = $('title').text();
        const h1 = $('h1').first().text();

        await pushData({
            url: request.url,
            title,
            h1,
        });

        // Enqueue links found on the page
        await enqueueLinks();
    },
});

await crawler.run(['https://example.com']);

HttpCrawler

import { HttpCrawler } from 'crawlee';
import { ImpitHttpClient, Browser } from '@crawlee/impit-client';

const crawler = new HttpCrawler({
    httpClient: new ImpitHttpClient({
        browser: Browser.Firefox,
        http3: true,
    }),
    async requestHandler({ body, request, log, pushData }) {
        log.info(`Processing ${request.url}`);

        // body is the raw HTML string
        await pushData({
            url: request.url,
            bodyLength: body.length,
        });
    },
});

await crawler.run(['https://example.com']);

Configuration options

The ImpitHttpClient constructor accepts the following options:

  • browser ('chrome' | 'firefox', default: undefined): Browser to impersonate. Affects TLS fingerprint and default headers.
  • http3 (boolean, default: false): Enable HTTP/3 (QUIC) protocol support.
  • ignoreTlsErrors (boolean, default: false): Ignore TLS certificate errors. Useful for testing or self-signed certificates.
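
For illustration, here is a client configured with all three options together (ignoreTlsErrors is included only to show the syntax; leave it off unless you deliberately need to accept invalid certificates):

import { HttpCrawler } from 'crawlee';
import { ImpitHttpClient, Browser } from '@crawlee/impit-client';

const crawler = new HttpCrawler({
    httpClient: new ImpitHttpClient({
        browser: Browser.Chrome, // Chrome TLS fingerprint and default headers
        http3: true,             // negotiate HTTP/3 (QUIC) where available
        ignoreTlsErrors: true,   // accept invalid or self-signed certificates (testing only)
    }),
    async requestHandler({ request, log }) {
        log.info(`Fetched ${request.url}`);
    },
});

await crawler.run(['https://example.com']);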

Browser impersonation

Use the Browser enum to specify which browser to impersonate:

import { ImpitHttpClient, Browser } from '@crawlee/impit-client';

// Impersonate Firefox
const firefoxClient = new ImpitHttpClient({ browser: Browser.Firefox });

// Impersonate Chrome
const chromeClient = new ImpitHttpClient({ browser: Browser.Chrome });

Advanced configuration

import { CheerioCrawler } from 'crawlee';
import { ImpitHttpClient, Browser } from '@crawlee/impit-client';

const crawler = new CheerioCrawler({
    httpClient: new ImpitHttpClient({
        // Impersonate Chrome browser
        browser: Browser.Chrome,
        // Enable HTTP/3 protocol
        http3: true,
    }),
    async requestHandler({ $ }) {
        console.log(`Title: ${$('title').text()}`);
    },
});

await crawler.run(['https://example.com']);

Proxy support

Proxies are configured per request through Crawlee's proxy management system, not on the ImpitHttpClient itself. Use ProxyConfiguration as you normally would:

import { CheerioCrawler, ProxyConfiguration } from 'crawlee';
import { ImpitHttpClient, Browser } from '@crawlee/impit-client';

const proxyConfiguration = new ProxyConfiguration({
    proxyUrls: ['http://proxy1.example.com:8080', 'http://proxy2.example.com:8080'],
});

const crawler = new CheerioCrawler({
    httpClient: new ImpitHttpClient({ browser: Browser.Chrome }),
    proxyConfiguration,
    async requestHandler({ $, request }) {
        console.log(`Scraped ${request.url}`);
    },
});

await crawler.run(['https://example.com']);

How it works

Impit achieves browser impersonation at two levels:

  1. HTTP level: Mimics browser-specific header ordering, HTTP/2 settings, and pseudo-header sequences that anti-bot services analyze.

  2. TLS level: Uses a patched version of rustls to replicate the exact TLS ClientHello message that browsers send, including cipher suites and extensions.

This dual-layer approach makes requests appear to come from a real browser, significantly reducing blocks from bot detection systems.
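
To check that the impersonation is actually applied, you can point a crawler at a service that echoes back the TLS and HTTP fingerprint it observed. The sketch below assumes such an echo endpoint (https://tls.peet.ws/api/all is one public example; any similar service works) and reuses only the sendRequest API from the basic usage example:

import { BasicCrawler } from 'crawlee';
import { ImpitHttpClient, Browser } from '@crawlee/impit-client';

const crawler = new BasicCrawler({
    httpClient: new ImpitHttpClient({ browser: Browser.Firefox }),
    async requestHandler({ sendRequest, log }) {
        // The echo service responds with the TLS ClientHello and HTTP details it
        // received; with impersonation enabled these should match Firefox rather
        // than a stock Node.js client.
        const response = await sendRequest();
        log.info('Observed fingerprint', { body: response.body });
    },
});

await crawler.run(['https://tls.peet.ws/api/all']);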

Comparison with other solutions

Feature                   got-scraping   curl-impersonate     Impit
TLS fingerprinting        No             Yes                  Yes
HTTP/3 support            No             Yes                  Yes
Native Node.js package    Yes            No (child process)   Yes
Windows/macOS ARM         Yes            No                   Yes
Package size              ~10 MB         ~20 MB               ~8 MB

Related links