Skipping navigations for certain requests
While crawling a website, you may come across resources you'd like to save that don't need the full power of a crawler to fetch, such as images delivered through a CDN.
By combining the Request#skipNavigation option with sendRequest, we can fetch the image from the CDN and save it to our key-value store without navigating to it in the browser.
info
This example uses the PlaywrightCrawler, but the same approach is available on all the crawlers we provide.
import { PlaywrightCrawler, KeyValueStore } from 'crawlee';
// Create a key-value store for all the images we find
const imageStore = await KeyValueStore.open('images');
const crawler = new PlaywrightCrawler({
    async requestHandler({ request, page, sendRequest }) {
        // Image requests are enqueued with skipNavigation set, so no page is loaded for them
        if (request.skipNavigation) {
            // Request the image and get its buffer back
            const imageResponse = await sendRequest({ responseType: 'buffer' });
            // Save the image in the key-value store
            await imageStore.setValue(`${request.userData.key}.png`, imageResponse.body);
            // Prevent executing the rest of the code as we do not need it
            return;
        }
        // Get all the image sources in the current page
        const images = await page.$$eval('img', (imgs) => imgs.map((img) => img.src));
        // Add all the urls as requests for the crawler, giving each image a key
        await crawler.addRequests(images.map((url, i) => ({ url, skipNavigation: true, userData: { key: i } })));
    },
});
await crawler.addRequests(['https://crawlee.dev']);
// Run the crawler
await crawler.run();
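
The example above enqueues every `img.src` it finds and stores each result with a `.png` suffix. As a minimal sketch of how the request list could be prepared more defensively, the helper below drops empty sources, inline `data:` URIs, and duplicates, and guesses a file extension from the URL path. The function name `toImageRequests` and the `ext` field in `userData` are hypothetical and not part of Crawlee; only `url`, `skipNavigation`, and `userData` come from the example above.

```javascript
// Hypothetical helper (not part of Crawlee): builds request options for
// image URLs so that the crawler skips navigation for each of them.
function toImageRequests(urls) {
    const seen = new Set();
    const requests = [];

    for (const url of urls) {
        // Skip empty sources, non-HTTP sources such as data: URIs, and duplicates
        if (!url || !/^https?:\/\//i.test(url) || seen.has(url)) continue;

        let pathname;
        try {
            pathname = new URL(url).pathname;
        } catch {
            // Malformed URL: ignore it rather than enqueue a failing request
            continue;
        }
        seen.add(url);

        // Guess the extension from the path; fall back to "png" as in the example above
        const match = pathname.match(/\.([a-z0-9]+)$/i);
        const ext = match ? match[1].toLowerCase() : 'png';

        requests.push({
            url,
            skipNavigation: true,
            // `key` mirrors the example above; `ext` is an assumed extra field
            userData: { key: requests.length, ext },
        });
    }

    return requests;
}
```

With this helper, the last line of the request handler could become `await crawler.addRequests(toImageRequests(images));`, and the store key could be built as `${request.userData.key}.${request.userData.ext}`. The extension is only a guess based on the URL, so it may not match the actual image format.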