Changelog

All notable changes to this project will be documented in this file. See Conventional Commits for commit guidelines.

3.11.0 (2024-07-09)

Features

  • Sitemap-based request list implementation (#2498) (7bf8f0b)

3.10.5 (2024-06-12)

Bug Fixes

  • mark context.request.loadedUrl and id as required inside the request handler (#2531) (2b54660)

3.10.4 (2024-06-11)

Bug Fixes

  • add missing useState implementation into crawling context (eec4a71)
  • make crawler.log publicly accessible (#2526) (3e9e665)
  • respect crawler.log when creating child logger for Statistics (0a0d75d), closes #2412

3.10.3 (2024-06-07)

Features

  • log desired concurrency in the default status message (9f0b796)

3.10.2 (2024-06-03)

Note: Version bump only for package @crawlee/basic

3.10.1 (2024-05-23)

Note: Version bump only for package @crawlee/basic

3.10.0 (2024-05-16)

Bug Fixes

  • EnqueueStrategy.All erroring with links using unsupported protocols (#2389) (8db3908)
  • do not drop statistics on migration/resurrection/resume (#2462) (8ce7dd4)

Features

3.9.2 (2024-04-17)

Bug Fixes

3.9.1 (2024-04-11)

Note: Version bump only for package @crawlee/basic

3.9.0 (2024-04-10)

Bug Fixes

  • notify autoscaled pool about newly added requests (#2400) (a90177d)

Features

3.8.2 (2024-03-21)

Note: Version bump only for package @crawlee/basic

3.8.1 (2024-02-22)

Note: Version bump only for package @crawlee/basic

3.8.0 (2024-02-21)

Bug Fixes

Features

  • accessing crawler state, key-value store and named datasets via crawling context (#2283) (58dd5fc)
  • adaptive playwright crawler (#2316) (8e4218a)

3.7.3 (2024-01-30)

Note: Version bump only for package @crawlee/basic

3.7.2 (2024-01-09)

Note: Version bump only for package @crawlee/basic

3.7.1 (2024-01-02)

Note: Version bump only for package @crawlee/basic

3.7.0 (2023-12-21)

Features

3.6.2 (2023-11-26)

Note: Version bump only for package @crawlee/basic

3.6.1 (2023-11-15)

Note: Version bump only for package @crawlee/basic

3.6.0 (2023-11-15)

Features

3.5.8 (2023-10-17)

Note: Version bump only for package @crawlee/basic

3.5.7 (2023-10-05)

Bug Fixes

  • add warning when we detect use of RequestList (RL) and RequestQueue (RQ), but RQ is not provided explicitly (#2115) (6fb1c55), closes #1773
  • ensure the status message cannot get the crawler stuck (#2114) (9034f08)
  • RequestQueue (RQ) request count is consistent after migration (#2116) (9ab8c18), closes #1855

3.5.6 (2023-10-04)

Note: Version bump only for package @crawlee/basic

3.5.5 (2023-10-02)

Bug Fixes

Features

3.5.4 (2023-09-11)

Features

  • remove side effect from the deprecated error context augmentation (#2069) (f9fb5c4)

3.5.3 (2023-08-31)

Bug Fixes

  • browser-pool: improve error handling when browser is not found (#2050) (282527f), closes #1459
  • clean up inProgress cache when delaying requests via sameDomainDelaySecs (#2045) (f63ccc0)
  • pin all internal dependencies (#2041) (d6f2b17), closes #2040
  • respect current config when creating implicit RequestQueue instance (845141d), closes #2043

Features

  • core: add default dataset helpers to BasicCrawler (#2057) (e2a7544)

3.5.2 (2023-08-21)

Note: Version bump only for package @crawlee/basic

3.5.1 (2023-08-16)

Features

  • exceeding maxSessionRotations calls failedRequestHandler (#2029) (b1cb108), closes #2028

3.5.0 (2023-07-31)

Features

3.4.2 (2023-07-19)

Bug Fixes

  • basic-crawler: limit internalTimeoutMillis in addition to requestHandlerTimeoutMillis (#1981) (8122622), closes #1766

Features

  • core: add RequestQueue.addRequestsBatched() that is non-blocking (#1996) (c85485d), closes #1995
  • retryOnBlocked detects blocked webpage (#1956) (766fa9b)

3.4.1 (2023-07-13)

Note: Version bump only for package @crawlee/basic

3.4.0 (2023-06-12)

Note: Version bump only for package @crawlee/basic

3.3.3 (2023-05-31)

Bug Fixes

  • set status message every 5 seconds and log it via debug level (#1918) (32aede6)

Features

  • core: add Request.maxRetries to allow overriding the maxRequestRetries (#1925) (c5592db)

3.3.2 (2023-05-11)

Bug Fixes

  • respect config object when creating SessionPool (#1881) (db069df)

Features

  • allow running single crawler instance multiple times (#1844) (9e6eb1e), closes #765
  • router: allow inline router definition (#1877) (2d241c9)

3.3.1 (2023-04-11)

Bug Fixes

  • start status message logger after the crawl actually starts (5d1df7a)
  • status message - total requests (#1842) (710f734)

3.3.0 (2023-03-09)

Features

  • add basic support for setStatusMessage (#1790) (c318980)
  • move the status message implementation to Crawlee, noop in storage (#1808) (99c3fdc)

3.2.2 (2023-02-08)

Note: Version bump only for package @crawlee/basic

3.2.1 (2023-02-07)

Note: Version bump only for package @crawlee/basic

3.2.0 (2023-02-07)

Bug Fixes

  • declare missing dependency on tslib (27e96c8), closes #1747

3.1.4 (2022-12-14)

Bug Fixes

3.1.3 (2022-12-07)

Bug Fixes

Features

  • always show error origin if inside the userland (#1677) (bbe9045)

3.1.2 (2022-11-15)

Note: Version bump only for package @crawlee/basic

3.1.1 (2022-11-07)

Note: Version bump only for package @crawlee/basic

3.1.0 (2022-10-13)

Note: Version bump only for package @crawlee/basic

3.0.4 (2022-08-22)

Note: Version bump only for package @crawlee/basic