Changelog
All notable changes to this project will be documented in this file. See Conventional Commits for commit guidelines.
3.11.5 (2024-10-04)
Bug Fixes
- check .isFinished() before RequestList reads (#2695) (6fa170f)
- core: trigger errorHandler for session errors (#2683) (7d72bcb), closes #2678
3.11.4 (2024-09-23)
Note: Version bump only for package @crawlee/basic
3.11.3 (2024-09-03)
Note: Version bump only for package @crawlee/basic
3.11.2 (2024-08-28)
Bug Fixes
3.11.1 (2024-07-24)
Note: Version bump only for package @crawlee/basic
3.11.0 (2024-07-09)
Features
3.10.5 (2024-06-12)
Bug Fixes
3.10.4 (2024-06-11)
Bug Fixes
- add missing useState implementation into crawling context (eec4a71)
- make crawler.log publicly accessible (#2526) (3e9e665)
- respect crawler.log when creating child logger for Statistics (0a0d75d), closes #2412
3.10.3 (2024-06-07)
Features
- log desired concurrency in the default status message (9f0b796)
3.10.2 (2024-06-03)
Note: Version bump only for package @crawlee/basic
3.10.1 (2024-05-23)
Note: Version bump only for package @crawlee/basic
3.10.0 (2024-05-16)
Bug Fixes
- EnqueueStrategy.All erroring with links using unsupported protocols (#2389) (8db3908)
- do not drop statistics on migration/resurrection/resume (#2462) (8ce7dd4)
Features
- implement ErrorSnapshotter for error context capture (#2332) (e861dfd), closes #2280
- make RequestQueue v2 the default queue, see more on Apify blog (#2390) (41ae8ab), closes #2388
3.9.2 (2024-04-17)
Bug Fixes
3.9.1 (2024-04-11)
Note: Version bump only for package @crawlee/basic
3.9.0 (2024-04-10)
Bug Fixes
Features
3.8.2 (2024-03-21)
Note: Version bump only for package @crawlee/basic
3.8.1 (2024-02-22)
Note: Version bump only for package @crawlee/basic
3.8.0 (2024-02-21)
Bug Fixes
- declare missing dependencies on csv-stringify and fs-extra (#2326) (718959d), closes /github.com/redabacha/crawlee/blob/2f05ed22b203f688095300400bb0e6d03a03283c/.eslintrc.json#L50
Features
- accessing crawler state, key-value store and named datasets via crawling context (#2283) (58dd5fc)
- adaptive playwright crawler (#2316) (8e4218a)
3.7.3 (2024-01-30)
Note: Version bump only for package @crawlee/basic
3.7.2 (2024-01-09)
Note: Version bump only for package @crawlee/basic
3.7.1 (2024-01-02)
Note: Version bump only for package @crawlee/basic
3.7.0 (2023-12-21)
Features
- allow configuring crawler statistics (#2213) (9fd60e4), closes #1789
- check enqueue link strategy post redirect (#2238) (3c5f9d6), closes #2173
- log cause with retryOnBlocked (#2252) (e19a773), closes #2249
3.6.2 (2023-11-26)
Note: Version bump only for package @crawlee/basic
3.6.1 (2023-11-15)
Note: Version bump only for package @crawlee/basic
3.6.0 (2023-11-15)
Features
3.5.8 (2023-10-17)
Note: Version bump only for package @crawlee/basic
3.5.7 (2023-10-05)
Bug Fixes
- add warning when we detect use of RL and RQ, but RQ is not provided explicitly (#2115) (6fb1c55), closes #1773
- ensure the status message cannot get the crawler stuck (#2114) (9034f08)
- RQ request count is consistent after migration (#2116) (9ab8c18), closes #1855
3.5.6 (2023-10-04)
Note: Version bump only for package @crawlee/basic
3.5.5 (2023-10-02)
Bug Fixes
Features
3.5.4 (2023-09-11)
Features
3.5.3 (2023-08-31)
Bug Fixes
- browser-pool: improve error handling when browser is not found (#2050) (282527f), closes #1459
- clean up inProgress cache when delaying requests via sameDomainDelaySecs (#2045) (f63ccc0)
- pin all internal dependencies (#2041) (d6f2b17), closes #2040
- respect current config when creating implicit RequestQueue instance (845141d), closes #2043
Features
3.5.2 (2023-08-21)
Note: Version bump only for package @crawlee/basic
3.5.1 (2023-08-16)
Features
3.5.0 (2023-07-31)
Features
- add support for sameDomainDelay (#2003) (e796883), closes #1993
- basic-crawler: allow configuring the automatic status message (#2001) (3eb4e4c)
- retire session on proxy error (#2002) (8c0928b), closes #1912
3.4.2 (2023-07-19)
Bug Fixes
- basic-crawler: limit internalTimeoutMillis in addition to requestHandlerTimeoutMillis (#1981) (8122622), closes #1766
Features
- core: add RequestQueue.addRequestsBatched() that is non-blocking (#1996) (c85485d), closes #1995
- retryOnBlocked detects blocked webpage (#1956) (766fa9b)
3.4.1 (2023-07-13)
Note: Version bump only for package @crawlee/basic
3.4.0 (2023-06-12)
Note: Version bump only for package @crawlee/basic
3.3.3 (2023-05-31)
Bug Fixes
Features
3.3.2 (2023-05-11)
Bug Fixes
Features
- allow running single crawler instance multiple times (#1844) (9e6eb1e), closes #765
- router: allow inline router definition (#1877) (2d241c9)
3.3.1 (2023-04-11)
Bug Fixes
- start status message logger after the crawl actually starts (5d1df7a)
- status message - total requests (#1842) (710f734)
3.3.0 (2023-03-09)
Features
- add basic support for setStatusMessage (#1790) (c318980)
- move the status message implementation to Crawlee, noop in storage (#1808) (99c3fdc)
3.2.2 (2023-02-08)
Note: Version bump only for package @crawlee/basic
3.2.1 (2023-02-07)
Note: Version bump only for package @crawlee/basic
3.2.0 (2023-02-07)
Bug Fixes
3.1.4 (2022-12-14)
Bug Fixes
- session.markBad() on requestHandler error (#1709) (e87eb1f), closes #1635 /github.com/apify/crawlee/blob/5ff04faa85c3a6b6f02cd58a91b46b80610d8ae6/packages/browser-crawler/src/internals/browser-crawler.ts#L524
3.1.3 (2022-12-07)
Bug Fixes
Features
3.1.2 (2022-11-15)
Note: Version bump only for package @crawlee/basic
3.1.1 (2022-11-07)
Note: Version bump only for package @crawlee/basic
3.1.0 (2022-10-13)
Note: Version bump only for package @crawlee/basic
3.0.4 (2022-08-22)
Note: Version bump only for package @crawlee/basic