Snapshotter

crawlee.autoscaling.snapshotter.Snapshotter

Monitors and logs system resource usage at predefined intervals for performance optimization.

The class records the state of several system resources (CPU, memory, event loop, and client API) at predefined intervals. The recorded history makes resource overloads detectable, so the application can stay within its limits. AutoscaledPool uses it to adjust task allocation dynamically based on current demand and system load.

Constructors

__init__

  • __init__(event_manager, *, event_loop_snapshot_interval, client_snapshot_interval, max_used_cpu_ratio, max_memory_size, max_used_memory_ratio, max_event_loop_delay, max_client_errors, snapshot_history, reserve_memory_ratio, memory_warning_cooldown_period, client_rate_limit_error_retry_count): None
  • Creates a new instance.


    Parameters

    • event_manager: EventManager
    • event_loop_snapshot_interval: timedelta = timedelta(milliseconds=500) (keyword-only)
    • client_snapshot_interval: timedelta = timedelta(milliseconds=1000) (keyword-only)
    • max_used_cpu_ratio: float = 0.95 (keyword-only)
    • max_memory_size: ByteSize | None = None (keyword-only)
    • max_used_memory_ratio: float = 0.7 (keyword-only)
    • max_event_loop_delay: timedelta = timedelta(milliseconds=50) (keyword-only)
    • max_client_errors: int = 1 (keyword-only)
    • snapshot_history: timedelta = timedelta(seconds=30) (keyword-only)
    • reserve_memory_ratio: float = 0.5 (keyword-only)
    • memory_warning_cooldown_period: timedelta = timedelta(milliseconds=10000) (keyword-only)
    • client_rate_limit_error_retry_count: int = 2 (keyword-only)

    Returns None
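
The `max_*` parameters above read most naturally as overload thresholds. The sketch below shows one way such thresholds could be applied to a single set of measurements. It uses only the standard library; `ReadingSketch` and `overloaded_resources` are hypothetical names, and the strict `>` comparisons are an assumption, not something this reference states.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import timedelta

# Default values copied from the parameter list above.
MAX_USED_CPU_RATIO = 0.95
MAX_USED_MEMORY_RATIO = 0.7
MAX_EVENT_LOOP_DELAY = timedelta(milliseconds=50)
MAX_CLIENT_ERRORS = 1


@dataclass(frozen=True)
class ReadingSketch:
    """Hypothetical raw measurements taken at one moment (not a crawlee type)."""

    cpu_used_ratio: float
    memory_used_bytes: int
    memory_limit_bytes: int
    event_loop_delay: timedelta
    client_errors: int


def overloaded_resources(reading: ReadingSketch) -> list[str]:
    """Return the names of resources whose reading exceeds its threshold."""
    overloaded: list[str] = []
    if reading.cpu_used_ratio > MAX_USED_CPU_RATIO:
        overloaded.append("cpu")
    # Memory is compared as a ratio of the configured limit (assumed semantics).
    if reading.memory_used_bytes / reading.memory_limit_bytes > MAX_USED_MEMORY_RATIO:
        overloaded.append("memory")
    if reading.event_loop_delay > MAX_EVENT_LOOP_DELAY:
        overloaded.append("event_loop")
    if reading.client_errors > MAX_CLIENT_ERRORS:
        overloaded.append("client")
    return overloaded
```

For example, a reading with 80% memory use and an 80 ms event loop delay would be flagged for memory and the event loop, while CPU at 50% and zero client errors would pass.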

Methods

__aenter__

  • async __aenter__(): Snapshotter
  • Starts capturing snapshots at configured intervals.


    Returns Snapshotter

__aexit__

  • async __aexit__(exc_type, exc_value, exc_traceback): None
  • Stops all resource capturing.

    This method stops capturing snapshots of system resources (CPU, memory, event loop, and client information). It should be called to terminate resource capturing when it is no longer needed.


    Parameters

    • exc_type: type[BaseException] | None
    • exc_value: BaseException | None
    • exc_traceback: TracebackType | None

    Returns None
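
Because `__aenter__` starts capturing and `__aexit__` stops it, the class fits an `async with` block. The sketch below illustrates that lifecycle with a hypothetical stand-in, `PeriodicSamplerSketch`, that samples event loop delay in a background task. It is not the Snapshotter implementation; the sampling logic and the `interval` parameter are assumptions made for illustration.

```python
from __future__ import annotations

import asyncio
import time
from types import TracebackType


class PeriodicSamplerSketch:
    """Hypothetical stand-in showing the start/stop lifecycle only."""

    def __init__(self, interval: float = 0.01) -> None:
        self._interval = interval
        self._task: asyncio.Task[None] | None = None
        self.delays: list[float] = []

    async def _run(self) -> None:
        while True:
            started = time.monotonic()
            await asyncio.sleep(self._interval)
            # Time spent beyond the requested sleep approximates event loop delay.
            elapsed = time.monotonic() - started
            self.delays.append(max(0.0, elapsed - self._interval))

    async def __aenter__(self) -> PeriodicSamplerSketch:
        # Start capturing in the background.
        self._task = asyncio.create_task(self._run())
        return self

    async def __aexit__(
        self,
        exc_type: type[BaseException] | None,
        exc_value: BaseException | None,
        exc_traceback: TracebackType | None,
    ) -> None:
        # Stop capturing and wait for the background task to finish.
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
            self._task = None


async def demo() -> int:
    async with PeriodicSamplerSketch(interval=0.01) as sampler:
        await asyncio.sleep(0.1)  # the application's work would run here
    return len(sampler.delays)
```

Leaving the `async with` block stops sampling even when the body raises, which is the main reason to prefer the context manager over calling the start and stop methods by hand.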

get_client_sample

  • get_client_sample(duration): list[Snapshot]

    Parameters

    • duration: timedelta | None = None

    Returns list[Snapshot]

get_cpu_sample

  • get_cpu_sample(duration): list[Snapshot]

    Parameters

    • duration: timedelta | None = None

    Returns list[Snapshot]

get_event_loop_sample

  • get_event_loop_sample(duration): list[Snapshot]

    Parameters

    • duration: timedelta | None = None

    Returns list[Snapshot]

get_memory_sample

  • get_memory_sample(duration): list[Snapshot]

    Parameters

    • duration: timedelta | None = None

    Returns list[Snapshot]
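
The four `get_*_sample` methods share one signature, and this reference does not describe what `duration` does. A plausible reading is that `None` returns the whole stored history, while a `timedelta` limits the result to the most recent window, measured back from the newest snapshot. The sketch below implements that assumed behavior with hypothetical names (`SnapshotSketch`, `get_sample`, `overloaded_ratio`) and adds the kind of summary a caller might compute from a sample.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SnapshotSketch:
    """Hypothetical snapshot: a timestamp plus an overload flag."""

    created_at: datetime
    is_overloaded: bool


def get_sample(
    snapshots: list[SnapshotSketch],
    duration: timedelta | None = None,
) -> list[SnapshotSketch]:
    """Return snapshots within `duration` of the newest one (assumed semantics)."""
    if duration is None:
        return list(snapshots)
    if not snapshots:
        return []
    # Snapshots are assumed to be stored oldest first.
    latest = snapshots[-1].created_at
    return [s for s in snapshots if latest - s.created_at <= duration]


def overloaded_ratio(sample: list[SnapshotSketch]) -> float:
    """Fraction of snapshots in the sample that are flagged as overloaded."""
    if not sample:
        return 0.0
    return sum(s.is_overloaded for s in sample) / len(sample)
```

With five snapshots taken one second apart, a `duration` of two seconds would select the last three, and `overloaded_ratio` over that sample gives a single number a scaling decision could be based on.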