Session

crawlee.sessions._session.Session

A Session object represents a single user session.

Sessions store information such as cookies and can be used to generate fingerprints and proxy sessions. You can think of each session as a specific user with its own cookies, IP address (via proxy), and potentially a unique browser fingerprint. A session's internal state can be enriched with custom user data, for example authorization tokens or specific headers.

Errors

error_score

error_score: float

Get the current error score.

Constructors

__init__

  • __init__(*, id, max_age, user_data, max_error_score, error_score_decrement, created_at, usage_count, max_usage_count, error_score, cookies, blocked_status_codes): None
  • Create a new instance.


    Parameters

    • id: str | None = None (keyword-only)
    • max_age: timedelta = timedelta(minutes=50) (keyword-only)
    • user_data: dict | None = None (keyword-only)
    • max_error_score: float = 3.0 (keyword-only)
    • error_score_decrement: float = 0.5 (keyword-only)
    • created_at: datetime | None = None (keyword-only)
    • usage_count: int = 0 (keyword-only)
    • max_usage_count: int = 50 (keyword-only)
    • error_score: float = 0.0 (keyword-only)
    • cookies: dict | None = None (keyword-only)
    • blocked_status_codes: list | None = None (keyword-only)

    Returns None

Methods

__eq__

  • __eq__(other): bool
  • Compare two sessions for equality.


    Parameters

    • other: object

    Returns bool

__repr__

  • __repr__(): str
  • Get a string representation.


    Returns str

from_model

  • from_model(model): Session
  • Create a new instance from a SessionModel.


    Parameters

    • model: SessionModel

    Returns Session

get_state

  • get_state(*, as_dict): dict
  • Parameters

    • as_dict: Literal[True] (keyword-only)

    Returns dict

get_state

  • get_state(*, as_dict): SessionModel
  • Parameters

    • as_dict: Literal[False] (keyword-only)

    Returns SessionModel

get_state

  • get_state(*, as_dict): SessionModel | dict
  • Retrieve the current state of the session either as a model or as a dictionary.


    Parameters

    • as_dict: bool = False (keyword-only)

    Returns SessionModel | dict
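
The three get_state entries above are typing overloads of a single method: the return type narrows according to the literal value passed for as_dict. The following is a minimal sketch of that pattern, not the library's code. StateModelSketch and StateHolderSketch are hypothetical stand-ins (the real SessionModel has more fields), and the conversion via dataclasses.asdict is an assumption made for illustration.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from typing import Literal, overload


@dataclass
class StateModelSketch:
    """Hypothetical stand-in for SessionModel with only two fields."""

    id: str
    usage_count: int


class StateHolderSketch:
    """Hypothetical holder exposing a get_state method with the same overloads."""

    def __init__(self, *, id: str, usage_count: int = 0) -> None:
        self._model = StateModelSketch(id=id, usage_count=usage_count)

    @overload
    def get_state(self, *, as_dict: Literal[True]) -> dict: ...

    @overload
    def get_state(self, *, as_dict: Literal[False]) -> StateModelSketch: ...

    @overload
    def get_state(self, *, as_dict: bool = False) -> StateModelSketch | dict: ...

    def get_state(self, *, as_dict: bool = False) -> StateModelSketch | dict:
        # A single runtime implementation backs all three type signatures.
        return asdict(self._model) if as_dict else self._model
```

With this shape, a type checker infers dict for get_state(as_dict=True) and the model type for get_state(as_dict=False), while a plain bool argument falls back to the union.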

is_blocked_status_code

  • is_blocked_status_code(*, status_code, additional_blocked_status_codes): bool
  • Evaluate whether a session should be retired based on the received HTTP status code.


    Parameters

    • status_code: int (keyword-only)
    • additional_blocked_status_codes: list[int] | None = None (keyword-only)

    Returns bool
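
As a rough illustration of the check described above, the sketch below tests a status code against a base list plus an optional per-call extension. It is a standalone function, not the library's method. The name is_blocked_status_code_sketch and the default list [401, 403, 429] are assumptions, since this page does not state the library's defaults.

```python
from __future__ import annotations

# Assumption: illustrative defaults only; the real default list is not given on this page.
DEFAULT_BLOCKED_STATUS_CODES = [401, 403, 429]


def is_blocked_status_code_sketch(
    *,
    status_code: int,
    blocked_status_codes: list[int] | None = None,
    additional_blocked_status_codes: list[int] | None = None,
) -> bool:
    """Return True if status_code is in the base list or in the per-call extras."""
    base = DEFAULT_BLOCKED_STATUS_CODES if blocked_status_codes is None else blocked_status_codes
    extra = additional_blocked_status_codes or []
    return status_code in set(base) | set(extra)
```

For example, is_blocked_status_code_sketch(status_code=418, additional_blocked_status_codes=[418]) treats 418 as blocking for that call only, without changing the base list.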

mark_bad

  • mark_bad(): None
  • Mark the session as bad after an unsuccessful session usage.


    Returns None

mark_good

  • mark_good(): None
  • Mark the session as good. Should be called after a successful session usage.


    Returns None

retire

  • retire(): None
  • Retire the session by setting the error score to the maximum value.

    Use this method when the session usage was unsuccessful and you are sure the cause is the session itself rather than an external factor, for example when the server returns a 403 status code. If the session fails because of external factors, such as a 5XX server error, you probably want to use the mark_bad method instead.


    Returns None
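
To make the difference between mark_good, mark_bad, and retire concrete, here is a minimal model of the error-score bookkeeping. It does not import the library. ErrorScoreSketch and handle_status are hypothetical names, and the exact arithmetic (one point per failure, a floor at zero, blocking once the maximum is reached) is an assumption. Only the defaults 3.0 and 0.5 and the "retire sets the score to the maximum" behavior come from this page.

```python
from dataclasses import dataclass


@dataclass
class ErrorScoreSketch:
    """Hypothetical stand-in for the error-score part of a session."""

    max_error_score: float = 3.0        # documented constructor default
    error_score_decrement: float = 0.5  # documented constructor default
    error_score: float = 0.0

    def mark_good(self) -> None:
        # Assumption: a success lowers the score by the decrement, never below zero.
        self.error_score = max(0.0, self.error_score - self.error_score_decrement)

    def mark_bad(self) -> None:
        # Assumption: each failure adds exactly one point.
        self.error_score += 1.0

    def retire(self) -> None:
        # Documented: retiring sets the error score to the maximum value.
        self.error_score = max(self.error_score, self.max_error_score)

    @property
    def is_blocked(self) -> bool:
        # Assumption: the session counts as blocked once the score reaches the maximum.
        return self.error_score >= self.max_error_score


def handle_status(session: ErrorScoreSketch, status: int) -> None:
    """Apply the guidance above: 403 retires, 5XX only marks bad, anything else is a success."""
    if status == 403:
        session.retire()
    elif 500 <= status < 600:
        session.mark_bad()
    else:
        session.mark_good()
```

Under these assumptions a single 5XX response costs the session one point that later successes can earn back, whereas a 403 takes it out of rotation immediately.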

Properties

cookies

cookies: dict

Get the cookies.

expires_at

expires_at: datetime

Get the expiration datetime of the session.

id

id: str

Get the session ID.

is_blocked

is_blocked: bool

Indicate whether the session is blocked based on the error score.

is_expired

is_expired: bool

Indicate whether the session is expired based on the current time.

is_max_usage_count_reached

is_max_usage_count_reached: bool

Indicate whether the session has reached its maximum usage limit.

is_usable

is_usable: bool

Determine if the session is usable for next requests.
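
The sketch below shows one plausible way the status properties could fit together. It is a model built on assumptions, not the library's implementation: UsabilitySketch is a hypothetical class, and it assumes that expires_at equals created_at plus max_age, that the comparisons are inclusive, and that a session is usable only when it is neither blocked, expired, nor at its usage limit. The default values are the documented constructor defaults.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class UsabilitySketch:
    """Hypothetical stand-in combining the three status checks."""

    max_age: timedelta = timedelta(minutes=50)  # documented default
    max_usage_count: int = 50                   # documented default
    max_error_score: float = 3.0                # documented default
    usage_count: int = 0
    error_score: float = 0.0
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    @property
    def expires_at(self) -> datetime:
        # Assumption: expiration is creation time plus the maximum age.
        return self.created_at + self.max_age

    @property
    def is_expired(self) -> bool:
        return datetime.now(timezone.utc) >= self.expires_at

    @property
    def is_blocked(self) -> bool:
        return self.error_score >= self.max_error_score

    @property
    def is_max_usage_count_reached(self) -> bool:
        return self.usage_count >= self.max_usage_count

    @property
    def is_usable(self) -> bool:
        # Assumption: any one of the three conditions makes the session unusable.
        return not (self.is_blocked or self.is_expired or self.is_max_usage_count_reached)
```

In this model a freshly created session is usable, while one created an hour ago with the default max_age, or one with usage_count equal to 50, is not.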

usage_count

usage_count: int

Get the current usage count.

user_data

user_data: dict

Get the user data.