SqlStorageClient
Hierarchy
- StorageClient
- SqlStorageClient
Methods
__aenter__
Async context manager entry.
Returns SqlStorageClient
__aexit__
Async context manager exit.
Parameters
exc_type: type[BaseException] | None
exc_value: BaseException | None
exc_traceback: TracebackType | None
Returns None
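As a rough usage sketch, the two context-manager methods above allow the client to be used in an `async with` block. The import path `crawlee.storage_clients` and the assumption that leaving the block releases the connection pool are not stated on this page, so treat them as assumptions to verify against your installed version.

```python
import asyncio


async def main() -> None:
    # Imported inside the function so this module still loads when the
    # optional SQL dependencies of crawlee are not installed.
    # Assumption: the class is exported from crawlee.storage_clients.
    from crawlee.storage_clients import SqlStorageClient

    # Assumption: leaving the block releases the connection pool, so no
    # explicit close() call is made here.
    async with SqlStorageClient(
        connection_string="sqlite+aiosqlite:///crawlee.db",
    ) as storage_client:
        # __aenter__ is documented to return the SqlStorageClient itself.
        print(type(storage_client).__name__)


if __name__ == "__main__":
    asyncio.run(main())
```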
__init__
Initialize the SQL storage client.
Parameters
connection_string: str | None = None (optional, keyword-only)
Database connection string (e.g., "sqlite+aiosqlite:///crawlee.db"). If not provided, defaults to a SQLite database in the storage directory.
engine: AsyncEngine | None = None (optional, keyword-only)
Pre-configured AsyncEngine instance. If provided, connection_string is ignored.
Returns None
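The precedence between the two parameters can be sketched in plain Python. The helper `resolve_database_target`, its `engine_url` and `storage_dir` parameters, and the `./storage` default are hypothetical and only illustrate the documented rule: an engine wins over a connection string, and a SQLite file in the storage directory is the fallback.

```python
from __future__ import annotations

from pathlib import Path


def resolve_database_target(
    *,
    connection_string: str | None = None,
    engine_url: str | None = None,
    storage_dir: str = "./storage",
) -> str:
    """Hypothetical helper mirroring the documented precedence."""
    # A pre-configured engine takes priority; connection_string is ignored.
    if engine_url is not None:
        return engine_url
    # Otherwise an explicit connection string is used as given.
    if connection_string is not None:
        return connection_string
    # Fallback: a SQLite database file named crawlee.db in the storage directory.
    db_path = Path(storage_dir) / "crawlee.db"
    return f"sqlite+aiosqlite:///{db_path.as_posix()}"
```

This is not the library's actual implementation; it only makes the order of the three cases explicit.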
close
Close the database connection pool.
Returns None
create_dataset_client
Create a dataset client.
Parameters
id: str | None = None (optional, keyword-only)
name: str | None = None (optional, keyword-only)
alias: str | None = None (optional, keyword-only)
configuration: Configuration | None = None (optional, keyword-only)
Returns DatasetClient
create_kvs_client
Create a key-value store client.
Parameters
id: str | None = None (optional, keyword-only)
name: str | None = None (optional, keyword-only)
alias: str | None = None (optional, keyword-only)
configuration: Configuration | None = None (optional, keyword-only)
Returns KeyValueStoreClient
create_rq_client
Create a request queue client.
Parameters
id: str | None = None (optional, keyword-only)
name: str | None = None (optional, keyword-only)
alias: str | None = None (optional, keyword-only)
configuration: Configuration | None = None (optional, keyword-only)
Returns RequestQueueClient
create_session
Create a new database session.
Returns AsyncSession
get_accessed_modified_update_interval
Get the interval for accessed and modified updates.
Returns timedelta
get_dialect_name
Get the database dialect name.
Returns str | None
get_rate_limit_errors
Return statistics about rate limit errors encountered by the HTTP client in the storage client.
Returns dict[int, int]
get_storage_client_cache_key
Return a cache key that can differentiate between different storages of this and other clients.
The key can be based on the configuration or on the client itself. By default, it is the module and name of the client class.
Parameters
configuration: Configuration
Returns Hashable
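A minimal sketch of the default behavior described above, assuming the key is simply the dotted module path plus the class name. Both `default_cache_key` and `ExampleStorageClient` are illustrative stand-ins, not part of the library.

```python
from collections.abc import Hashable


class ExampleStorageClient:
    """Stand-in class used only for this illustration."""


def default_cache_key(client: object) -> Hashable:
    # Build a key from the module and name of the client's class, so all
    # instances of one class share a key and different classes do not.
    cls = type(client)
    return f"{cls.__module__}.{cls.__name__}"


key = default_cache_key(ExampleStorageClient())
```

A client whose storages depend on its configuration (for example, on the database it points at) would need a key that also reflects that configuration.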
initialize
Initialize the database schema.
This method creates all necessary tables if they do not already exist. It should be called before using the storage client.
Parameters
configuration: Configuration
Returns None
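The "create if missing" behavior can be illustrated with the standard library's `sqlite3` module. The table and column names below are hypothetical; they only mirror the documented layout of one metadata table and one records table per storage type, and the real client builds its schema through SQLAlchemy.

```python
import sqlite3


def initialize_schema(conn: sqlite3.Connection) -> None:
    """Create the illustrative tables only if they are missing."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS dataset_metadata (
            id TEXT PRIMARY KEY,
            name TEXT
        );
        CREATE TABLE IF NOT EXISTS dataset_records (
            item_id INTEGER PRIMARY KEY,
            dataset_id TEXT NOT NULL REFERENCES dataset_metadata(id),
            data TEXT NOT NULL
        );
        """
    )


conn = sqlite3.connect(":memory:")
initialize_schema(conn)
initialize_schema(conn)  # Safe to repeat: existing tables are left untouched.

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
conn.close()
```

Because every statement uses `IF NOT EXISTS`, running the initialization twice is harmless, which is the property that makes it safe to call before each use.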
Properties
engine
Get the SQLAlchemy AsyncEngine instance.
SqlStorageClient is the SQL implementation of the storage client.
This storage client provides access to datasets, key-value stores, and request queues that persist data to a SQL database using SQLAlchemy 2+. Each storage type uses two tables: one for metadata and one for records.
The client accepts either a database connection string or a pre-configured AsyncEngine. If neither is provided, it creates a default SQLite database named crawlee.db in the storage directory.
The database schema is created automatically during initialization. SQLite databases receive performance optimizations, including WAL mode and an increased cache size.
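For readers unfamiliar with the SQLite optimizations mentioned above, the following sketch applies them with the standard library's `sqlite3` module. The helper `open_tuned_sqlite` and the cache size of 64 MiB are assumptions chosen for illustration; the values the client actually uses are not stated here.

```python
from __future__ import annotations

import sqlite3
import tempfile
from pathlib import Path


def open_tuned_sqlite(path: str) -> tuple[sqlite3.Connection, str]:
    """Open a SQLite file and apply WAL mode plus a larger page cache."""
    conn = sqlite3.connect(path)
    # WAL (write-ahead logging) lets readers proceed while a write is in progress.
    journal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # A negative cache_size is interpreted in KiB; this value is illustrative.
    conn.execute("PRAGMA cache_size=-65536")
    return conn, journal_mode


with tempfile.TemporaryDirectory() as tmp:
    conn, mode = open_tuned_sqlite(str(Path(tmp) / "crawlee.db"))
    cache_size = conn.execute("PRAGMA cache_size").fetchone()[0]
    conn.close()
```

WAL mode applies only to file-backed databases, which is why the sketch uses a temporary file rather than an in-memory database.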