KeyValueStore
Hierarchy
- BaseStorage
- KeyValueStore
Methods
__init__
Parameters
id: str (optional, keyword-only)
name: str | None (optional, keyword-only)
storage_client: BaseStorageClient (optional, keyword-only)
Returns None
drop
Drop the storage, removing it from the underlying storage client and clearing the cache.
Returns None
get_auto_saved_value
Gets a value from KVS that will be automatically saved on changes.
Parameters
key: str (optional, keyword-only)
Key of the record in which the value is stored.
default_value: dict[str, JsonSerializable] | None = None (optional, keyword-only)
Value to be used if the record does not exist yet. Should be a dictionary.
Returns dict[str, JsonSerializable]
get_info
Get an object containing general information about the key value store.
Returns KeyValueStoreMetadata | None
get_public_url
Get the public URL for the given key.
Parameters
key: str (optional, keyword-only)
Key of the record for which URL is required.
Returns str
get_value
Parameters
key: str (optional, keyword-only)
Returns Any
iterate_keys
Iterate over the existing keys in the KVS.
Parameters
exclusive_start_key: str | None = None (optional, keyword-only)
Key to start the iteration from.
Returns AsyncIterator[KeyValueStoreKeyInfo]
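As a rough sketch of how iterate_keys might be consumed, the helper below gathers key names into a list. Only the method name and the exclusive_start_key parameter come from this reference. FakeStore and FakeKeyInfo are hypothetical stand-ins so the snippet runs without the library, and the `key` attribute, the sorted ordering, and the reading of "exclusive" as "skip the start key itself" are all assumptions.

```python
import asyncio
from typing import Any, AsyncIterator, List, Optional


class FakeKeyInfo:
    """Hypothetical stand-in for KeyValueStoreKeyInfo; only `key` is modeled."""

    def __init__(self, key: str) -> None:
        self.key = key


class FakeStore:
    """Hypothetical stand-in exposing iterate_keys with the documented parameter."""

    def __init__(self, keys: List[str]) -> None:
        self._keys = sorted(keys)  # assumption: keys are yielded in sorted order

    async def iterate_keys(
        self, *, exclusive_start_key: Optional[str] = None
    ) -> AsyncIterator[FakeKeyInfo]:
        for key in self._keys:
            # Assumption: "exclusive" means the start key itself is skipped.
            if exclusive_start_key is None or key > exclusive_start_key:
                yield FakeKeyInfo(key)


async def collect_keys(store: Any, start_after: Optional[str] = None) -> List[str]:
    """Drain the async iterator and return the key names."""
    return [
        info.key
        async for info in store.iterate_keys(exclusive_start_key=start_after)
    ]


if __name__ == "__main__":
    demo = FakeStore(["b", "a", "c"])
    print(asyncio.run(collect_keys(demo)))
    print(asyncio.run(collect_keys(demo, "a")))
```

The helper is duck-typed, so any object with a compatible iterate_keys should work in place of FakeStore.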
open
Open a storage, either restore existing or create a new one.
Parameters
id: str | None = None (optional, keyword-only)
The storage ID.
name: str | None = None (optional, keyword-only)
The storage name.
configuration: Configuration | None = None (optional, keyword-only)
Configuration object used during the storage creation or restoration process.
storage_client: BaseStorageClient | None = None (optional, keyword-only)
Underlying storage client to use. If not provided, the default global storage client from the service locator will be used.
Returns BaseStorage
persist_autosaved_values
Force auto-saved values to be persisted immediately, without waiting for an event from the Event Manager.
Returns None
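The interplay between get_auto_saved_value and persist_autosaved_values can be pictured with a toy model. In the sketch below, ToyAutoSaveStore is a hypothetical in-memory stand-in, not the library's implementation. It assumes that the returned dictionary is a live object shared across calls and that persisting copies its current state into the stored record. Only the two method names and their parameters come from this reference.

```python
import asyncio
import copy
from typing import Any, Dict, Optional


class ToyAutoSaveStore:
    """Hypothetical stand-in; models auto-saved values as live dictionaries."""

    def __init__(self) -> None:
        self.persisted: Dict[str, Dict[str, Any]] = {}  # what has been saved
        self._live: Dict[str, Dict[str, Any]] = {}      # what callers mutate

    async def get_auto_saved_value(
        self, *, key: str, default_value: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        if key not in self._live:
            # Fall back to the default when the record does not exist yet.
            stored = self.persisted.get(key, default_value or {})
            self._live[key] = copy.deepcopy(stored)
        return self._live[key]

    async def persist_autosaved_values(self) -> None:
        # Assumption: persisting snapshots every live value right away.
        for key, value in self._live.items():
            self.persisted[key] = copy.deepcopy(value)


async def bump_counter(store: Any) -> Dict[str, Any]:
    """Increment a counter kept in an auto-saved value, then force a save."""
    state = await store.get_auto_saved_value(key="state", default_value={"count": 0})
    state["count"] += 1
    await store.persist_autosaved_values()
    return state


if __name__ == "__main__":
    print(asyncio.run(bump_counter(ToyAutoSaveStore())))
```

Calling persist_autosaved_values explicitly, as here, is useful when a change must not wait for the next save event.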
set_value
Set a value in the KVS.
Parameters
key: str (optional, keyword-only)
Key of the record to set.
value: Any (optional, keyword-only)
Value to set. If None, the record is deleted.
content_type: str | None = None (optional, keyword-only)
Content type of the record.
Returns None
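To make the None rule concrete, here is a minimal sketch that writes a record and then deletes it by setting None. InMemoryStore is a hypothetical stand-in so the snippet runs without the library. The set_value and get_value names and parameters come from this reference, while the assumption that a missing record reads back as None does not.

```python
import asyncio
from typing import Any, Dict, Optional, Tuple


class InMemoryStore:
    """Hypothetical stand-in that mimics the documented set_value rule."""

    def __init__(self) -> None:
        self._records: Dict[str, Tuple[Any, Optional[str]]] = {}

    async def set_value(
        self, *, key: str, value: Any, content_type: Optional[str] = None
    ) -> None:
        if value is None:
            # Per the reference: setting None deletes the record.
            self._records.pop(key, None)
        else:
            self._records[key] = (value, content_type)

    async def get_value(self, *, key: str) -> Any:
        # Assumption: a missing record reads back as None.
        record = self._records.get(key)
        return None if record is None else record[0]


async def write_then_delete(store: Any) -> Tuple[Any, Any]:
    """Return the value seen before and after deleting the record."""
    await store.set_value(key="greeting", value={"text": "hello"})
    before = await store.get_value(key="greeting")
    await store.set_value(key="greeting", value=None)  # deletes the record
    after = await store.get_value(key="greeting")
    return before, after


if __name__ == "__main__":
    print(asyncio.run(write_then_delete(InMemoryStore())))
```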
Properties
id
Get the storage ID.
name
Get the storage name.
Represents a key-value based storage for reading and writing data records or files.
Each data record is identified by a unique key and associated with a specific MIME content type. This class is commonly used in crawler runs to store inputs and outputs, typically in JSON format, but it also supports other content types.
Data can be stored either locally or in the cloud, depending on the setup of the underlying storage client. By default, a MemoryStorageClient is used, but it can be changed to a different one.
By default, data is stored using a path structure built from the following components:
- {CRAWLEE_STORAGE_DIR}: The root directory for all storage data, specified by the environment variable.
- {STORE_ID}: The identifier for the key-value store, either "default" or as specified by CRAWLEE_DEFAULT_KEY_VALUE_STORE_ID.
- {KEY}: The unique key for the record.
- {EXT}: The file extension corresponding to the MIME type of the content.
To open a key-value store, use the open class method, providing an id, name, or optional configuration. If none are specified, the default store for the current crawler run is used. Attempting to open a store by id that does not exist will raise an error; however, if accessed by name, the store will be created if it does not already exist.
Usage
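The following is a minimal sketch of the open-by-name and open-by-id rules described above, not the library's own example. ToyStore is a hypothetical in-memory stand-in so the snippet runs on its own. Its ID scheme and the choice of RuntimeError are assumptions, since this reference only says that "an error" is raised.

```python
import asyncio
from typing import Any, Dict, Optional


class ToyStore:
    """Hypothetical stand-in; it only mirrors the opening rules described above."""

    _stores_by_id: Dict[str, "ToyStore"] = {}

    def __init__(self, *, id: str, name: Optional[str]) -> None:
        self.id = id
        self.name = name
        self._records: Dict[str, Any] = {}

    @classmethod
    async def open(
        cls, *, id: Optional[str] = None, name: Optional[str] = None
    ) -> "ToyStore":
        if id is not None:
            if id not in cls._stores_by_id:
                # The reference only says "an error"; RuntimeError is an assumption.
                raise RuntimeError(f"no key-value store with id {id!r}")
            return cls._stores_by_id[id]
        for store in cls._stores_by_id.values():
            if store.name == name:
                return store
        # Opening by name (or with no arguments) creates the store when missing.
        new_id = "default" if name is None else f"id-{name}"  # toy ID scheme
        store = cls(id=new_id, name=name)
        cls._stores_by_id[new_id] = store
        return store

    async def set_value(self, *, key: str, value: Any) -> None:
        if value is None:
            self._records.pop(key, None)
        else:
            self._records[key] = value

    async def get_value(self, *, key: str) -> Any:
        return self._records.get(key)


async def basic_usage(store_cls: Any) -> Any:
    """Open a named store, write a record, reopen the store, and read it back."""
    store = await store_cls.open(name="my-store")
    await store.set_value(key="input", value={"url": "https://example.com"})
    reopened = await store_cls.open(name="my-store")
    return await reopened.get_value(key="input")


if __name__ == "__main__":
    print(asyncio.run(basic_usage(ToyStore)))
```

With Crawlee installed, passing the real KeyValueStore class (imported from crawlee.storages) in place of ToyStore should exercise the same calls, assuming the methods accept these arguments by keyword as listed in this reference.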