SqlDatasetClient

SQL implementation of the dataset client.

This client persists dataset items to a SQL database, using two tables for storage and retrieval. Items are stored as JSON, and their insertion order is preserved automatically.

Dataset data is stored in two SQL tables:

  • datasets table: Contains dataset metadata (id, name, timestamps, item_count)
  • dataset_records table: Contains individual items with JSON data and auto-increment ordering

Each item is stored as a JSON object in SQLite and as JSONB in PostgreSQL, so items must be JSON-serializable. The auto-increment item_id primary key preserves insertion order. All operations run inside database transactions, and deleting a dataset removes its records through CASCADE deletion.
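
The two-table layout described above can be sketched with the standard-library sqlite3 module. This is a minimal illustration, not the client's actual schema: the column names `dataset_id`, `data`, and `created_at`, and the helper names `build_demo_db`, `insert_items`, and `read_items`, are assumptions made for the example.

```python
import json
import sqlite3

# Illustrative schema only. Columns other than id, name, item_count and
# item_id are assumptions, not the library's real definitions.
SCHEMA = """
CREATE TABLE datasets (
    id TEXT PRIMARY KEY,
    name TEXT,
    created_at TEXT,
    item_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE dataset_records (
    item_id INTEGER PRIMARY KEY AUTOINCREMENT,
    dataset_id TEXT NOT NULL REFERENCES datasets(id) ON DELETE CASCADE,
    data TEXT NOT NULL
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the two illustrative tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys (and CASCADE) when this is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def insert_items(conn: sqlite3.Connection, dataset_id: str, items: list[dict]) -> None:
    """Insert items as JSON text and bump item_count in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO dataset_records (dataset_id, data) VALUES (?, ?)",
            [(dataset_id, json.dumps(item)) for item in items],
        )
        conn.execute(
            "UPDATE datasets SET item_count = item_count + ? WHERE id = ?",
            (len(items), dataset_id),
        )


def read_items(conn: sqlite3.Connection, dataset_id: str) -> list[dict]:
    """Read items back in insertion order by sorting on the auto-increment key."""
    rows = conn.execute(
        "SELECT data FROM dataset_records WHERE dataset_id = ? ORDER BY item_id",
        (dataset_id,),
    )
    return [json.loads(row[0]) for row in rows]
```

Deleting the row in `datasets` removes the matching rows in `dataset_records` through the `ON DELETE CASCADE` clause, which is the behaviour the description attributes to drop.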

Methods

__init__

  • __init__(*, id, storage_client): None
  • Initialize a new instance.

    Preferably use the SqlDatasetClient.open class method to create a new instance.


    Parameters

    Returns None

drop

  • async drop(): None
  • Delete this dataset and all its items from the database.

    This operation is irreversible. Uses CASCADE deletion to remove all related items.


    Returns None

get_data

  • async get_data(*, offset, limit, clean, desc, fields, omit, unwind, skip_empty, skip_hidden, flatten, view): DatasetItemsListPage
  • Get data from the dataset with various filtering options.

    The backend method for the Dataset.get_data call.


    Parameters

    • offset: int = 0 (optional, keyword-only)
    • limit: int | None = 999_999_999_999 (optional, keyword-only)
    • clean: bool = False (optional, keyword-only)
    • desc: bool = False (optional, keyword-only)
    • fields: list[str] | None = None (optional, keyword-only)
    • omit: list[str] | None = None (optional, keyword-only)
    • unwind: list[str] | None = None (optional, keyword-only)
    • skip_empty: bool = False (optional, keyword-only)
    • skip_hidden: bool = False (optional, keyword-only)
    • flatten: list[str] | None = None (optional, keyword-only)
    • view: str | None = None (optional, keyword-only)

    Returns DatasetItemsListPage

get_metadata

get_session

  • async get_session(*, with_simple_commit): AsyncIterator[AsyncSession]
  • Create a new SQLAlchemy session for this storage.


    Parameters

    • with_simple_commit: bool = False (optional, keyword-only)

    Returns AsyncIterator[AsyncSession]
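
    The return type suggests an async context manager that yields a session. The sketch below shows one plausible shape for that pattern, using a stand-in `FakeSession` class instead of SQLAlchemy's `AsyncSession` so that it runs without a database. Reading `with_simple_commit=True` as "commit on clean exit, roll back on error" is an assumption inferred from the parameter name, not confirmed by this page.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator


class FakeSession:
    """Stand-in for an async SQL session; it only records which calls were made."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def commit(self) -> None:
        self.calls.append("commit")

    async def rollback(self) -> None:
        self.calls.append("rollback")


@asynccontextmanager
async def get_session(*, with_simple_commit: bool = False) -> AsyncIterator[FakeSession]:
    """Yield a session; optionally commit or roll back when the block exits."""
    session = FakeSession()
    if not with_simple_commit:
        # The caller is responsible for committing.
        yield session
        return
    try:
        yield session
    except Exception:
        # Assumed behaviour: undo the work if the block raised.
        await session.rollback()
        raise
    else:
        # Assumed behaviour: commit automatically on a clean exit.
        await session.commit()


async def demo() -> list[str]:
    async with get_session(with_simple_commit=True) as session:
        pass  # queries would run here
    return session.calls
```

    Calling `asyncio.run(demo())` runs the block once and returns the calls recorded on the stand-in session.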

iterate_items

  • async iterate_items(*, offset, limit, clean, desc, fields, omit, unwind, skip_empty, skip_hidden): AsyncIterator[dict[str, Any]]
  • Iterate over the dataset items with filtering options.

    The backend method for the Dataset.iterate_items call.


    Parameters

    • offset: int = 0 (optional, keyword-only)
    • limit: int | None = None (optional, keyword-only)
    • clean: bool = False (optional, keyword-only)
    • desc: bool = False (optional, keyword-only)
    • fields: list[str] | None = None (optional, keyword-only)
    • omit: list[str] | None = None (optional, keyword-only)
    • unwind: list[str] | None = None (optional, keyword-only)
    • skip_empty: bool = False (optional, keyword-only)
    • skip_hidden: bool = False (optional, keyword-only)

    Returns AsyncIterator[dict[str, Any]]

open

  • Open an existing dataset or create a new one.


    Parameters

    • id: str | None (keyword-only)

      The ID of the dataset to open. If provided, searches for an existing dataset by ID.

    • name: str | None (keyword-only)

      The name of the dataset for named (global scope) storages.

    • alias: str | None (keyword-only)

      The alias of the dataset for unnamed (run scope) storages.

    • storage_client: SqlStorageClient (keyword-only)

      The SQL storage client instance.

    Returns SqlDatasetClient

purge

  • async purge(): None
  • Remove all items from this dataset while keeping the dataset structure.

    Resets item_count to 0 and deletes all of this dataset's records from the dataset_records table.


    Returns None
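
    A purge of this kind can be sketched with sqlite3 as two statements in a single transaction. The schema here is a simplified assumption (the `dataset_id` and `data` columns and the helper names `make_demo` and `purge` are hypothetical) and only illustrates that the dataset row survives while its records are removed.

```python
import sqlite3


def make_demo() -> sqlite3.Connection:
    """Build an in-memory database holding one dataset with two records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE datasets (
            id TEXT PRIMARY KEY,
            item_count INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE dataset_records (
            item_id INTEGER PRIMARY KEY AUTOINCREMENT,
            dataset_id TEXT NOT NULL,
            data TEXT NOT NULL
        );
        INSERT INTO datasets VALUES ('ds1', 2);
        INSERT INTO dataset_records (dataset_id, data)
            VALUES ('ds1', '{"n": 1}'), ('ds1', '{"n": 2}');
        """
    )
    return conn


def purge(conn: sqlite3.Connection, dataset_id: str) -> None:
    """Delete the dataset's records and reset its counter atomically."""
    with conn:  # both statements commit together or not at all
        conn.execute("DELETE FROM dataset_records WHERE dataset_id = ?", (dataset_id,))
        conn.execute("UPDATE datasets SET item_count = 0 WHERE id = ?", (dataset_id,))
```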

push_data

  • async push_data(data): None
  • Push data to the dataset.

    The backend method for the Dataset.push_data call.


    Parameters

    • data: list[Any] | dict[str, Any]

    Returns None