class tensorstore.Batch

Batches are used to group together read operations for potentially improved efficiency.

Operations associated with a batch will potentially be deferred until all references to the batch are released.

The batch behavior of any particular operation ultimately depends on the underlying driver implementation, but in many cases batching operations can reduce the number of separate I/O requests performed.

Example usage as a context manager (recommended):

>>> import numpy as np
>>> import tensorstore as ts
>>> store = ts.open(
...     {
...         'driver': 'zarr3',
...         'kvstore': {
...             'driver': 'file',
...             'path': 'tmp/dataset/'
...         },
...     },
...     shape=[5, 6],
...     chunk_layout=ts.ChunkLayout(read_chunk_shape=[2, 3],
...                                 write_chunk_shape=[6, 6]),
...     dtype=ts.uint16,
...     create=True,
...     delete_existing=True).result()
>>> store[...] = np.arange(5 * 6, dtype=np.uint16).reshape([5, 6])
>>> with ts.Batch() as batch:
...     read_future1 = store[:3].read(batch=batch)
...     read_future2 = store[3:].read(batch=batch)
>>> await read_future1
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17]], dtype=uint16)
>>> await read_future2
array([[18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29]], dtype=uint16)

Warning

Any operation performed as part of a batch may be deferred until the batch is submitted. Blocking on (or awaiting) the completion of such an operation while retaining a reference to the batch will likely lead to deadlock.
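For illustration, a sketch of the anti-pattern this warning describes (hypothetical, do not run: awaiting a batched read while the batch is still open means the batch has not been submitted, so the read may never complete):

>>> with ts.Batch() as batch:
...     read_future1 = store[:3].read(batch=batch)
...     await read_future1  # WRONG: batch still referenced, read may be deferred -> deadlock

The safe pattern, shown in the examples here, is to await (or block on) the futures only after the `with` block exits, or after calling `submit()` / releasing the last reference to the batch.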

Equivalent example using an explicit call to submit():

>>> batch = ts.Batch()
>>> read_future1 = store[:3].read(batch=batch)
>>> read_future2 = store[3:].read(batch=batch)
>>> batch.submit()
>>> await read_future1
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17]], dtype=uint16)
>>> await read_future2
array([[18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29]], dtype=uint16)

Equivalent example relying on the implicit submit performed by the destructor when the last reference is released:

>>> batch = ts.Batch()
>>> read_future1 = store[:3].read(batch=batch)
>>> read_future2 = store[3:].read(batch=batch)
>>> del batch
>>> await read_future1
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17]], dtype=uint16)
>>> await read_future2
array([[18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29]], dtype=uint16)

Warning

Relying on this implicit submit behavior is not recommended and may result in the submit being delayed indefinitely, due to Python implicitly retaining a reference to the object, or due to a cyclic reference.

Constructors

Batch()

Creates a new batch.

Operations

submit() -> None

Submits the batch, allowing any deferred operations associated with it to proceed.