tensorstore.virtual_chunked(read_function: Callable[[IndexDomain, ArrayLike, VirtualChunkedReadParameters], FutureLike[KvStore.TimestampedStorageGeneration | None]] | None = None, write_function: Callable[[IndexDomain, ArrayLike, VirtualChunkedWriteParameters], FutureLike[KvStore.TimestampedStorageGeneration | None]] | None = None, *, loop: asyncio.AbstractEventLoop | None = None, rank: int | None = None, dtype: dtype | None = None, domain: IndexDomain | None = None, shape: Sequence[int] | None = None, chunk_layout: ChunkLayout | None = None, dimension_units: Sequence[Unit | str | Real | tuple[Real, str] | None] | None = None, schema: Schema | None = None, context: Context | None = None, transaction: Transaction | None = None) TensorStore

Creates a TensorStore where the content is read/written chunk-wise by an arbitrary function.

Example (read-only):

>>> a = ts.array([[1, 2, 3], [4, 5, 6]], dtype=ts.uint32)
>>> async def do_read(domain: ts.IndexDomain, array: np.ndarray,
...                   read_params: ts.VirtualChunkedReadParameters):
...     print(f'Computing content for: {domain}')
...     array[...] = (await a[domain].read()) + 100
>>> t = ts.virtual_chunked(do_read, dtype=a.dtype, domain=a.domain)
>>> await t.read()
Computing content for: { [0, 2), [0, 3) }
array([[101, 102, 103],
       [104, 105, 106]], dtype=uint32)

Example (read/write):

>>> array = np.zeros(shape=[4, 5], dtype=np.uint32)
>>> array[1] = 50
>>> def do_read(domain, chunk, read_context):
...     chunk[...] = array[domain.index_exp]
>>> def do_write(domain, chunk, write_context):
...     array[domain.index_exp] = chunk
>>> t = ts.virtual_chunked(
...     do_read,
...     do_write,
...     dtype=array.dtype,
...     shape=array.shape,
...     chunk_layout=ts.ChunkLayout(read_chunk_shape=(2, 3)))
>>> await t.read()
array([[ 0,  0,  0,  0,  0],
       [50, 50, 50, 50, 50],
       [ 0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0]], dtype=uint32)
>>> t[1:3, 1:3] = 42
>>> array
array([[ 0,  0,  0,  0,  0],
       [50, 42, 42, 50, 50],
       [ 0, 42, 42,  0,  0],
       [ 0,  0,  0,  0,  0]], dtype=uint32)

Parameters:
read_function: Callable[[IndexDomain, ArrayLike, VirtualChunkedReadParameters], FutureLike[KvStore.TimestampedStorageGeneration | None]] | None = None

Callback that handles chunk read requests. Must be specified to create a virtual view that supports reads. To create a write-only view, leave this unspecified (as None).

This function should assign to the array the content for the specified IndexDomain.

The returned TimestampedStorageGeneration identifies the version of the content, for caching purposes. If versioning is not applicable, None may be returned to indicate a value that may be cached indefinitely.

If it returns a coroutine, the coroutine will be executed using the event loop indicated by loop.

write_function: Callable[[IndexDomain, ArrayLike, VirtualChunkedWriteParameters], FutureLike[KvStore.TimestampedStorageGeneration | None]] | None = None

Callback that handles chunk write requests. Must be specified to create a virtual view that supports writes. To create a read-only view, leave this unspecified (as None).

This function should store the content of the array for the specified IndexDomain.

The returned TimestampedStorageGeneration identifies the stored version of the content, for caching purposes. If versioning is not applicable, None may be returned to indicate a value that may be cached indefinitely.

If it returns a coroutine, the coroutine will be executed using the event loop indicated by loop.

loop: asyncio.AbstractEventLoop | None = None

Event loop on which to execute read_function and/or write_function if they are async functions. If not specified (or None is specified), defaults to the loop returned by asyncio.get_running_loop (in the context of the call to virtual_chunked). If loop is not specified and there is no running event loop, it is an error for read_function or write_function to return a coroutine.
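
As an illustration, a minimal sketch of creating and using the view from synchronous code with an explicitly supplied loop; the background thread running the loop, the shape, and the fill value are all assumptions of this sketch:

>>> import asyncio, threading
>>> loop = asyncio.new_event_loop()
>>> threading.Thread(target=loop.run_forever, daemon=True).start()
>>> async def do_read(domain, chunk, read_params):
...     chunk[...] = 7  # illustrative fill value
>>> t = ts.virtual_chunked(do_read, dtype=ts.uint32, shape=[4, 4], loop=loop)
>>> result = t.read().result()  # do_read executes on the background loop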

rank: int | None = None

Constrains the rank of the TensorStore. If there is an index transform, the rank constraint must match the rank of the input space.

dtype: dtype | None = None

Constrains the data type of the TensorStore. If a data type has already been set, it is an error to specify a different data type.

domain: IndexDomain | None = None

Constrains the domain of the TensorStore. If there is an existing domain, the specified domain is merged with it as follows:

  1. The rank must match the existing rank.

  2. All bounds must match, except that a finite or explicit bound is permitted to match an infinite and implicit bound, and takes precedence.

  3. If both the new and existing domain specify non-empty labels for a dimension, the labels must be equal. If only one of the domains specifies a non-empty label for a dimension, the non-empty label takes precedence.

Note that if there is an index transform, the domain must match the input space, not the output space.
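
For example, a sketch of constraining the view with a labeled domain (the labels and the trivial read callback are illustrative):

>>> def do_read(domain, chunk, read_params):
...     chunk[...] = 0
>>> t = ts.virtual_chunked(
...     do_read,
...     dtype=ts.uint32,
...     domain=ts.IndexDomain(labels=['y', 'x'], shape=[4, 5]))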

shape: Sequence[int] | None = None

Constrains the shape and origin of the TensorStore. Equivalent to specifying a domain of ts.IndexDomain(shape=shape).

Note

This option also constrains the origin of all dimensions to be zero.

chunk_layout: ChunkLayout | None = None

Constrains the chunk layout. If there is an existing chunk layout constraint, the constraints are merged. If the constraints are incompatible, an error is raised.

dimension_units: Sequence[Unit | str | Real | tuple[Real, str] | None] | None = None

Specifies the physical units of each dimension of the domain.

The physical unit for a dimension is the physical quantity corresponding to a single index increment along that dimension.

A value of None indicates that the unit is unknown. A dimension-less quantity can be indicated by a unit of "".
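
For example, a sketch of a 2-d view whose two dimensions are assigned (illustrative) physical resolutions of 4 nm and 30 nm per index increment:

>>> def do_read(domain, chunk, read_params):
...     chunk[...] = 0
>>> t = ts.virtual_chunked(
...     do_read,
...     dtype=ts.uint32,
...     shape=[100, 100],
...     dimension_units=['4nm', (30, 'nm')])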

schema: Schema | None = None

Additional schema constraints to merge with existing constraints.

context: Context | None = None

Shared resource context. Defaults to a new (unshared) context with default options, as returned by tensorstore.Context(). To share resources, such as cache pools, between multiple open TensorStores, you must specify a context.

transaction: Transaction | None = None

Transaction to use for opening/creating, and for subsequent operations. By default, the open is non-transactional.

Note

To perform transactional operations using a TensorStore that was previously opened without a transaction, use TensorStore.with_transaction.

Warning

Neither read_function nor write_function should block synchronously while waiting for another TensorStore operation; blocking on another operation that uses the same Context.data_copy_concurrency resource may result in deadlock. Instead, it is better to specify a coroutine function for read_function and write_function and use await to wait for the result of other TensorStore operations.

Caching

By default, the computed content of chunks is not cached, and will be recomputed on every read. To enable caching, specify a context that contains a cache_pool with a non-zero size limit, e.g. {"cache_pool": {"total_bytes_limit": 100000000}} for a 100 megabyte cache.
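
A minimal sketch, assuming a 100 MB cache pool (the chunk shape and fill value are arbitrary); repeated reads of a chunk are then served from the cache rather than by invoking do_read again:

>>> def do_read(domain, chunk, read_params):
...     chunk[...] = 42  # stand-in for an expensive computation
>>> t = ts.virtual_chunked(
...     do_read,
...     dtype=ts.uint32,
...     shape=[256, 256],
...     chunk_layout=ts.ChunkLayout(read_chunk_shape=[64, 64]),
...     context=ts.Context({'cache_pool': {'total_bytes_limit': 100_000_000}}))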

Pickle support

The returned TensorStore supports pickling if, and only if, the read_function and write_function support pickling.

Note

The pickle module only supports global functions defined in named modules. For broader function support, you may wish to use cloudpickle.

Warning

The specified loop is not preserved when the returned TensorStore is pickled, since it is a property of the current thread. Instead, when unpickled, the resultant TensorStore will use the running event loop (as returned by asyncio.get_running_loop) of the thread used for unpickling, if there is one.
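
As a sketch, assuming my_read is a hypothetical picklable function defined at module level, the view round-trips through pickle:

>>> import pickle
>>> # my_read: a hypothetical, picklable module-level read callback
>>> t = ts.virtual_chunked(my_read, dtype=ts.uint32, shape=[4, 4])
>>> t2 = pickle.loads(pickle.dumps(t))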

Transaction support

Transactional reads and writes are supported on virtual_chunked views. A transactional write simply serves to buffer the write in memory until it is committed. Transactional reads will observe prior writes made using the same transaction. However, when the transaction commit is initiated, the write_function is called in exactly the same way as for a non-transactional write, and if more than one chunk is affected, the commit will be non-atomic. If the transaction is atomic, it is an error to write to more than one chunk in the same transaction.
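
For example, a sketch of buffering writes to the read/write view t from the example above within a transaction; do_write is only invoked once the commit runs:

>>> txn = ts.Transaction()
>>> t.with_transaction(txn)[1:3, 1:3] = 7  # buffered; do_write not yet called
>>> buffered = await t.with_transaction(txn).read()  # observes the buffered write
>>> await txn.commit_async()  # do_write invoked here (non-atomically across chunks)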

You are also free to use transactional operations, e.g. operations on a KvStore or another TensorStore, within the read_function or write_function.

  • For read-write views, you should not use, within the read_function or write_function, the same transaction that is also used for read or write operations on the virtual view directly. Both write_function and read_function may be called after the commit starts, and any attempt to perform new operations using a transaction that is already being committed will fail. Instead, any transactional operations performed within the read_function or write_function should use a different transaction.

  • For read-only views, it is possible to use the same transaction within the read_function as is also used for read operations on the virtual view directly, though this may not be particularly useful.

Specifying a transaction directly when creating the virtual chunked view is no different than binding the transaction to an existing virtual chunked view.