tensorstore.virtual_chunked(
    read_function: Callable[[IndexDomain, ArrayLike, VirtualChunkedReadParameters], FutureLike[KvStore.TimestampedStorageGeneration | None]] | None = None,
    write_function: Callable[[IndexDomain, ArrayLike, VirtualChunkedWriteParameters], FutureLike[KvStore.TimestampedStorageGeneration | None]] | None = None,
    *,
    loop: asyncio.AbstractEventLoop | None = None,
    rank: int | None = None,
    dtype: dtype | None = None,
    domain: IndexDomain | None = None,
    shape: Sequence[int] | None = None,
    chunk_layout: ChunkLayout | None = None,
    dimension_units: Sequence[Unit | str | Real | tuple[Real, str] | None] | None = None,
    schema: Schema | None = None,
    context: Context | None = None,
    transaction: Transaction | None = None,
) → TensorStore

Creates a TensorStore where the content is read/written chunk-wise by an arbitrary function.

Example (read-only):
>>> a = ts.array([[1, 2, 3], [4, 5, 6]], dtype=ts.uint32)
>>> async def do_read(domain: ts.IndexDomain, array: np.ndarray,
...                   read_params: ts.VirtualChunkedReadParameters):
...     print(f'Computing content for: {domain}')
...     array[...] = (await a[domain].read()) + 100
>>> t = ts.virtual_chunked(do_read, dtype=a.dtype, domain=a.domain)
>>> await t.read()
Computing content for: { [0, 2), [0, 3) }
array([[101, 102, 103],
       [104, 105, 106]], dtype=uint32)
Example (read/write):
>>> array = np.zeros(shape=[4, 5], dtype=np.uint32)
>>> array[1] = 50
>>> def do_read(domain, chunk, read_context):
...     chunk[...] = array[domain.index_exp]
>>> def do_write(domain, chunk, write_context):
...     array[domain.index_exp] = chunk
>>> t = ts.virtual_chunked(
...     do_read,
...     do_write,
...     dtype=array.dtype,
...     shape=array.shape,
...     chunk_layout=ts.ChunkLayout(read_chunk_shape=(2, 3)))
>>> await t.read()
array([[ 0,  0,  0,  0,  0],
       [50, 50, 50, 50, 50],
       [ 0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0]], dtype=uint32)
>>> t[1:3, 1:3] = 42
>>> array
array([[ 0,  0,  0,  0,  0],
       [50, 42, 42, 50, 50],
       [ 0, 42, 42,  0,  0],
       [ 0,  0,  0,  0,  0]], dtype=uint32)
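A write-only view is created by supplying only write_function. The following is a hedged sketch, not part of the official examples:

import numpy as np
import tensorstore as ts

array = np.zeros(shape=[4, 5], dtype=np.uint32)

def do_write(domain, chunk, write_context):
    # Persist the chunk content; here it is simply copied into a NumPy array.
    array[domain.index_exp] = chunk

t = ts.virtual_chunked(write_function=do_write, dtype=array.dtype,
                       shape=array.shape)
# A full-domain write avoids any need to read back existing chunk content,
# which a write-only view cannot do.
t[...] = np.ones([4, 5], dtype=np.uint32)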
Parameters:

- read_function: Callable[[IndexDomain, ArrayLike, VirtualChunkedReadParameters], FutureLike[KvStore.TimestampedStorageGeneration | None]] | None = None
  Callback that handles chunk read requests. Must be specified to create a virtual view that supports reads. To create a write-only view, leave this unspecified (as None).
  This function should assign to the array the content for the specified IndexDomain.
  The returned TimestampedStorageGeneration identifies the version of the content, for caching purposes. If versioning is not applicable, None may be returned to indicate a value that may be cached indefinitely.
  If it returns a coroutine, the coroutine will be executed using the event loop indicated by loop.

- write_function: Callable[[IndexDomain, ArrayLike, VirtualChunkedWriteParameters], FutureLike[KvStore.TimestampedStorageGeneration | None]] | None = None
  Callback that handles chunk write requests. Must be specified to create a virtual view that supports writes. To create a read-only view, leave this unspecified (as None).
  This function stores the content of the array for the specified IndexDomain.
  The returned TimestampedStorageGeneration identifies the stored version of the content, for caching purposes. If versioning is not applicable, None may be returned to indicate a value that may be cached indefinitely.
  If it returns a coroutine, the coroutine will be executed using the event loop indicated by loop.

- loop: asyncio.AbstractEventLoop | None = None
  Event loop on which to execute read_function and/or write_function if they are async functions. If not specified (or None is specified), defaults to the loop returned by asyncio.get_running_loop (in the context of the call to virtual_chunked). If loop is not specified and there is no running event loop, it is an error for read_function or write_function to return a coroutine.

- rank: int | None = None
  Constrains the rank of the TensorStore. If there is an index transform, the rank constraint must match the rank of the input space.

- dtype: dtype | None = None
  Constrains the data type of the TensorStore. If a data type has already been set, it is an error to specify a different data type.

- domain: IndexDomain | None = None
  Constrains the domain of the TensorStore. If there is an existing domain, the specified domain is merged with it as follows:
  - The rank must match the existing rank.
  - All bounds must match, except that a finite or explicit bound is permitted to match an infinite and implicit bound, and takes precedence.
  - If both the new and existing domain specify non-empty labels for a dimension, the labels must be equal. If only one of the domains specifies a non-empty label for a dimension, the non-empty label takes precedence.
  Note that if there is an index transform, the domain must match the input space, not the output space.

- shape: Sequence[int] | None = None
  Constrains the shape and origin of the TensorStore. Equivalent to specifying a domain of ts.IndexDomain(shape=shape).
  Note
  This option also constrains the origin of all dimensions to be zero.

- chunk_layout: ChunkLayout | None = None
  Constrains the chunk layout. If there is an existing chunk layout constraint, the constraints are merged. If the constraints are incompatible, an error is raised.

- dimension_units: Sequence[Unit | str | Real | tuple[Real, str] | None] | None = None
  Specifies the physical units of each dimension of the domain. The physical unit for a dimension is the physical quantity corresponding to a single index increment along each dimension. A value of None indicates that the unit is unknown. A dimensionless quantity can be indicated by a unit of "". (See the sketch following this parameter list.)

- schema: Schema | None = None
  Additional schema constraints to merge with existing constraints.

- context: Context | None = None
  Shared resource context. Defaults to a new (unshared) context with default options, as returned by tensorstore.Context(). To share resources, such as cache pools, between multiple open TensorStores, you must specify a context.

- transaction: Transaction | None = None
  Transaction to use for opening/creating, and for subsequent operations. By default, the open is non-transactional.
  Note
  To perform transactional operations using a TensorStore that was previously opened without a transaction, use TensorStore.with_transaction.
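The following is a hedged sketch of combining several of the schema constraints above; the labels, units, and chunk shape are illustrative values, not requirements:

import tensorstore as ts

def do_read(domain, array, read_params):
    array[...] = 0  # placeholder content

t = ts.virtual_chunked(
    do_read,
    dtype=ts.float32,
    domain=ts.IndexDomain(shape=[100, 200], labels=['x', 'y']),
    dimension_units=['4nm', '4nm'],
    chunk_layout=ts.ChunkLayout(read_chunk_shape=[10, 20]),
)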
Warning
Neither read_function nor write_function should block synchronously while waiting for another TensorStore operation; blocking on another operation that uses the same Context.data_copy_concurrency resource may result in deadlock. Instead, it is better to specify a coroutine function for read_function and write_function and use await to wait for the result of other TensorStore operations.
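For example (a hedged sketch; other_store stands for some other, hypothetical TensorStore), prefer a coroutine that awaits over blocking on a Future inside the callback:

# Risky: Future.result() blocks inside the callback and can deadlock on the
# shared data_copy_concurrency resource.
def do_read_blocking(domain, array, read_params):
    array[...] = other_store[domain].read().result()

# Preferred: a coroutine callback that awaits the other operation.
async def do_read(domain, array, read_params):
    array[...] = await other_store[domain].read()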
Caching
By default, the computed content of chunks is not cached, and will be recomputed on every read. To enable caching:
- Specify a Context that contains a cache_pool with a non-zero size limit, e.g. {"cache_pool": {"total_bytes_limit": 100000000}} for 100MB.
- Additionally, if the data is not immutable, the read_function should return a unique generation and a timestamp that is not float('inf'). When a cached chunk is re-read, the read_function will be called with if_not_equal specified. If the generation specified by if_not_equal is still current, the read_function may leave the output array unmodified and return a TimestampedStorageGeneration with an appropriate time but generation left unspecified.
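A minimal sketch (not from the original docs, assuming the computed content is immutable so that the implicit None return permits indefinite caching):

import tensorstore as ts

def do_read(domain, array, read_params):
    print(f'Computing content for: {domain}')
    array[...] = 42  # immutable content; returning None allows indefinite caching

t = ts.virtual_chunked(
    do_read,
    dtype=ts.uint32,
    shape=[4, 5],
    context=ts.Context({'cache_pool': {'total_bytes_limit': 100000000}}))

t.read().result()  # first read invokes do_read
t.read().result()  # subsequent reads can be served from the cache pool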
Pickle support
The returned TensorStore supports pickling if, and only if, the read_function and write_function support pickling.
Note
The pickle module only supports global functions defined in named modules. For broader function support, you may wish to use cloudpickle.
Warning
The specified loop is not preserved when the returned TensorStore is pickled, since it is a property of the current thread. Instead, when unpickled, the resultant TensorStore will use the running event loop (as returned by asyncio.get_running_loop) of the thread used for unpickling, if there is one.
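A hedged sketch of the picklability requirement (a module-level callback pickles; a lambda or local closure does not):

import pickle
import tensorstore as ts

def fill_constant(domain, array, read_params):
    # Defined at module level, so the standard pickle module can serialize it.
    array[...] = 7

t = ts.virtual_chunked(fill_constant, dtype=ts.uint32, shape=[4, 5])
t2 = pickle.loads(pickle.dumps(t))  # round-trips the virtual view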
Transaction support
Transactional reads and writes are supported on virtual_chunked views. A transactional write simply serves to buffer the write in memory until it is committed. Transactional reads will observe prior writes made using the same transaction. However, when the transaction commit is initiated, the write_function is called in exactly the same way as for a non-transactional write, and if more than one chunk is affected, the commit will be non-atomic. If the transaction is atomic, it is an error to write to more than one chunk in the same transaction.

You are also free to use transactional operations, e.g. operations on a KvStore or another TensorStore, within the read_function or write_function.

For read-write views, you should not attempt to use the same transaction within the read_function or write_function that is also used for read or write operations on the virtual view directly, because both write_function and read_function may be called after the commit starts, and any attempt to perform new operations using the same transaction once it is already being committed will fail; instead, any transactional operations performed within the read_function or write_function should use a different transaction.

For read-only views, it is possible to use the same transaction within the read_function as is also used for read operations on the virtual view directly, though this may not be particularly useful.
Specifying a transaction directly when creating the virtual chunked view is no different than binding the transaction to an existing virtual chunked view.
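For illustration, a hedged sketch of a transactional write against the read/write example's t above:

txn = ts.Transaction()
tt = t.with_transaction(txn)
tt[1:3, 1:3] = 99   # buffered in the transaction; do_write is not yet called
txn.commit_sync()   # committing invokes do_write for each affected chunk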