tensorstore.virtual_chunked(read_function: Callable[[IndexDomain, numpy.ndarray, VirtualChunkedReadParameters], FutureLike[KvStore.TimestampedStorageGeneration | None]] | None = None, write_function: Callable[[IndexDomain, numpy.ndarray, VirtualChunkedWriteParameters], FutureLike[KvStore.TimestampedStorageGeneration | None]] | None = None, *, loop: asyncio.AbstractEventLoop | None = None, rank: int | None = None, dtype: DTypeLike | None = None, domain: IndexDomain | None = None, shape: Iterable[int] | None = None, chunk_layout: ChunkLayout | None = None, dimension_units: Iterable[Unit | str | Real | tuple[Real, str] | None] | None = None, schema: Schema | None = None, context: Context | None = None, transaction: Transaction | None = None) -> TensorStore

Creates a TensorStore where the content is read/written chunk-wise by an arbitrary function.

Example (read-only):

>>> a = ts.array([[1, 2, 3], [4, 5, 6]], dtype=ts.uint32)
>>> async def do_read(domain: ts.IndexDomain, array: np.ndarray,
...                   read_params: ts.VirtualChunkedReadParameters):
...     print(f'Computing content for: {domain}')
...     array[...] = (await a[domain].read()) + 100
>>> t = ts.virtual_chunked(do_read, dtype=a.dtype, domain=a.domain)
>>> await t.read()
Computing content for: { [0, 2), [0, 3) }
array([[101, 102, 103],
       [104, 105, 106]], dtype=uint32)

Example (read/write):

>>> array = np.zeros(shape=[4, 5], dtype=np.uint32)
>>> array[1] = 50
>>> def do_read(domain, chunk, read_context):
...     chunk[...] = array[domain.index_exp]
>>> def do_write(domain, chunk, write_context):
...     array[domain.index_exp] = chunk
>>> t = ts.virtual_chunked(
...     do_read,
...     do_write,
...     dtype=array.dtype,
...     shape=array.shape,
...     chunk_layout=ts.ChunkLayout(read_chunk_shape=(2, 3)))
>>> await t.read()
array([[ 0,  0,  0,  0,  0],
       [50, 50, 50, 50, 50],
       [ 0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0]], dtype=uint32)
>>> t[1:3, 1:3] = 42
>>> array
array([[ 0,  0,  0,  0,  0],
       [50, 42, 42, 50, 50],
       [ 0, 42, 42,  0,  0],
       [ 0,  0,  0,  0,  0]], dtype=uint32)

Parameters:
- read_function: Callable[[IndexDomain, numpy.ndarray, VirtualChunkedReadParameters], FutureLike[KvStore.TimestampedStorageGeneration | None]] | None = None
  Callback that handles chunk read requests. Must be specified to create a virtual view that supports reads. To create a write-only view, leave this unspecified (as None).
  This function should assign to the array the content for the specified IndexDomain.
  The returned TimestampedStorageGeneration identifies the version of the content, for caching purposes. If versioning is not applicable, None may be returned to indicate a value that may be cached indefinitely.
  If it returns a coroutine, the coroutine will be executed using the event loop indicated by loop.
- write_function: Callable[[IndexDomain, numpy.ndarray, VirtualChunkedWriteParameters], FutureLike[KvStore.TimestampedStorageGeneration | None]] | None = None
  Callback that handles chunk write requests. Must be specified to create a virtual view that supports writes. To create a read-only view, leave this unspecified (as None).
  This function should store the content of the array for the specified IndexDomain.
  The returned TimestampedStorageGeneration identifies the stored version of the content, for caching purposes. If versioning is not applicable, None may be returned to indicate a value that may be cached indefinitely.
  If it returns a coroutine, the coroutine will be executed using the event loop indicated by loop.
- loop: asyncio.AbstractEventLoop | None = None
  Event loop on which to execute read_function and/or write_function if they are async functions. If not specified (or None is specified), defaults to the loop returned by asyncio.get_running_loop (in the context of the call to virtual_chunked). If loop is not specified and there is no running event loop, it is an error for read_function or write_function to return a coroutine.
- rank: int | None = None
  Constrains the rank of the TensorStore. If there is an index transform, the rank constraint must match the rank of the input space.
- dtype: DTypeLike | None = None
  Constrains the data type of the TensorStore. If a data type has already been set, it is an error to specify a different data type.
- domain: IndexDomain | None = None
  Constrains the domain of the TensorStore. If there is an existing domain, the specified domain is merged with it as follows:
  - The rank must match the existing rank.
  - All bounds must match, except that a finite or explicit bound is permitted to match an infinite and implicit bound, and takes precedence.
  - If both the new and existing domain specify non-empty labels for a dimension, the labels must be equal. If only one of the domains specifies a non-empty label for a dimension, the non-empty label takes precedence.
  Note that if there is an index transform, the domain must match the input space, not the output space.
- shape: Iterable[int] | None = None
  Constrains the shape and origin of the TensorStore. Equivalent to specifying a domain of ts.IndexDomain(shape=shape).
  Note
  This option also constrains the origin of all dimensions to be zero.
- chunk_layout: ChunkLayout | None = None
  Constrains the chunk layout. If there is an existing chunk layout constraint, the constraints are merged. If the constraints are incompatible, an error is raised.
- dimension_units: Iterable[Unit | str | Real | tuple[Real, str] | None] | None = None
  Specifies the physical units of each dimension of the domain.
  The physical unit for a dimension is the physical quantity corresponding to a single index increment along each dimension.
  A value of None indicates that the unit is unknown. A dimension-less quantity can be indicated by a unit of "". (A usage sketch follows this parameter list.)
- schema: Schema | None = None
  Additional schema constraints to merge with existing constraints.
- context: Context | None = None
  Shared resource context. Defaults to a new (unshared) context with default options, as returned by tensorstore.Context(). To share resources, such as cache pools, between multiple open TensorStores, you must specify a context.
- transaction: Transaction | None = None
  Transaction to use for opening/creating, and for subsequent operations. By default, the open is non-transactional.
  Note
  To perform transactional operations using a TensorStore that was previously opened without a transaction, use TensorStore.with_transaction.
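For illustration of the dimension_units parameter, the following minimal sketch (not one of the official examples) attaches a unit to two dimensions of a read-only virtual view and leaves the third unknown; the constant-fill read callback is purely hypothetical:

import numpy as np
import tensorstore as ts

def fill_zero(domain, chunk, read_params):
    # Hypothetical read callback: every requested chunk is all zeros.
    chunk[...] = 0

t = ts.virtual_chunked(
    fill_zero,
    dtype=np.uint8,
    shape=[100, 100, 50],
    # 4 nm per index increment along the first two dimensions; third unknown.
    dimension_units=['4nm', '4nm', None])

print(t.dimension_units)  # expected: (Unit(4, "nm"), Unit(4, "nm"), None)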
Warning
Neither read_function nor write_function should block synchronously while waiting for another TensorStore operation; blocking on another operation that uses the same Context.data_copy_concurrency resource may result in deadlock. Instead, it is better to specify a coroutine function for read_function and write_function and use await to wait for the result of other TensorStore operations.
Caching
By default, the computed content of chunks is not cached, and will be recomputed on every read. To enable caching:
- Specify a Context that contains a cache_pool with a non-zero size limit, e.g. {"cache_pool": {"total_bytes_limit": 100000000}} for 100 MB.
- Additionally, if the data is not immutable, the read_function should return a unique generation and a timestamp that is not float('inf'). When a cached chunk is re-read, the read_function will be called with if_not_equal specified. If the generation specified by if_not_equal is still current, the read_function may leave the output array unmodified and return a TimestampedStorageGeneration with an appropriate time but generation left unspecified.
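The following sketch (not an official example) shows only the first step: a context whose cache_pool has a non-zero limit. The callback returns None, which, per the description above, marks the computed value as cacheable indefinitely, so the second read should be served from the cache:

import numpy as np
import tensorstore as ts

def do_read(domain, chunk, read_params):
    print(f'Computing content for: {domain}')
    chunk[...] = 42  # implicitly returns None => may be cached indefinitely

t = ts.virtual_chunked(
    do_read,
    dtype=np.uint32,
    shape=[4, 4],
    # 100 MB cache pool; without this, chunks are recomputed on every read.
    context=ts.Context({'cache_pool': {'total_bytes_limit': 100_000_000}}))

t.read().result()  # prints 'Computing content for: ...'
t.read().result()  # second read should come from the cache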
Pickle support
The returned TensorStore supports pickling if, and only if, the read_function and write_function support pickling.
Note
The pickle module only supports global functions defined in named modules. For broader function support, you may wish to use cloudpickle.
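As a hedged sketch (not from the official examples), the round trip below relies on fill_zero being a module-level function, so the standard pickle module can reference it by name:

import pickle
import numpy as np
import tensorstore as ts

def fill_zero(domain, chunk, read_params):
    # Module-level function, so it is picklable by reference.
    chunk[...] = 0

t = ts.virtual_chunked(fill_zero, dtype=np.uint32, shape=[4, 4])
t2 = pickle.loads(pickle.dumps(t))  # round-trip the virtual view
print(t2.read().result().sum())     # expected: 0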
Warning
The specified loop is not preserved when the returned TensorStore is pickled, since it is a property of the current thread. Instead, when unpickled, the resultant TensorStore will use the running event loop (as returned by asyncio.get_running_loop) of the thread used for unpickling, if there is one.
Transaction support
Transactional reads and writes are supported on virtual_chunked views. A transactional write simply serves to buffer the write in memory until it is committed. Transactional reads will observe prior writes made using the same transaction. However, when the transaction commit is initiated, the write_function is called in exactly the same way as for a non-transactional write, and if more than one chunk is affected, the commit will be non-atomic. If the transaction is atomic, it is an error to write to more than one chunk in the same transaction.
You are also free to use transactional operations, e.g. operations on a KvStore or another TensorStore, within the read_function or write_function.
For read-write views, you should not attempt to use the same transaction within the read_function or write_function that is also used for read or write operations on the virtual view directly, because both write_function and read_function may be called after the commit starts, and any attempt to perform new operations using the same transaction once it is already being committed will fail; instead, any transactional operations performed within the read_function or write_function should use a different transaction.
For read-only views, it is possible to use the same transaction within the read_function as is also used for read operations on the virtual view directly, though this may not be particularly useful.
Specifying a transaction directly when creating the virtual chunked view is no different than binding the transaction to an existing virtual chunked view.
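A minimal sketch (not an official example, reusing the numpy-backed do_read/do_write pattern from the read/write example above) of how a transactional write is buffered in memory until commit, at which point write_function runs:

import numpy as np
import tensorstore as ts

array = np.zeros([4, 5], dtype=np.uint32)

def do_read(domain, chunk, read_params):
    chunk[...] = array[domain.index_exp]

def do_write(domain, chunk, write_params):
    array[domain.index_exp] = chunk

t = ts.virtual_chunked(do_read, do_write,
                       dtype=array.dtype, shape=array.shape)

txn = ts.Transaction()
t.with_transaction(txn)[1:3, 1:3] = 7  # buffered in the transaction
print(array[1, 1])                     # expected: 0 (not yet committed)
txn.commit_async().result()            # do_write runs during commit
print(array[1, 1])                     # expected: 7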