TensorStore

TensorStore is a library for efficiently reading and writing large multi-dimensional arrays.

Highlights

  • Provides a uniform API for reading and writing multiple array formats, including zarr, N5, and Neuroglancer precomputed.

  • Natively supports multiple storage drivers, including Google Cloud Storage, local and network filesystems, and in-memory storage.

  • Supports read/writeback caching and transactions, with strong atomicity, consistency, isolation, and durability (ACID) guarantees.

  • Supports safe, efficient access from multiple processes and machines via optimistic concurrency.

  • High-performance implementation in C++ automatically takes advantage of multiple cores for encoding/decoding and performs multiple concurrent I/O operations to saturate network bandwidth.

  • Asynchronous API enables high-throughput access even to high-latency remote storage.

  • Advanced, fully composable indexing operations and virtual views.

Getting started

To begin using the Python API, start with the tutorial and the indexing operation guide, then refer to the detailed Python API reference.

For setup instructions, refer to the Building and Installing section.

For details on using a particular driver, refer to the driver and key-value storage reference.

Concepts

The core abstraction, a TensorStore, is an asynchronous view of a multi-dimensional array. Every TensorStore is backed by a driver, which connects the high-level TensorStore interface to an underlying data storage mechanism.

A TensorStore is opened or created using a JSON Spec, which plays a role analogous to a URL, file path, or database connection string.

TensorStore introduces a new indexing abstraction, the Index transform, which underlies all indexing operations. All indexing operations result in virtual views and are fully composable. Dimension labels are also supported, and can be used in indexing operations through the dimension expression mechanism.

Shared resources like in-memory caches and concurrency limits are configured using the context mechanism.

Properties of a TensorStore, like the domain, data type, chunk layout, fill value, and encoding, can be queried and specified/constrained in a uniform way using the schema mechanism.