Building and Installing

There are several ways to build and install TensorStore, depending on the intended use case.

Python API

The TensorStore Python API requires Python 3.5 or later (Python 2 is not supported).

Installation from PyPI package

The Python bindings can be installed directly from the tensorstore PyPI package using pip. It is recommended to first create a virtual environment.

To install the latest published version, use:

# Use -vv option to show progress
python3 -m pip install tensorstore -vv

Note

On Windows, you may instead need to use:

py -3 -m pip install tensorstore -vv

This is the simplest and fastest way to install the TensorStore Python bindings if you do not intend to change the TensorStore source code.

If a pre-built binary package is available for your specific platform and Python version, it will be used and no additional build tools are required. Otherwise, the package will be built from the source distribution and the normal build dependencies are required.
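
As a quick sanity check before or after running pip, the following minimal sketch compares the interpreter version against the minimum stated above and reports whether a tensorstore module can be located. The helper names python_supported and tensorstore_installed are illustrative only and are not part of TensorStore.

```python
import importlib.util
import sys

# Minimum interpreter version stated in this document.
MIN_PYTHON = (3, 5)


def python_supported(version_info=None):
    """Return True if the given (or running) interpreter meets MIN_PYTHON."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def tensorstore_installed():
    """Return True if a top-level `tensorstore` module can be located."""
    return importlib.util.find_spec("tensorstore") is not None


if __name__ == "__main__":
    print("Python version supported:", python_supported())
    print("tensorstore importable:", tensorstore_installed())
```

Locating the module does not import it, so this check stays cheap and does not confirm that the extension module actually loads.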

Installation from local checkout

If you intend to make changes to the TensorStore source code while simultaneously using TensorStore as a dependency, you can create a virtual environment and then install from a local checkout of the git repository:

git clone https://github.com/google/tensorstore
cd tensorstore
python3 setup.py develop

This invokes Bazel to build the TensorStore C++ extension module. You must have the required build dependencies.

After making changes to the C++ source code, you must re-run:

python3 setup.py develop

to rebuild the extension module. Rebuilds are incremental and will be much faster than the initial build.

Invoking python3 -m pip install -e . or python3 -m pip install . also works, but Bazel is then invoked from a temporary copy of the source tree, which prevents incremental rebuilds.

The build is affected by the following environment variables:

TENSORSTORE_BAZELISK

Path to the Bazelisk script that is invoked to execute the build. By default, the bundled bazelisk.py is used; set this variable to override it, for example to pass additional options.

BAZELISK_HOME

Path to cache directory used by Bazelisk for downloaded Bazel versions. Defaults to a platform-specific cache directory.

TENSORSTORE_BAZEL_COMPILATION_MODE

Bazel compilation mode to use. Defaults to opt (optimized build).

TENSORSTORE_BAZEL_STARTUP_OPTIONS

Additional Bazel startup options to specify when building. Multiple options may be separated by spaces; options containing spaces or other special characters should be encoded according to POSIX shell escaping rules as implemented by shlex.split().

This may be used to specify a non-standard cache directory:

TENSORSTORE_BAZEL_STARTUP_OPTIONS="--output_user_root /path/to/bazel_cache"

TENSORSTORE_BAZEL_BUILD_OPTIONS

Additional Bazel build options to specify when building. The encoding is the same as for TENSORSTORE_BAZEL_STARTUP_OPTIONS.

TENSORSTORE_PREBUILT_DIR

If specified, building is skipped, and instead setup.py expects to find the pre-built extension module in the specified directory, from a prior invocation of build_ext:

python3 setup.py build_ext -b /tmp/prebuilt
TENSORSTORE_PREBUILT_DIR=/tmp/prebuilt pip wheel .
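
The environment variables above can also be assembled from a script. The sketch below builds an environment mapping for a subprocess and quotes each option with shlex.quote() so that shlex.split() recovers the original list. The names encode_options and make_build_env and the example cache path are hypothetical. The accepted compilation modes are assumed to be those of Bazel's --compilation_mode flag.

```python
import os
import shlex

# Values accepted by Bazel's --compilation_mode flag (assumed to apply here).
BAZEL_COMPILATION_MODES = ("opt", "dbg", "fastbuild")


def encode_options(options):
    """Join options into one string that shlex.split() splits back apart."""
    return " ".join(shlex.quote(opt) for opt in options)


def make_build_env(mode="opt", startup_options=(), build_options=(), base_env=None):
    """Return a copy of the environment with the TensorStore build variables set."""
    if mode not in BAZEL_COMPILATION_MODES:
        raise ValueError("unknown compilation mode: %r" % (mode,))
    env = dict(os.environ if base_env is None else base_env)
    env["TENSORSTORE_BAZEL_COMPILATION_MODE"] = mode
    if startup_options:
        env["TENSORSTORE_BAZEL_STARTUP_OPTIONS"] = encode_options(startup_options)
    if build_options:
        env["TENSORSTORE_BAZEL_BUILD_OPTIONS"] = encode_options(build_options)
    return env


if __name__ == "__main__":
    # Hypothetical cache path containing a space, to show the quoting.
    env = make_build_env(
        mode="dbg",
        startup_options=["--output_user_root", "/path/to/bazel cache"],
    )
    print(env["TENSORSTORE_BAZEL_STARTUP_OPTIONS"])
```

The resulting mapping can be passed as the env argument of subprocess.run() when invoking python3 setup.py develop.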

IPython shell without installing

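To start an IPython shell with the TensorStore Python bindings available, without installing them first, run:

python bazelisk.py run -c opt //python/tensorstore:shell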

Publishing a PyPI package

To build a source package:

python3 setup.py sdist

To build a binary package:

python3 setup.py bdist_wheel

The packages are written to the dist/ sub-directory.

C++ API

Currently, use of the TensorStore C++ API is only supported from projects built using Bazel. CMake support will be added in the future.

To add TensorStore as a dependency to an existing Bazel workspace:

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
load("@bazel_tools//tools/build_defs/repo:utils.bzl", "maybe")

maybe(
    http_archive,
    name = "com_github_google_tensorstore",
    strip_prefix = "tensorstore-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    url = "https://github.com/google/tensorstore/archive/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    sha256 = "YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY",
)

Additionally, TensorStore must be built in C++17 mode. You should add the compiler flags specified in the .bazelrc file in the TensorStore repository to your dependent project’s .bazelrc.

Development

For development of TensorStore, ensure that you have the required build dependencies.

Building the documentation

python bazelisk.py run //tools/docs:build_docs -- --output /tmp/tensorstore-docs

Running tests

python bazelisk.py test //...

Build dependencies

TensorStore is written in C++ and is compatible with the following C++ compilers:

  • GCC 9 or later (Linux)

  • Clang 8 or later (Linux)

  • Microsoft Visual Studio 2019 version 16.4 (MSVC 19.24) or later

  • Clang-cl 9 or later (Windows)

  • Apple Xcode 11.3.1 or later (earlier versions of Xcode 11 have a code generation bug related to stack alignment)
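
The minimum versions in the list above can be compared mechanically. The sketch below is one possible way to do so. The dictionary keys are illustrative labels rather than names used by TensorStore or Bazel, and it assumes you have already obtained the compiler's version as a dotted string.

```python
# Minimum versions from the list above, stored as tuples so that Python's
# element-wise tuple comparison can be used. "msvc" refers to the MSVC
# compiler version (19.24), not the Visual Studio product version (16.4).
MINIMUM_COMPILER_VERSIONS = {
    "gcc": (9,),
    "clang": (8,),
    "msvc": (19, 24),
    "clang-cl": (9,),
    "xcode": (11, 3, 1),
}


def parse_version(text):
    """Convert a dotted version string such as '11.3.1' to a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(compiler, version):
    """Return True if `version` is at least the documented minimum."""
    return parse_version(version) >= MINIMUM_COMPILER_VERSIONS[compiler]


if __name__ == "__main__":
    print(meets_minimum("gcc", "9.3.0"))
    print(meets_minimum("xcode", "11.3"))
```

Because a shorter tuple compares as smaller than a longer one with the same prefix, Xcode 11.3 is correctly treated as older than 11.3.1.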

TensorStore uses the Bazel build system. You don’t need to install Bazel manually; the included copy of bazelisk automatically downloads a suitable version for your operating system. Bazelisk requires Python to run.

Note

On macOS, starting with Python 3.6, installing Python using the installer from python.org does not automatically set up Python with the SSL/TLS certificates needed by bazelisk.

If you have not already done so, you need to run the /Applications/Python 3.x/Install Certificates.command script in your Python installation directory. Refer to the documentation at /Applications/Python 3.x/ReadMe.rtf for more information.
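
To see whether the interpreter can find any default CA certificates at all, a rough check along the following lines may help. It is only a heuristic built on the standard ssl module: the function names are illustrative, and a positive result does not guarantee that bazelisk's download will succeed.

```python
import os
import ssl


def ca_certificate_paths():
    """Return the default CA file and directory paths known to the ssl module."""
    paths = ssl.get_default_verify_paths()
    candidates = (
        paths.cafile,
        paths.capath,
        paths.openssl_cafile,
        paths.openssl_capath,
    )
    # Entries are None or empty when the ssl module has no such location.
    return [p for p in candidates if p]


def has_ca_certificates():
    """Return True if at least one default CA location exists on disk."""
    return any(os.path.exists(p) for p in ca_certificate_paths())


if __name__ == "__main__":
    if has_ca_certificates():
        print("Found a default CA certificate location.")
    else:
        print("No default CA certificates found; see the note above.")
```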

TensorStore depends on a number of third-party libraries. By default, these dependencies are fetched and built automatically as part of the TensorStore build, which requires no additional effort.

On Linux and macOS, however, it is possible to override this behavior for a subset of these libraries and instead link to a system-provided version. This reduces the binary size, and if your system packages are kept up to date, ensures TensorStore uses up-to-date versions of these dependencies.

TENSORSTORE_SYSTEM_LIBS

To use system-provided libraries, set the TENSORSTORE_SYSTEM_LIBS environment variable to a comma-separated list of the following identifiers prior to invoking Bazel:

Required third-party libraries

Identifier              Bundled library   Version
com_google_boringssl    boringssl         bdbe37905216
org_sourceware_bzip2    bzip2             1.0.8
org_blosc_cblosc        c-blosc           1.21.0
se_curl                 curl              7.74.0
jpeg                    libjpeg-turbo     2.0.5
org_lz4                 lz4               1.9.3
nasm                    nasm              2.13.03
org_nghttp2             nghttp2           1.42.0
com_google_snappy       snappy            1.1.8
org_tukaani_xz          xz                5.2.5
net_zlib                zlib              1.2.11
com_facebook_zstd       zstd              1.4.8

For example, to run the tests using the system-provided curl, jpeg, and SSL libraries:

export TENSORSTORE_SYSTEM_LIBS=se_curl,jpeg,com_google_boringssl
python bazelisk.py test //...
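
When the list of system libraries is generated by a script, it can be worth validating the identifiers before exporting the variable. The following sketch does this against the identifiers in the table above. The function name format_system_libs is hypothetical, and the check reflects only this document's table, not whatever the build itself accepts.

```python
# Identifiers from the "Required third-party libraries" table above.
SYSTEM_LIB_IDENTIFIERS = frozenset([
    "com_google_boringssl",
    "org_sourceware_bzip2",
    "org_blosc_cblosc",
    "se_curl",
    "jpeg",
    "org_lz4",
    "nasm",
    "org_nghttp2",
    "com_google_snappy",
    "org_tukaani_xz",
    "net_zlib",
    "com_facebook_zstd",
])


def format_system_libs(identifiers):
    """Return a comma-separated value for TENSORSTORE_SYSTEM_LIBS.

    Raises ValueError if any identifier is not listed in the table.
    """
    identifiers = list(identifiers)
    unknown = sorted(set(identifiers) - SYSTEM_LIB_IDENTIFIERS)
    if unknown:
        raise ValueError("unknown identifiers: " + ", ".join(unknown))
    return ",".join(identifiers)


if __name__ == "__main__":
    print(format_system_libs(["se_curl", "jpeg", "com_google_boringssl"]))
```

The returned string can be assigned to TENSORSTORE_SYSTEM_LIBS in the environment of the process that invokes Bazel.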