Indexing

tensorstore.TensorStore (and objects of other tensorstore.Indexable types) support a common set of indexing operations for read/write access to individual positions and subsets of positions. In addition to full support for NumPy-style basic and advanced indexing, dimension expressions provide additional indexing capabilities integrated with TensorStore’s support for labeled/named dimensions and non-zero origins.

Note

In TensorStore, all indexing operations result in a (read/write) view of the original object, represented as a new object of the same type with a different tensorstore.IndexDomain. Indexing operations never implicitly perform I/O or copy data. This differs from NumPy indexing, where basic indexing results in a view of the original data, but advanced indexing always results in a copy.

NumPy-style indexing

NumPy-style indexing is performed using the syntax obj[expr], where obj is any tensorstore.Indexable object and the indexing expression expr is one of:

  • an integer (see Integer indexing);

  • a slice object start:stop:step, e.g. obj[:] or obj[3:5] or obj[1:7:2], where the start, stop, or step values are each None, integers, or sequences of integer or None values (see Interval indexing);

  • tensorstore.newaxis or None (see Adding singleton dimensions);

  • ... or Ellipsis (see Ellipsis);

  • array_like with integer data type (see Integer array indexing);

  • array_like with bool data type (see Boolean array indexing);

  • a tuple of any of the above, e.g. obj[1, 2, :, 3] or obj[1, ..., :, [0, 2, 3]].

This form of indexing always operates on a prefix of the dimensions, consuming dimensions from the existing domain and adding dimensions to the resultant domain in order; if the indexing expression consumes fewer than obj.rank dimensions, the remaining dimensions are retained unchanged as if indexed by :.

Integer indexing

Indexing with an integer selects a single position within the corresponding dimension:

>>> x = ts.array([[0, 1, 2], [3, 4, 5]], dtype=ts.int32)
>>> x[1]
TensorStore({
  'array': [3, 4, 5],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})
>>> x[1, 2]
TensorStore({
  'array': 5,
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_rank': 0},
})

Each integer index consumes a single dimension from the original domain and adds no dimensions to the result domain.

Because TensorStore supports index domains defined over negative indices, negative values have no special meaning; they simply refer to negative positions:

>>> x = await ts.open({
...     "dtype": "int32",
...     "driver": "array",
...     "array": [1, 2, 3],
...     "transform": {
...         "input_shape": [3],
...         "input_inclusive_min": [-10],
...         "output": [{
...             "input_dimension": 0,
...             "offset": 10
...         }],
...     },
... })
>>> x[-10]
TensorStore({
  'array': 1,
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_rank': 0},
})

Warning

This differs from the behavior of the built-in sequence types and numpy.ndarray, where a negative index specifies a position relative to the end (upper bound).

Specifying an index outside the explicit bounds of a dimension results in an immediate error:

>>> x = ts.array([0, 1, 2, 3], dtype=ts.int32)
>>> x[4]
Traceback (most recent call last):
    ...
IndexError: Checking bounds of constant output index map for dimension 0: Index 4 is outside valid range [0, 4)
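The two rules above (negative values are ordinary positions, and explicit bounds are checked immediately) can be illustrated with a small pure-Python model. This is only a sketch: resolve_position is a hypothetical helper, not part of the TensorStore API, and the error text is merely modeled on the message shown above.

```python
def resolve_position(index, inclusive_min, exclusive_max):
    """Map a domain position to a zero-based storage offset (toy model).

    Hypothetical helper, not part of the TensorStore API. Unlike list
    indexing, a negative ``index`` is an ordinary position, not an offset
    from the end.
    """
    if not inclusive_min <= index < exclusive_max:
        raise IndexError(
            f"Index {index} is outside valid range "
            f"[{inclusive_min}, {exclusive_max})")
    return index - inclusive_min


data = [1, 2, 3]
# Domain [-10, -7): position -10 refers to the first element.
first = data[resolve_position(-10, -10, -7)]
# A built-in list instead counts -1 back from the end.
last = data[-1]
```

Under this model, first is 1 and last is 3, and resolve_position(4, 0, 4) raises IndexError instead of wrapping around.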

Specifying an index outside the implicit bounds of a dimension is permitted:

>>> y = ts.IndexTransform(input_shape=[4], implicit_lower_bounds=[True])
>>> y[-1]
Rank 0 -> 1 index space transform:
  Input domain:
  Output index maps:
    out[0] = -1
>>> y[4]
Traceback (most recent call last):
    ...
IndexError: Checking bounds of constant output index map for dimension 0: Index 4 is outside valid range (-inf, 4)

While implicit bounds do not constrain indexing operations, the bounds will still be checked by any subsequent read or write operation, which will fail if any index is actually out of bounds.

Note

In addition to the int type, integer indices may be specified using any object that supports the __index__ protocol (PEP 357), including NumPy integer scalar types.

Interval indexing

Indexing with a slice object start:stop:step selects an interval or strided interval within the corresponding dimension:

>>> x = ts.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=ts.int32)
>>> x[1:5]
TensorStore({
  'array': [1, 2, 3, 4],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [5],
    'input_inclusive_min': [1],
    'output': [{'input_dimension': 0, 'offset': -1}],
  },
})

As for the built-in sequence types, the start value is inclusive while the stop value is exclusive.

Each of start, stop, and step may be an integer, None, or omitted (equivalent to specifying None). Specifying None for start or stop retains the existing lower or upper bound, respectively, for the dimension. Specifying None for step is equivalent to specifying 1.

When the step is 1, the domain of the resulting sliced dimension is not translated to have an origin of zero; instead, it has an origin equal to the start position of the interval (or the existing origin if the start position is unspecified):

>>> x[1:5][2]
TensorStore({
  'array': 2,
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_rank': 0},
})

If the step is not 1, the origin of the resulting sliced dimension is equal to the start position divided by the step value, rounded towards zero:

>>> x[3:8:2]
TensorStore({
  'array': [3, 5, 7],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [4],
    'input_inclusive_min': [1],
    'output': [{'input_dimension': 0, 'offset': -1}],
  },
})
>>> x[7:3:-2]
TensorStore({
  'array': [7, 5],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [-1],
    'input_inclusive_min': [-3],
    'output': [{'input_dimension': 0, 'offset': 3}],
  },
})
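The origin rule for strided intervals can be checked with a short pure-Python model. This is a sketch under stated assumptions: trunc_div, sliced_domain, and selected_positions are hypothetical names, start and stop must be explicit integers, and no bounds checking is performed.

```python
def trunc_div(a, b):
    """Integer division rounded towards zero (Python's // rounds down)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def sliced_domain(start, stop, step=1):
    """Return (inclusive_min, exclusive_max) of the sliced dimension.

    Toy model of the rule described above, not TensorStore code.
    """
    if step == 0:
        raise ValueError("step must be non-zero")
    count = len(range(start, stop, step))  # number of selected positions
    origin = start if step == 1 else trunc_div(start, step)
    return origin, origin + count


def selected_positions(start, stop, step=1):
    """Map each index of the new domain to the original position it selects."""
    origin, end = sliced_domain(start, stop, step)
    return {i: start + (i - origin) * step for i in range(origin, end)}
```

For example, sliced_domain(3, 8, 2) gives (1, 4) and sliced_domain(7, 3, -2) gives (-3, -1), matching the input_inclusive_min and input_exclusive_max values shown in the two examples above.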

It is an error to specify an interval outside the explicit bounds of a dimension:

>>> x[3:12]
Traceback (most recent call last):
    ...
IndexError: Computing interval slice for dimension 0: Slice interval [3, 12) is not contained within domain [0, 10)

Warning

This behavior differs from that of the built-in sequence types and numpy.ndarray, where any out-of-bounds indices within the interval are silently skipped.
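The contrast with built-in sequences can be seen in a few lines of plain Python. The strict_interval helper below is a hypothetical model of the containment check, assuming a non-empty interval with unit step; it is not TensorStore code.

```python
def strict_interval(start, stop, inclusive_min, exclusive_max):
    """Return (start, stop) unchanged, or raise if it leaves the domain.

    Toy model only; assumes a non-empty interval with unit step.
    """
    if start < inclusive_min or stop > exclusive_max:
        raise IndexError(
            f"Slice interval [{start}, {stop}) is not contained within "
            f"domain [{inclusive_min}, {exclusive_max})")
    return start, stop


values = list(range(10))
# A built-in list silently stops at the end of the sequence.
clamped = values[3:12]
```

Here clamped holds the seven values 3 through 9, whereas strict_interval(3, 12, 0, 10) raises IndexError.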

Specifying an interval outside the implicit bounds of a dimension is permitted:

>>> y = ts.IndexTransform(input_shape=[4], implicit_lower_bounds=[True])
>>> y[-1:2]
Rank 1 -> 1 index space transform:
  Input domain:
    0: [-1, 2)
  Output index maps:
    out[0] = 0 + 1 * in[0]

If a non-None value is specified for start or stop, the lower or upper bound, respectively, of the resultant dimension will be marked explicit. If None is specified for start or stop, the lower or upper bound, respectively, of the resultant dimension will be marked explicit if the corresponding original bound is marked explicit.

As with integer indexing, negative start or stop values have no special meaning, and simply indicate negative positions.

Any of the start, stop, or step values may be specified as a sequence of integer or None values (e.g. a list, tuple or 1-d numpy.ndarray), rather than a single integer:

>>> x = ts.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
...              dtype=ts.int32)
>>> x[(1, 1):(3, 4)]
TensorStore({
  'array': [[6, 7, 8], [10, 11, 12]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [3, 4],
    'input_inclusive_min': [1, 1],
    'output': [
      {'input_dimension': 0, 'offset': -1},
      {'input_dimension': 1, 'offset': -1},
    ],
  },
})

This is equivalent to specifying a sequence of slice objects:

>>> x[1:3, 1:4]
TensorStore({
  'array': [[6, 7, 8], [10, 11, 12]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [3, 4],
    'input_inclusive_min': [1, 1],
    'output': [
      {'input_dimension': 0, 'offset': -1},
      {'input_dimension': 1, 'offset': -1},
    ],
  },
})

It is an error to specify a slice with sequences of unequal lengths, but a sequence may be combined with a scalar value:

>>> x[1:(3, 4)]
TensorStore({
  'array': [[6, 7, 8], [10, 11, 12]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [3, 4],
    'input_inclusive_min': [1, 1],
    'output': [
      {'input_dimension': 0, 'offset': -1},
      {'input_dimension': 1, 'offset': -1},
    ],
  },
})
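The equivalence between a slice with sequence-valued fields and a tuple of ordinary slices can be sketched in pure Python. The expand_slice function is a hypothetical helper written for illustration; the choice of ValueError for unequal lengths is an assumption, since the text above only states that this case is an error.

```python
def expand_slice(s):
    """Expand a slice with sequence-valued fields into one slice per dimension.

    Toy model, not TensorStore code. Scalar fields are repeated for every
    dimension; sequence fields must all have the same length.
    """
    fields = (s.start, s.stop, s.step)
    lengths = {len(f) for f in fields if isinstance(f, (list, tuple))}
    if len(lengths) > 1:
        raise ValueError("sequences in a slice must have equal lengths")
    if not lengths:
        return (s,)  # ordinary slice: nothing to expand
    (n,) = lengths

    def at(field, i):
        # Pick the i-th element of a sequence field, or reuse a scalar.
        return field[i] if isinstance(field, (list, tuple)) else field

    return tuple(
        slice(at(s.start, i), at(s.stop, i), at(s.step, i)) for i in range(n))
```

With this model, both expand_slice(slice((1, 1), (3, 4))) and expand_slice(slice(1, (3, 4))) produce (slice(1, 3), slice(1, 4)), mirroring the x[1:3, 1:4] form above.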

Adding singleton dimensions

Specifying a value of tensorstore.newaxis (equal to None) adds a new dummy/singleton dimension with implicit bounds [0, 1):

>>> x = ts.IndexTransform(input_rank=2)
>>> x[ts.newaxis]
Rank 3 -> 2 index space transform:
  Input domain:
    0: [0*, 1*)
    1: (-inf*, +inf*)
    2: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[1]
    out[1] = 0 + 1 * in[2]

This indexing term consumes no dimensions from the original domain and adds a single dimension after any dimensions added by prior indexing operations:

>>> x[:, ts.newaxis, ts.newaxis]
Rank 4 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*)
    1: [0*, 1*)
    2: [0*, 1*)
    3: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[3]

Because the added dimension has implicit bounds, it may be given arbitrary bounds by a subsequent interval indexing term:

>>> x[ts.newaxis][3:10]
Rank 3 -> 2 index space transform:
  Input domain:
    0: [3, 10)
    1: (-inf*, +inf*)
    2: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[1]
    out[1] = 0 + 1 * in[2]

Ellipsis

Specifying the special Ellipsis value (...) is equivalent to specifying as many full slices : as needed to consume the remaining dimensions of the original domain not consumed by other indexing terms:

>>> x = ts.array([[[1, 2, 3], [4, 5, 6]]], dtype=ts.int32)
>>> x[..., 1]
TensorStore({
  'array': [2, 5],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [1, 2],
    'input_inclusive_min': [0, 0],
    'output': [{'input_dimension': 1}],
  },
})

At most one Ellipsis may be specified within a single NumPy-style indexing expression:

>>> x[..., 1, ...]
Traceback (most recent call last):
    ...
IndexError: An index can only have a single ellipsis (`...`)

As a complete indexing expression, Ellipsis has no effect and is equivalent to the empty tuple (), but can still be useful for the purpose of an assignment:

>>> x = ts.array([0, 1, 2, 3], dtype=ts.int32)
>>> x[...] = 7
>>> x
TensorStore({
  'array': [7, 7, 7, 7],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [4], 'input_inclusive_min': [0]},
})

Integer array indexing

Specifying an array_like index array of integer values selects the coordinates of the dimension given by the elements of the array:

>>> x = ts.array([5, 4, 3, 2], dtype=ts.int32)
>>> x[[0, 3, 3]]
TensorStore({
  'array': [5, 2, 2],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})
>>> x[[[0, 1], [2, 3]]]
TensorStore({
  'array': [[5, 4], [3, 2]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 2], 'input_inclusive_min': [0, 0]},
})

This indexing term consumes a single dimension from the original domain, and when the full indexing expression involves just a single array indexing term, adds the dimensions of the index array to the result domain.

As with integer and interval indexing, and unlike NumPy, negative values in an index array have no special meaning, and simply indicate negative positions.

When a single indexing expression includes multiple index arrays, vectorized array indexing semantics apply by default: the shapes of the index arrays must all be broadcast-compatible, and the dimensions of the single broadcasted domain are added to the result domain:

>>> x = ts.array([[1, 2], [3, 4], [5, 6]], dtype=ts.int32)
>>> x[[0, 1, 2], [0, 1, 0]]
TensorStore({
  'array': [1, 4, 5],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})
>>> x[[[0, 1], [2, 2]], [[0, 1], [1, 0]]]
TensorStore({
  'array': [[1, 4], [6, 5]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 2], 'input_inclusive_min': [0, 0]},
})
>>> x[[[0, 1], [2, 2]], [0, 1]]
TensorStore({
  'array': [[1, 4], [5, 6]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 2], 'input_inclusive_min': [0, 0]},
})
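What "broadcast-compatible" means for the index array shapes can be sketched as follows, assuming the usual NumPy rule: shapes are aligned at their trailing dimensions, and each pair of extents must be equal or one of them must be 1. The broadcast_shape function is a hypothetical pure-Python helper, not a TensorStore or NumPy call.

```python
def broadcast_shape(*shapes):
    """Return the common broadcast shape, or raise if shapes are incompatible.

    Toy model of NumPy-style broadcasting over plain tuples of ints.
    """
    rank = max((len(s) for s in shapes), default=0)
    result = []
    for axis in range(1, rank + 1):  # walk from the trailing dimension
        size = 1
        for shape in shapes:
            if axis > len(shape):
                continue  # missing leading dimensions act like extent 1
            extent = shape[-axis]
            if extent == 1:
                continue  # extent 1 stretches to match anything
            if size == 1:
                size = extent
            elif extent != size:
                raise ValueError(f"shapes {shapes} are not broadcast-compatible")
        result.append(size)
    return tuple(reversed(result))
```

In the last example above, the index arrays have shapes (2, 2) and (2,), for which broadcast_shape((2, 2), (2,)) gives (2, 2), the shape of the result domain.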

If all of the index arrays are applied to consecutive dimensions without any interleaved slice, Ellipsis, or tensorstore.newaxis terms (interleaved integer index terms are permitted), then by default legacy NumPy semantics are used: the dimensions of the broadcasted array domain are added inline to the result domain after any dimensions added by prior indexing terms in the indexing expression:

>>> x = ts.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]], dtype=ts.int32)
>>> x[:, [1, 0], [1, 1]]
TensorStore({
  'array': [[4, 2], [8, 6]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 2], 'input_inclusive_min': [0, 0]},
})

If there are any interleaved slice, Ellipsis, or tensorstore.newaxis terms, then instead the dimensions of the broadcasted array domain are added as the first dimensions of the result domain:

>>> x[:, [1, 0], ts.newaxis, [1, 1]]
TensorStore({
  'array': [[4, 8], [2, 6]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 2, [1]],
    'input_inclusive_min': [0, 0, [0]],
    'output': [{'input_dimension': 0}, {'input_dimension': 1}],
  },
})

To ensure that the added array domain dimensions are added as the first dimensions of the result domain regardless of whether there are any interleaved slice, Ellipsis, or tensorstore.newaxis terms, use the vindex indexing method.

To instead perform outer array indexing, where each index array is applied orthogonally, use the oindex indexing method.

Note

The legacy NumPy indexing behavior, whereby array domain dimensions are added either inline or as the first dimensions depending on whether the index arrays are applied to consecutive dimensions, is the default behavior for compatibility with NumPy but may be confusing. It is recommended to instead use either the vindex or oindex indexing method for less confusing behavior when using multiple index arrays.
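The placement rule of the default mode can be summarized by a small decision function. This is a sketch only: array_dims_placement and its string tags for term kinds are hypothetical, chosen for illustration, and do not correspond to any TensorStore API.

```python
SEPARATORS = {"slice", "ellipsis", "newaxis"}


def array_dims_placement(terms):
    """Decide where the broadcast array dimensions go in the default mode.

    ``terms`` is a sequence of tags: 'int', 'slice', 'ellipsis', 'newaxis',
    or 'array'. Returns 'inline', 'first', or None if there is no array term.
    Toy model of the rule described above.
    """
    positions = [i for i, t in enumerate(terms) if t == "array"]
    if not positions:
        return None
    # Only terms lying between the first and last array term matter.
    between = terms[positions[0]:positions[-1] + 1]
    return "first" if any(t in SEPARATORS for t in between) else "inline"
```

For x[:, [1, 0], [1, 1]] the tags are slice, array, array, which gives "inline"; for x[:, [1, 0], ts.newaxis, [1, 1]] the newaxis term between the two arrays gives "first". Under vindex the answer would always be "first", which is why that mode is easier to reason about.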

Boolean array indexing

Specifying an array_like of bool values is equivalent to specifying a sequence of integer index arrays containing the coordinates of True values (in C order), e.g. as obtained from numpy.nonzero.

Specifying a 1-d bool array is equivalent to a single index array of the non-zero coordinates:

>>> x = ts.array([0, 1, 2, 3, 4], dtype=ts.int32)
>>> x[[True, False, True, True]]
TensorStore({
  'array': [0, 2, 3],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})
>>> # equivalent, using index array
>>> x[[0, 2, 3]]
TensorStore({
  'array': [0, 2, 3],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})

More generally, specifying an n-dimensional bool array is equivalent to specifying n index arrays, where the ith index array specifies the ith coordinate of the True values:

>>> x = ts.array([[0, 1, 2], [3, 4, 5]], dtype=ts.int32)
>>> x[[[True, False, False], [True, True, False]]]
TensorStore({
  'array': [0, 3, 4],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})
>>> # equivalent, using index arrays
>>> x[[0, 1, 1], [0, 0, 1]]
TensorStore({
  'array': [0, 3, 4],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3], 'input_inclusive_min': [0]},
})

This indexing term consumes n dimensions from the original domain, where n is the rank of the bool array.
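The conversion from a bool array to index arrays can be sketched in pure Python over nested lists. The true_coordinates function is a hypothetical stand-in for what numpy.nonzero computes; the rank is passed explicitly as a simplifying assumption.

```python
def true_coordinates(mask, rank):
    """Return ``rank`` index lists giving the coordinates of True entries.

    ``mask`` is a nested list of bools of depth ``rank``. Entries are visited
    in C order (last dimension varies fastest). Toy model only.
    """
    coords = [[] for _ in range(rank)]

    def visit(node, prefix):
        if len(prefix) == rank:
            if node:
                # Record the i-th coordinate in the i-th index list.
                for axis, i in enumerate(prefix):
                    coords[axis].append(i)
            return
        for i, child in enumerate(node):
            visit(child, prefix + (i,))

    visit(mask, ())
    return coords
```

Applied to the 2-d mask above, true_coordinates([[True, False, False], [True, True, False]], 2) yields [[0, 1, 1], [0, 0, 1]], which are the two index arrays used in the equivalent expression.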

It is perfectly valid to mix boolean array indexing with other forms of indexing, including integer array indexing, with exactly the same result as if the boolean array were replaced by the equivalent sequence of integer index arrays:

>>> x = ts.array([[0, 1, 2], [3, 4, 5], [7, 8, 9]], dtype=ts.int32)
>>> x[[True, False, True], [2, 1]]
TensorStore({
  'array': [2, 8],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2], 'input_inclusive_min': [0]},
})
>>> # equivalent, using index array
>>> x[[0, 2], [2, 1]]
TensorStore({
  'array': [2, 8],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2], 'input_inclusive_min': [0]},
})

Warning

Mixing boolean and integer index arrays in the default vectorized indexing mode, while supported for compatibility with NumPy, is likely to be confusing. In most cases of mixed boolean and integer array indexing, outer indexing mode provides more useful behavior.

The scalar values True and False are treated as zero-rank boolean arrays. Zero-rank boolean arrays are supported, but there is no equivalent integer index array representation. If there are no other integer or boolean arrays, specifying a zero-rank boolean array is equivalent to specifying tensorstore.newaxis, except that the added dimension has explicit rather than implicit bounds, and in the case of a False array the added dimension has the empty bounds of [0, 0):

>>> x = ts.IndexTransform(input_rank=2)
>>> x[:, True]
Rank 3 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*)
    1: [0, 1)
    2: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[2]
>>> x[:, False]
Rank 3 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*)
    1: [0, 0)
    2: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[2]

If there are other integer or boolean arrays, specifying a zero-rank boolean array has no effect except that:

  1. the other index array shapes must be broadcast-compatible with the shape [0] in the case of a False zero-rank array, meaning they are all empty arrays (in the case of a True zero-rank array, the other index array shapes must be broadcast-compatible with the shape [1], which is always satisfied);

  2. in legacy NumPy indexing mode, if it is separated from another integer or boolean array term by a slice, Ellipsis, or tensorstore.newaxis, it causes the dimensions of the broadcast array domain to be added as the first dimensions of the result domain:

>>> # Index array dimension added to result domain inline
>>> x[:, True, [0, 1]]
Rank 2 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*)
    1: [0, 2)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * bounded((-inf, +inf), array(in)), where array =
      {{0, 1}}
>>> x[:, False, []]
Rank 2 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*)
    1: [0, 0)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0
>>> # Index array dimensions added as first dimension of result domain
>>> x[True, :, [0, 1]]
Rank 2 -> 2 index space transform:
  Input domain:
    0: [0, 2)
    1: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[1]
    out[1] = 0 + 1 * bounded((-inf, +inf), array(in)), where array =
      {{0}, {1}}
>>> x[False, :, []]
Rank 2 -> 2 index space transform:
  Input domain:
    0: [0, 0)
    1: (-inf*, +inf*)
  Output index maps:
    out[0] = 0 + 1 * in[1]
    out[1] = 0

Note

Zero-rank boolean arrays are supported for consistency and for compatibility with NumPy, but are rarely useful.

Differences compared to NumPy indexing

TensorStore indexing has near-perfect compatibility with NumPy, but there are a few differences to be aware of:

  • Negative indices have no special meaning in TensorStore, and simply refer to negative positions. TensorStore does not support an equivalent shortcut syntax to specify a position n relative to the upper bound of a dimension; instead, it must be specified explicitly, e.g. x[x.domain[0].exclusive_max - n].

  • In TensorStore, out-of-bounds intervals specified by a slice result in an error. In NumPy, out-of-bounds indices specified by a slice are silently truncated.

  • To specify a sequence of indexing terms when using the syntax obj[expr] in TensorStore, expr must be a tuple. In NumPy, for compatibility with its predecessor library Numeric, if expr is a list or other non-numpy.ndarray sequence type containing at least one slice, Ellipsis, or None value, it is interpreted the same as a tuple (this behavior is deprecated in NumPy since version 1.15.0). TensorStore, in contrast, will attempt to convert any non-tuple sequence to an integer or boolean array, which results in an error if the sequence contains a slice, Ellipsis, or None value.

Vectorized indexing mode (vindex)

The expression obj.vindex[expr], where obj is any tensorstore.Indexable object and expr is a valid NumPy-style indexing expression, has a similar effect to obj[expr] except that if expr specifies any array indexing terms, the broadcasted array dimensions are unconditionally added as the first dimensions of the result domain:

>>> x = ts.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]], dtype=ts.int32)
>>> x.vindex[:, [1, 0], [1, 1]]
TensorStore({
  'array': [[4, 8], [2, 6]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 2], 'input_inclusive_min': [0, 0]},
})

This avoids the potentially-confusing behavior of the default legacy NumPy semantics, under which the broadcasted array dimensions are added inline to the result domain if none of the array indexing terms are separated by a slice, Ellipsis, or tensorstore.newaxis term.

Note

If expr does not include any array indexing terms, obj.vindex[expr] is exactly equivalent to obj[expr].

This indexing method is similar to the behavior of:

Outer indexing mode (oindex)

The expression obj.oindex[expr], where obj is any tensorstore.Indexable object and expr is a valid NumPy-style indexing expression, performs outer/orthogonal indexing. The effect is similar to obj[expr], but differs in that any integer or boolean array indexing terms are applied orthogonally:

>>> x = ts.array([[0, 1, 2], [3, 4, 5]], dtype=ts.int32)
>>> x.oindex[[0, 0, 1], [1, 2]]
TensorStore({
  'array': [[1, 2], [1, 2], [4, 5]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3, 2], 'input_inclusive_min': [0, 0]},
})
>>> # equivalent, using boolean array
>>> x.oindex[[0, 0, 1], [False, True, True]]
TensorStore({
  'array': [[1, 2], [1, 2], [4, 5]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [3, 2], 'input_inclusive_min': [0, 0]},
})
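The difference between outer and vectorized indexing can be seen on plain nested lists. Both functions below are hypothetical pure-Python models restricted to two dimensions and 1-d index lists; the vectorized version requires equal lengths rather than implementing full broadcasting.

```python
def outer_index(rows, row_idx, col_idx):
    """Outer/orthogonal indexing: every row index pairs with every column index."""
    return [[rows[r][c] for c in col_idx] for r in row_idx]


def vectorized_index(rows, row_idx, col_idx):
    """Vectorized indexing: row and column indices are paired element-wise."""
    if len(row_idx) != len(col_idx):
        raise ValueError("index lists must have equal length in this toy model")
    return [rows[r][c] for r, c in zip(row_idx, col_idx)]


x = [[0, 1, 2], [3, 4, 5]]
outer = outer_index(x, [0, 0, 1], [1, 2])
paired = vectorized_index(x, [0, 1], [1, 2])
```

Here outer is [[1, 2], [1, 2], [4, 5]], matching the oindex result above with one result dimension per index list, while paired is [1, 5], a single dimension shared by both index lists.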

Unlike in the default or the vindex indexing modes, the index array shapes need not be broadcast-compatible; instead, the dimensions of each index array (or the 1-d index array equivalent of a boolean array) are added to the result domain immediately after any dimensions added by the previous indexing terms:

>>> x = ts.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]], dtype=ts.int32)
>>> x.oindex[[1, 0], :, [0, 0, 1]]
TensorStore({
  'array': [[[5, 5, 6], [7, 7, 8]], [[1, 1, 2], [3, 3, 4]]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 2, 3],
    'input_inclusive_min': [0, 0, 0],
  },
})

Each boolean array indexing term adds a single dimension to the result domain:

>>> x.oindex[[[True, False], [False, True]], [1, 0]]
TensorStore({
  'array': [[2, 1], [8, 7]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {'input_exclusive_max': [2, 2], 'input_inclusive_min': [0, 0]},
})

Note

If expr does not include any array indexing terms, obj.oindex[expr] is exactly equivalent to obj[expr].

This indexing method is similar to the behavior of:

Dimension expressions

Dimension expressions provide an alternative indexing mechanism to NumPy-style indexing that is more powerful and expressive and supports dimension labels (but can be more verbose):

>>> x = ts.array([[[0, 1], [2, 3], [4, 5]], [[6, 7], [8, 9], [10, 11]]],
...              dtype=ts.int32)
>>> # Label the dimensions "x", "y", "z"
>>> x = x[ts.d[:].label["x", "y", "z"]]
>>> x
TensorStore({
  'array': [[[0, 1], [2, 3], [4, 5]], [[6, 7], [8, 9], [10, 11]]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 3, 2],
    'input_inclusive_min': [0, 0, 0],
    'input_labels': ['x', 'y', 'z'],
  },
})
>>> # Select the x=0 slice
>>> x[ts.d["x"][0]]
TensorStore({
  'array': [[0, 1], [2, 3], [4, 5]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [3, 2],
    'input_inclusive_min': [0, 0],
    'input_labels': ['y', 'z'],
  },
})
>>> # Select the y=1, x=0 slice
>>> x[ts.d["y", "x"][1, 0]]
TensorStore({
  'array': [2, 3],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2],
    'input_inclusive_min': [0],
    'input_labels': ['z'],
  },
})
>>> # Transpose "x" and "z"
>>> x[ts.d["x", "z"].transpose[2, 0]]
TensorStore({
  'array': [[[0, 6], [2, 8], [4, 10]], [[1, 7], [3, 9], [5, 11]]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 3, 2],
    'input_inclusive_min': [0, 0, 0],
    'input_labels': ['z', 'y', 'x'],
  },
})
>>> # Select the x=d, y=d diagonal, and transpose "d" to end
>>> x[ts.d["x", "y"].diagonal.label["d"].transpose[-1]]
TensorStore({
  'array': [[0, 8], [1, 9]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 2],
    'input_inclusive_min': [0, 0],
    'input_labels': ['z', 'd'],
  },
})
>>> # Slice z=0, apply outer indexing to "x" and "y", label as "a", "b"
>>> x[ts.d["z", "x", "y"].oindex[0, [0, 1], [2, 1]].label["a", "b"]]
TensorStore({
  'array': [[4, 2], [10, 8]],
  'context': {'data_copy_concurrency': {}},
  'driver': 'array',
  'dtype': 'int32',
  'transform': {
    'input_exclusive_max': [2, 2],
    'input_inclusive_min': [0, 0],
    'input_labels': ['a', 'b'],
  },
})

The usual syntax for applying a dimension expression is: obj[ts.d[sel] op1 ... opN], where obj is any tensorstore.Indexable object, sel specifies the initial dimension selection and op1 ... opN specifies a chain of one or more operations supported by tensorstore.DimExpression (the ... in op1 ... opN is not a literal Python Ellipsis (...), but simply denotes a sequence of operation invocations).

The tensorstore.DimExpression object itself, constructed using the syntax ts.d[sel] op1 ... opN, is simply a lightweight, immutable representation of the sequence of operations and their arguments, and performs only minimal validation upon construction; full validation is deferred until it is actually applied to a tensorstore.Indexable object, using the syntax obj[ts.d[sel] op1 ... opN].

Dimension selections

A dimension selection is specified using the syntax ts.d[sel], where sel is one of:

  • an integer, specifying an existing or new dimension by index (as with built-in sequence types, negative numbers specify a dimension index relative to the end);

  • a non-empty str, specifying an existing dimension by label;

  • a slice object, start:stop:step, where start, stop, and step are either integers or None, specifying a range of existing or new dimensions by index (as for built-in sequence types, negative numbers specify a dimension index relative to the end);

  • any sequence (including a tuple, list, or another tensorstore.d object) of any of the above.

The result is a tensorstore.d object, which is simply a lightweight, immutable container representing the flattened sequence of int, str, or slice objects:

>>> ts.d[0, 1, 2]
d[0,1,2]
>>> ts.d[0:1, 2, "x"]
d[0:1,2,'x']
>>> ts.d[[0, 1], [2]]
d[0,1,2]
>>> ts.d[[0, 1], ts.d[2, 3]]
d[0,1,2,3]
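The flattening behavior shown above can be modeled with a short recursive function. This is a sketch: flatten_selection is a hypothetical helper operating on plain Python values, and it does not perform the validation that tensorstore.d applies.

```python
def flatten_selection(sel):
    """Flatten a nested selection into a list of int, str, or slice items.

    Toy model of how a dimension selection is flattened. Strings are treated
    as atomic labels even though they are iterable in Python.
    """
    if isinstance(sel, (int, str, slice)):
        return [sel]
    flat = []
    for item in sel:
        flat.extend(flatten_selection(item))
    return flat
```

For example, flatten_selection(([0, 1], [2])) returns [0, 1, 2], in the same order as the d[0,1,2] result above.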

A str label always identifies an existing dimension, and is only compatible with operations/terms that expect an existing dimension:

>>> x = ts.IndexTransform(input_labels=['x'])
>>> x[ts.d["x"][2:3]]
Rank 1 -> 1 index space transform:
  Input domain:
    0: [2, 3) "x"
  Output index maps:
    out[0] = 0 + 1 * in[0]

An integer may identify either an existing or new dimension depending on whether it is used with a tensorstore.newaxis term:

>>> x = ts.IndexTransform(input_labels=['x', 'y'])
>>> # `1` refers to existing dimension "y"
>>> x[ts.d[1][2:3]]
Rank 2 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*) "x"
    1: [2, 3) "y"
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[1]
>>> # `1` refers to new singleton dimension
>>> x[ts.d[1][ts.newaxis]]
Rank 3 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*) "x"
    1: [0*, 1*)
    2: (-inf*, +inf*) "y"
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[2]

A negative dimension index -i is equivalent to n - i, where n is the sum of the rank of the original domain and the number of tensorstore.newaxis terms:

>>> x = ts.IndexTransform(input_labels=['x', 'y'])
>>> # `-1` is equivalent to 1, refers to existing dimension "y"
>>> x[ts.d[-1][2:3]]
Rank 2 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*) "x"
    1: [2, 3) "y"
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[1]
>>> # `-1` is equivalent to 2, refers to new singleton dimension
>>> x[ts.d[-1][ts.newaxis]]
Rank 3 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*) "x"
    1: (-inf*, +inf*) "y"
    2: [0*, 1*)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[1]
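The normalization of negative dimension indices can be written out directly. The normalize_dim function is a hypothetical helper, and raising IndexError for out-of-range values is an assumption made for this sketch.

```python
def normalize_dim(index, original_rank, num_newaxis=0):
    """Convert a possibly negative dimension index to a non-negative one.

    ``n`` is the rank of the original domain plus the number of newaxis
    terms, as described above. Toy model, not TensorStore code.
    """
    n = original_rank + num_newaxis
    if not -n <= index < n:
        raise IndexError(f"dimension index {index} is outside [{-n}, {n})")
    return index + n if index < 0 else index
```

With a rank-2 domain, normalize_dim(-1, 2) gives 1 (the existing dimension "y"), while normalize_dim(-1, 2, num_newaxis=1) gives 2 (the new singleton dimension), matching the two examples above.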

Likewise, a slice may identify either existing or new dimensions:

>>> x = ts.IndexTransform(input_labels=['x', 'y', 'z'])
>>> # `:2` refers to existing dimensions "x", "y"
>>> x[ts.d[:2][1:2, 3:4]]
Rank 3 -> 3 index space transform:
  Input domain:
    0: [1, 2) "x"
    1: [3, 4) "y"
    2: (-inf*, +inf*) "z"
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0 + 1 * in[1]
    out[2] = 0 + 1 * in[2]
>>> # `:2` refers to two new singleton dimensions
>>> x[ts.d[:2][ts.newaxis, ts.newaxis]]
Rank 5 -> 3 index space transform:
  Input domain:
    0: [0*, 1*)
    1: [0*, 1*)
    2: (-inf*, +inf*) "x"
    3: (-inf*, +inf*) "y"
    4: (-inf*, +inf*) "z"
  Output index maps:
    out[0] = 0 + 1 * in[2]
    out[1] = 0 + 1 * in[3]
    out[2] = 0 + 1 * in[4]

If a tensorstore.newaxis term is mixed with a term that consumes an existing dimension, any dimension indices specified in the dimension selection (either directly or via slice objects) are with respect to an intermediate domain with any new singleton dimensions inserted but no existing dimensions consumed:

>>> x = ts.IndexTransform(input_labels=['x', 'y'])
>>> # `1` refers to new singleton dimension, `2` refers to "y"
>>> # intermediate domain is: {0: "x", 1: "", 2: "y"}
>>> x[ts.d[1, 2][ts.newaxis, 0]]
Rank 2 -> 2 index space transform:
  Input domain:
    0: (-inf*, +inf*) "x"
    1: [0*, 1*)
  Output index maps:
    out[0] = 0 + 1 * in[0]
    out[1] = 0
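
The intermediate domain can be reconstructed by hand, which may help when reading expressions that mix tensorstore.newaxis with other terms. The following plain-Python sketch models only the dimension labels; the helper intermediate_labels is hypothetical and is not part of TensorStore.

```python
def intermediate_labels(labels, new_dims):
    """Labels of the intermediate domain (hypothetical helper).

    `labels` are the labels of the original domain; `new_dims` is the set of
    already-normalized positions of the new singleton dimensions.
    """
    rank = len(labels) + len(new_dims)
    existing = iter(labels)
    # New singleton dimensions are unlabeled (""); existing dimensions fill
    # the remaining positions in their original order.
    return ["" if dim in new_dims else next(existing) for dim in range(rank)]


# Models ts.d[1, 2][ts.newaxis, 0] on a domain labeled ("x", "y"):
# the newaxis term is paired with dimension index 1.
inter = intermediate_labels(["x", "y"], new_dims={1})
print(inter)  # ['x', '', 'y']

# The integer term `0` is paired with index 2 ("y") and consumes it.
result = [label for dim, label in enumerate(inter) if dim != 2]
print(result)  # ['x', '']
```

The final labels agree with the input domain printed in the example above: "x" followed by an unlabeled singleton dimension.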

Dimension expression construction

A tensorstore.DimExpression that applies a given operation to an initial dimension selection dexpr = ts.d[sel] is constructed using:

  • subscript syntax dexpr[iexpr] (for NumPy-style indexing);

  • attribute syntax dexpr.diagonal for operations that take no arguments; or

  • attribute subscript syntax dexpr.label[arg].

The same syntax may also be used to chain additional operations onto an existing tensorstore.DimExpression:

>>> x = ts.IndexTransform(input_rank=0)
>>> x[ts.d[0][ts.newaxis][1:10].label['z']]
Rank 1 -> 0 index space transform:
  Input domain:
    0: [1, 10) "z"
  Output index maps:

When a tensorstore.DimExpression dexpr is applied to a tensorstore.Indexable object obj, using the syntax obj[dexpr], the following steps occur:

  1. The initial dimension selection specified in dexpr is resolved based on the domain of obj and the first operation of dexpr.

  2. The first operation specified in dexpr is applied to obj using the resolved initial dimension selection. This results in a new tensorstore.Indexable object of the same type as obj and a new dimension selection consisting of the dimensions retained from the prior dimension selection or added by the operation.

  3. Each subsequent operation is applied, in order, to the new tensorstore.Indexable object and the new dimension selection produced by the prior operation.
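
The steps above amount to threading a (domain, dimension selection) pair through a sequence of operations. The following is a simplified plain-Python model of that idea, not TensorStore's implementation: domains are lists of (label, inclusive_min, exclusive_max) tuples, the selection is assumed to be already resolved to dimension indices (step 1), and the names interval, relabel, and apply_dim_expression are hypothetical.

```python
INF = float("inf")


def interval(lo, hi):
    """Model of dexpr[lo:hi]: restrict each selected dimension to [lo, hi).

    Assumes [lo, hi) lies within the existing bounds.
    """
    def op(domain, selection):
        new_domain = [
            (label, lo, hi) if dim in selection else (label, low, high)
            for dim, (label, low, high) in enumerate(domain)
        ]
        return new_domain, selection  # interval indexing retains the selection
    return op


def relabel(*labels):
    """Model of dexpr.label[...]: assign new labels to the selected dimensions."""
    def op(domain, selection):
        new_domain = list(domain)
        for dim, label in zip(selection, labels):
            _, low, high = new_domain[dim]
            new_domain[dim] = (label, low, high)
        return new_domain, selection
    return op


def apply_dim_expression(domain, selection, operations):
    # Steps 2 and 3: each operation receives the object and the dimension
    # selection produced by the previous operation.
    for op in operations:
        domain, selection = op(domain, selection)
    return domain


# Models x[ts.d[1][1:10].label["z"]] on a domain labeled ("x", "y").
domain = [("x", -INF, INF), ("y", -INF, INF)]
out = apply_dim_expression(domain, [1], [interval(1, 10), relabel("z")])
print(out)
```

Because each operation returns an updated selection, the label operation here acts on the same dimension that the interval operation just restricted.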

NumPy-style dimension expression indexing

The syntaxes dexpr[iexpr], dexpr.vindex[iexpr], and dexpr.oindex[iexpr] each chain a NumPy-style indexing operation onto an existing tensorstore.d or tensorstore.DimExpression.

The behavior is similar to that of regular NumPy-style indexing applied directly to a tensorstore.Indexable object, with the following differences:

  • The terms of the indexing expression iexpr consume dimensions in order from the dimension selection rather than starting from the first dimension of the domain, and unless an Ellipsis (...) term is specified, iexpr must include a sufficient number of indexing terms to consume the entire dimension selection.

  • tensorstore.newaxis terms are only permitted in the first operation of a dimension expression, since in subsequent operations all dimensions of the dimension selection necessarily refer to existing dimensions. Additionally, the dimension selection must specify the index of the new dimension for each tensorstore.newaxis term.

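  • If iexpr is a scalar indexing expression that consists of a single integer, a slice start:stop:step (where start, stop, and step are integers or None), or a single tensorstore.newaxis term, it may be used with a dimension selection of more than one dimension, in which case iexpr is implicitly duplicated to match the number of dimensions in the dimension selection: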

    >>> x = ts.IndexTransform(input_labels=["x", "y"])
    >>> # add singleton dimension to beginning and end
    >>> x[ts.d[0, -1][ts.newaxis]]
    Rank 4 -> 2 index space transform:
      Input domain:
        0: [0*, 1*)
        1: (-inf*, +inf*) "x"
        2: (-inf*, +inf*) "y"
        3: [0*, 1*)
      Output index maps:
        out[0] = 0 + 1 * in[1]
        out[1] = 0 + 1 * in[2]
    >>> # slice out square region
    >>> x[ts.d[:][0:10]]
    Rank 2 -> 2 index space transform:
      Input domain:
        0: [0, 10) "x"
        1: [0, 10) "y"
      Output index maps:
        out[0] = 0 + 1 * in[0]
        out[1] = 0 + 1 * in[1]
    
  • When using the default indexing mode, i.e. dexpr[iexpr], if more than one array indexing term is specified (even if they are consecutive), the array dimensions are always added as the first dimensions of the result domain (as if dexpr.vindex[iexpr] were specified).

  • When using outer indexing mode, i.e. dexpr.oindex[iexpr], zero-rank boolean arrays are not permitted.
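
The implicit duplication of a scalar indexing expression described in the list above can be pictured as expanding a single term into a tuple with one copy per selected dimension. The sketch below is a plain-Python illustration under that reading; broadcast_scalar_term is a hypothetical helper, not a TensorStore function.

```python
def broadcast_scalar_term(term, selection_size):
    """Duplicate a lone scalar term across the selection (hypothetical helper)."""
    if isinstance(term, tuple):
        # An explicit tuple of terms is used as given.
        return term
    return (term,) * selection_size


# Models ts.d[:][0:10] on a rank-2 domain: the single slice is applied to
# both selected dimensions, as if 0:10, 0:10 had been written.
terms = broadcast_scalar_term(slice(0, 10), 2)
print(terms)  # (slice(0, 10, None), slice(0, 10, None))
```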

Index transforms