# Index space¶

TensorStore defines an *index space* as an \(n\)-dimensional space, where \(n\) is the *rank* of the space and each index is an integer in the closed interval \([-(2^{62}-2), +(2^{62}-2)]\). The special values \(\pm (2^{62}-1)\) are not valid indices but represent bounds of \(\pm \infty\).

Note

The limits of \(\pm (2^{62}-2)\) permit \(\pm (2^{62}-1)\) to be reserved to represent \(\pm \infty\) and still allow the difference between any two bounds to be represented as a signed 64-bit integer.

TensorStore currently supports ranks \(n\) less than or equal to 32, which is the same constraint imposed by NumPy.

Each of the \(n\) dimensions may optionally be identified by a unique string *label*, such as `"x"`

, `"y"`

, or `"z"`

. Unlabeled dimensions are indicated by an empty string as the label.

## Index domain¶

An *index domain* is a rectangular region of an index space.

The inclusive lower bound of each dimensions is specified as an integer in \([-(2^{62}-1), 2^{62}-2]\), where the special value of \(-2^{62}-1\) indicates the dimension is unbounded below.

The inclusive upper bound of each dimension is specified as an integer in \([-(2^{62}-2), 2^{62}-1]\), where the special value of \(2^{62}-1\) indicates the dimension is unbounded above.

The lower and upper bound of each dimension is marked either

*explicit*or*implicit*(represented as two additional bits per dimension). Explicit bounds indicate hard constraints on the valid indices for that dimension. Implicit bounds may be used for resizable dimensions of arrays: the implicit bound indicates the bound as of a certain time, but the bound may change if a resizable operation is performed, and indexing operations are not constrained by implicit bounds. Implicit bounds of \(\pm \infty\) are also used to indicate unknown or unspecified bounds.

Note

Specifying infinite bounds \((-\infty, +\infty)\) for a dimension is similar to specifying the finite bounds \([-(2^{62}-2, +(2^{62}-2)]\), in that in both cases the domain contains the full range of possible indices. The difference is that translating a dimension with infinite bounds has no effect, while translating a dimension with bounds \([-(2^{62}-2, +(2^{62}-2)]\) by any non-zero offset results in an out-of-bounds error.

*json*`IndexDomain`

: object¶Index domains may be serialized to/from JSON using the following schema.

If neither

`inclusive_min`

nor`shape`

is specified, all dimensions receive an implicit lower bound of \(-\infty\). If`shape`

is specified but`inclusive_min`

is not specified, all dimensions receive an explicit lower bound of 0.At most one of

`exclusive_max`

,`inclusive_max`

, and`shape`

may be specified. If none are specified, all dimensions receive an implicit upper bound of \(+\infty\).Optional members

`rank`

: integer_{[0, 32]}¶Number of dimensions.

The rank must be specified either directly, or implicitly by the number of dimensions specified for

`inclusive_min`

,`inclusive_max`

,`exclusive_max`

,`shape`

, or`labels`

.

`inclusive_min`

: array*of*integer | [integer]¶Inclusive lower bounds of the domain.

Length must equal the

`rank`

. Bounds specified as

indicate normal, explicit bounds, while bounds specified as*n*`[`

indicate implicit bounds. For example,*n*]`[1, [2]]`

specifies an explicit bound of \(1 \leq x\) for the first dimension and an implicit bound of \(2 \leq x\) for the second dimension.

`exclusive_max`

: array*of*integer | [integer]¶Exclusive upper bounds of the domain.

Length must equal the

`rank`

. As for`inclusive_min`

, bounds specified as

indicate normal, explicit bounds, while bounds specified as*n*`[`

indicate implicit bounds. For example,*n*]`[5, [7]]`

specifies an explicit bound of \(x < 5\) for the first dimension and an implicit bound of \(x < 7\) for the second dimension.

`inclusive_max`

: array*of*integer | [integer]¶Inclusive upper bounds of the domain.

Length must equal the

`rank`

. As for`inclusive_min`

, bounds specified as

indicate normal, explicit bounds, while bounds specified as*n*`[`

indicate implicit bounds. For example,*n*]`[5, [7]]`

specifies an explicit bound of \(x \leq 5\) for the first dimension and an implicit bound of \(x \leq 7\) for the second dimension.

`shape`

: array*of*integer | [integer]¶Extent of each dimension of the domain.

Length must equal the

`rank`

. As for`inclusive_min`

, bounds specified as

indicate normal, explicit bounds, while bounds specified as*n*`[`

indicate implicit bounds. For example, assuming an*n*]`inclusive_min`

of`[1, 2]`

, an`shape`

of`[5, [7]]`

specifies an explicit bound of \(x < 6\) for the first dimension and an implicit bound of \(x < 9\) for the second dimension.

`labels`

: array*of*string¶Dimension labels for each dimension.

Length must equal the

`rank`

. An empty string indicates an unlabeled dimension. Non-empty strings must not occur more than once. By default, all dimensions are unlabeled.

## Index transform¶

An *index transform* from rank \(m\) to rank \(n\) maps every \(m\)-dimensional index vector in a rank-\(m\) index domain to an \(n\)-dimensional index vector in the output space.

It is defined by its input index domain and \(n\) *output index maps*, one for each dimension \(j\) of the output space, each of one of the following three forms:

constant | \[\mathtt{output}[j] = \mathtt{offset},\] where \(\mathtt{offset}\) is an arbitrary 64-bit integer. |

single input dimension | \[\mathtt{output}[j] = \mathtt{offset} + \mathtt{stride} \cdot \mathtt{input}[\mathtt{input\_dimension}],\] where \(\mathtt{offset}\) and \(\mathtt{stride}\) are arbitrary 64-bit integers and \(\mathtt{input\_dimension}\) is in the range \([0, m)\). |

index array | \[\mathtt{output}[j] = \mathtt{offset} + \mathtt{stride} \cdot \mathtt{index\_array}[\mathtt{input}],\] where \(\mathtt{offset}\) and \(\mathtt{stride}\) are arbitrary 64-bit integers and \(\mathtt{index\_array}\) is an \(n\)-dimensional array of 64-bit integers indexed by a subset of the dimensions of the input index domain with explicit lower and upper bounds, stored as a strided array in memory. (The dimensions by which it is indexed are indicated by non-zero strides.) |

TensorStore uses this normalized index transform representation to represent any composition of indexing operations. This representation is complete, since *any* mapping can be represented as an index array, but in many cases just the more efficient constant and single input dimension maps suffice.

In most cases, two index transforms can be composed with low cost that is independent of the bounds specified in the domains: single input dimension maps may be composed with any other map without needing to copying any index arrays. Only when composing two index array maps is it necessary to write a new index array, and in some cases this can result in index array with a larger representation size.

The following table describes the relationship between the input and output spaces of a transform, and the source and target arrays/TensorStores for read and write operations:

Operation | Input domain | Output range |
---|---|---|

Read | Corresponds to the target array. | Indicates the positions of the underlying source TensorStore that are accessed. |

Write | Corresponds to the source array. | Indicates the positions of the underlying target TensorStore that are modified. |

*json*`IndexTransform`

: object¶Index transforms may be serialized to/from JSON using the following schema.

The input domain is specified by

`input_rank`

,`input_inclusive_min`

,`input_exclusive_max`

,`input_inclusive_max`

,`input_shape`

, and`input_labels`

, while the output index maps are specified by`output`

.If neither

`input_inclusive_min`

nor`input_shape`

is specified, all dimensions receive an implicit lower bound of \(-\infty\). If`input_shape`

is specified but`input_inclusive_min`

is not specified, all dimensions receive an explicit lower bound of 0.At most one of

`input_exclusive_max`

,`input_inclusive_max`

, and`input_shape`

may be specified. If none are specified, all dimensions receive an implicit upper bound of \(+\infty\).Optional members

`input_rank`

: integer_{[0, 32]}¶Number of input dimensions.

The input rank must be specified either directly, or implicitly by the number of dimensions specified for

`input_inclusive_min`

,`input_inclusive_max`

,`input_exclusive_max`

,`input_shape`

, or`input_labels`

.

`input_inclusive_min`

: array*of*integer | [integer]¶Inclusive lower bounds of the input domain.

Length must equal the

`input_rank`

. Bounds specified as

indicate normal, explicit bounds, while bounds specified as*n*`[`

indicate implicit bounds. For example,*n*]`[1, [2]]`

specifies an explicit bound of \(1 \leq x\) for the first dimension and an implicit bound of \(2 \leq x\) for the second dimension.

`input_exclusive_max`

: array*of*integer | [integer]¶Exclusive upper bounds of the input domain.

Length must equal the

`input_rank`

. As for`input_inclusive_min`

, bounds specified as

indicate normal, explicit bounds, while bounds specified as*n*`[`

indicate implicit bounds. For example,*n*]`[5, [7]]`

specifies an explicit bound of \(x < 5\) for the first dimension and an implicit bound of \(x < 7\) for the second dimension.

`input_inclusive_max`

: array*of*integer | [integer]¶Inclusive upper bounds of the input domain.

Length must equal the

`input_rank`

. As for`input_inclusive_min`

, bounds specified as

indicate normal, explicit bounds, while bounds specified as*n*`[`

indicate implicit bounds. For example,*n*]`[5, [7]]`

specifies an explicit bound of \(x \leq 5\) for the first dimension and an implicit bound of \(x \leq 7\) for the second dimension.

`input_shape`

: array*of*integer | [integer]¶Extent of each dimension of the input domain.

Length must equal the

`input_rank`

. As for`input_inclusive_min`

, bounds specified as

indicate normal, explicit bounds, while bounds specified as*n*`[`

indicate implicit bounds. For example, assuming an*n*]`input_inclusive_min`

of`[1, 2]`

, an`input_shape`

of`[5, [7]]`

specifies an explicit bound of \(x < 6\) for the first dimension and an implicit bound of \(x < 9\) for the second dimension.

`input_labels`

: array*of*string¶Dimension labels for each input domain dimension.

Length must equal the

`input_rank`

. An empty string indicates an unlabeled dimension. Non-empty strings must not occur more than once. By default, all dimensions are unlabeled.

`output`

: array*of*OutputIndexMap¶Specifies the output index map for each output dimension.

If not specified, defaults to an identity transform over the input domain. Otherwise, the length determines the output rank of the transform.

*json*`OutputIndexMap`

: object¶Specifies a transform from an input space to a single output index.

Logically, an output index map is a function that maps

`n`

-dimensional input index vectors to a single output index, using one of the supported output index methods. This is used to represent the`IndexTransform.output`

mapping for each output dimension of an`IndexTransform`

.Optional members

`offset`

: integer¶Specifies an offset for this output dimension. If neither

`input_dimension`

nor`index_array`

is specified, this specifies the**constant**value to which this output dimension maps.

`stride`

: integer¶Multiplier for the input index specified by

`input_dimension`

or the index array value specified by`index_array`

.Only valid to specify in conjunction with

`input_dimension`

or`index_array`

.

`input_dimension`

: integer_{[0, +∞)}¶If present, indicates that this output dimension uses a

**single input dimension**map with the specified input dimension. Must not be specified in conjunction with`index_array`

.

`index_array`

: array | integer¶If present, indicates that this output dimension uses an

**index array**map, with the index array specified as a nested list of rank equal to the`input_rank`

.If the

`input_rank`

is 0, must be a numeric value.

`index_array_bounds`

: IndexInterval =`["-inf", "+inf"]`

¶If present, specifies constraints on the values within

`index_array`

(which must also be specified). If`index_array`

contains an out-of-bounds index, an error may not be returned immediately but will be returned if the corresponding position within the domain is accessed. If the indices in`index_array`

have already been validated, this need not be specified. This allows transforms containing out-of-bounds index array indices to correctly round trip through JSON, but normally need not be specified manually.Example

`[5, 100]`

Example

`["-inf", 10]`

Example

`[-20, "+inf"]`

*json*`IndexInterval`

: [integer |`"-inf"`

, integer |`"+inf"`

]¶Specifies a closed interval of integer index values.

## Alignment and broadcasting¶

Many operations in TensorStore involving two index domains, such as read, write, and copy operations, automatically *align* the `source`

domain to the `target`

domain.

The following alignment methods are supported (by default, all alignment methods are used):

- permute
Source dimensions are permuted based on their labels in order to align the source domain to the target domain.

- translate
Source dimensions are translated in order to align the source domain to the target.

- broadcast
Source dimensions of size 1 do not have to match a target dimension, and not all target dimensions must match a source dimension.

Alignment is performed based on the following rules:

First, a subset of the `source`

dimensions are matched to a subset of the `target`

dimensions, according to one of two cases:

M1 | At least one of |

M2 | Both |

The matching is then validated as follows:

V1 | For each match between a dimension \(i\) of |

V2 | If the |

V3 | For every unmatched dimension \(i\) of |

V4 | If the |

If matching succeeds, a new `alignment`

transform with an (input) domain equal to `target`

and an output rank equal to \(\mathtt{source\_rank}\) is computed as follows:

A1 | For each dimension \(j\) of |

A2 | For every unmatched dimension \(i\) of |

The `alignment`

transform maps `target`

positions to corresponding `source`

positions; for example, when copying, each position of the `target`

domain is assigned the value at the corresponding position of the `source`

domain. If the `broadcast`

alignment method is used, the transform may map the same `source`

position to multiple `target`

positions.

Examples:

All unlabeled dimensions

source:

`[3, 7), [5, 6), [4, 10)`

target:

`[2, 6), [0, 4), [6, 12)`

alignment: rank \(3 \rightarrow 3\), with:

\[\begin{split}\mathrm{source}[0] &= \mathrm{target}[0] + 1 \\ \mathrm{source}[1] &= 5 \\ \mathrm{source}[2] &= \mathrm{target}[2] - 2\end{split}\]

All labeled dimensions

source:

`"x": [3, 7), "y": [5, 6), "z": [4, 10)`

target:

`"z": [6, 12), "x": [4, 8), "y": [0, 4)`

alignment: rank \(3 \rightarrow 3\), with:

\[\begin{split}\mathrm{source}[0] &= \mathrm{target}[1] - 1 \\ \mathrm{source}[1] &= 5 \\ \mathrm{source}[2] &= \mathrm{target}[0] - 2\end{split}\]

Partially labeled dimensions

source:

`"x": [3, 7), "y": [5, 6), "": [4, 10)`

target:

`"": [0, 10) "": [6, 12), "x": [4, 8), "y": [0, 4)`

alignment: rank \(4 \rightarrow 3\), with:

\[\begin{split}\mathrm{source}[0] &= \mathrm{target}[2] - 1 \\ \mathrm{source}[1] &= 5 \\ \mathrm{source}[2] &= \mathrm{target}[1] - 2\end{split}\]

Mismatched labeled dimensions

source:

`"x": [3, 7), "y": [5, 6), "z": [4, 10)`

target:

`"z": [6, 12), "w": [4, 8), "y": [0, 4)`

ERROR: Unmatched source dimension 0

`{"x": [3, 7)}`

does not have a size of 1

Note

The alignment behavior supported by TensorStore is fully compatible with `NumPy broadcasting`

but additionally is extended to support non-zero origins and labeled dimensions.