Schema
The schema of a TensorStore specifies key properties of the format in a uniform way that is independent of where and how the data is actually stored. When creating a TensorStore, schema constraints and preferences may be specified; the driver combines these constraints with any driver-specific constraints/defaults to choose a suitable schema automatically. When opening an existing TensorStore, its schema is validated against any constraints that are specified.
- json Schema : object
- Optional members:
- rank : integer[0, 32]
Number of dimensions.
The rank is always a hard constraint.
- dtype : dtype
Specifies the data type of the TensorStore.
The data type is always a hard constraint.
- domain : IndexDomain
Domain of the TensorStore, including bounds and optional dimension labels.
The domain is always a hard constraint, except that a labeled dimension is allowed to match an unlabeled dimension, and an implicit, infinite bound is considered an unspecified bound and does not impose any constraints. When merging two schema constraint objects that both specify domains, any dimensions that are labeled in both domains must have the same label, and any explicit or finite bounds specified in both domains must be equal. If a dimension is labeled in one domain and unlabeled in the other, the label is retained. If a bound is implicit and infinite in one domain, the bound from the other domain is used.
- chunk_layout : ChunkLayout
Data storage layout constraints.
The rank of the chunk layout must match the rank of the schema. When merging schema constraint objects, the chunk layout constraints are merged recursively.
- codec : Codec
Driver-specific compression and other parameters for encoding/decoding data. When merging schema constraint objects, the codec constraints are merged recursively.
- fill_value
Fill value to use for missing data.
Must be broadcast-compatible with the domain.
- dimension_units : array of Unit | null
Physical units of each dimension.
Specifies the physical quantity corresponding to an increment of 1 index along each dimension, i.e. the resolution. The length must match the rank of the schema. Specifying null for a dimension indicates that the unit is unknown.
Example
["4nm", "4nm", null] specifies that the voxel size is 4nm along the first two dimensions, and unknown along the third dimension.
Note
null is not equivalent to specifying "" (or equivalently, [1, ""]), which indicates a dimensionless unit of 1.
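The merging rule described for the domain member above can be illustrated with a minimal sketch. It is not TensorStore's implementation: the dictionary layout, the function names, and the use of None to stand for both "unlabeled" and "implicit, infinite bound" are assumptions made only for illustration.

```python
def _merge_field(a, b, what):
    """Return whichever value is specified; fail if both are specified and differ."""
    if a is None:
        return b
    if b is None:
        return a
    if a != b:
        raise ValueError(f"conflicting {what}: {a!r} vs {b!r}")
    return a


def merge_domains(x, y):
    """Merge two per-dimension domain constraint lists (hypothetical representation).

    Each dimension is a dict with keys "label", "lower", "upper".
    None means unlabeled (for "label") or an implicit, infinite bound (for bounds).
    """
    if len(x) != len(y):
        raise ValueError("rank mismatch")
    return [
        {
            key: _merge_field(dx[key], dy[key], f"{key} of dimension {i}")
            for key in ("label", "lower", "upper")
        }
        for i, (dx, dy) in enumerate(zip(x, y))
    ]


a = [{"label": "x", "lower": 0, "upper": None}]
b = [{"label": None, "lower": None, "upper": 100}]
print(merge_domains(a, b))
```

Here the label from the first domain is retained and each unspecified bound is filled in from the other domain; two different labels or two different explicit bounds for the same dimension would raise an error.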
Chunk layout
For chunked storage formats, the data storage layout can be represented in a driver-independent way as a chunk layout.
A chunk layout specifies a hierarchical regular grid with up to three levels:
The write level, the top-most level, specifies the grid to which writes should be aligned. Writes of individual chunks at this level may be performed without amplification. For the zarr Driver, n5 Driver and the neuroglancer_precomputed Driver using the unsharded format, the write level is also the only level; each write chunk corresponds to a single key in the underlying Key-Value Storage Layer. For the neuroglancer_precomputed Driver using the sharded format, each write chunk corresponds to an entire shard.
The read level evenly subdivides write chunks by an additional regular grid. Reads of individual chunks at this level may be performed without amplification. Every write chunk boundary must be aligned to a read chunk boundary. If reads and writes may be performed at the same granularity, such as with the zarr Driver, n5 Driver, and the neuroglancer_precomputed Driver using the unsharded format, there is no additional read grid; a read chunk is the same size as a write chunk. For the neuroglancer_precomputed Driver using the sharded format, each read chunk corresponds to a base chunk as defined by the format.
The codec level further subdivides the read level into codec chunks. For formats that make use of it, the codec chunk shape may affect the compression rate. For the neuroglancer_precomputed Driver when using the compressed segmentation encoding, the codec chunk shape specifies the compressed segmentation block shape. The codec chunk shape does not necessarily evenly subdivide the read chunk shape. (The precise offset of the codec chunk grid relative to the read chunk grid is not specified by the chunk layout.)
When creating a new TensorStore, constraints on the data storage layout can be specified without specifying the precise layout explicitly.
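The relationship between the write and read levels can be sketched as a simple divisibility check. This is an illustrative helper, not part of TensorStore; the function name and the example shapes are hypothetical.

```python
def read_chunks_per_write_chunk(write_shape, read_shape):
    """Return how many read chunks tile one write chunk along each dimension.

    Raises ValueError if the read grid does not evenly subdivide the write grid.
    """
    if len(write_shape) != len(read_shape):
        raise ValueError("rank mismatch")
    counts = []
    for dim, (w, r) in enumerate(zip(write_shape, read_shape)):
        if r <= 0 or w % r != 0:
            raise ValueError(
                f"read chunk size {r} does not evenly divide "
                f"write chunk size {w} in dimension {dim}"
            )
        counts.append(w // r)
    return counts


# Hypothetical sharded layout: each shard (write chunk) holds 32 base chunks per dimension.
print(read_chunks_per_write_chunk([2048, 2048, 2048], [64, 64, 64]))
# Layout with a single level: read and write chunks coincide.
print(read_chunks_per_write_chunk([64, 64, 64], [64, 64, 64]))
```

The codec level is deliberately left out of this check, since codec chunks need not evenly subdivide read chunks.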
- json ChunkLayout : object
- Optional members:
- rank : integer[0, 32]
Number of dimensions.
The rank is always a hard constraint. It is redundant to specify the rank if any other field that implicitly specifies the rank is included.
- grid_origin : array of integer | null
Specifies hard constraints on the origin of the chunk grid.
The length must equal the rank of the index space. Each element constrains the grid origin for the corresponding dimension. A value of null (or, equivalently, -9223372036854775808) indicates no constraint.
- grid_origin_soft_constraint : array of integer | null
Specifies preferred values for the origin of the chunk grid rather than hard constraints.
If a non-null value is specified for a given dimension in both grid_origin_soft_constraint and grid_origin, the value in grid_origin takes precedence.
- inner_order : array of integer
Permutation specifying the element storage order within the innermost chunks.
This must be a permutation of [0, 1, ..., rank-1]. Lexicographic order (i.e. C order/row-major order) is specified as [0, 1, ..., rank-1], while colexicographic order (i.e. Fortran order/column-major order) is specified as [rank-1, ..., 1, 0].
- inner_order_soft_constraint : array of integer
Specifies a preferred value for inner_order rather than a hard constraint. If inner_order is also specified, it takes precedence.
- write_chunk : ChunkLayout/Grid
Constraints on the chunk grid over which writes may be efficiently partitioned.
- read_chunk : ChunkLayout/Grid
Constraints on the chunk grid over which reads may be efficiently partitioned.
- codec_chunk : ChunkLayout/Grid
Constraints on the chunk grid used by the codec, if applicable.
- chunk : ChunkLayout/Grid
Combined constraints on write/read/codec chunks.
If aspect_ratio is specified, it applies to write_chunk, read_chunk, and codec_chunk. If aspect_ratio_soft_constraint is specified, it also applies to write_chunk, read_chunk, and codec_chunk, but with lower precedence than any write/read/codec-specific value that is also specified.
If shape or elements is specified, it applies to write_chunk and read_chunk (but not codec_chunk). If shape_soft_constraint or elements_soft_constraint is specified, it also applies to write_chunk and read_chunk, but with lower precedence than any write/read-specific value that is also specified.
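To make the meaning of inner_order concrete, the following sketch derives element strides within a chunk from a given permutation. It assumes that inner_order lists dimensions from the slowest-varying to the fastest-varying, which is consistent with [0, 1, ..., rank-1] denoting C order; the function name is hypothetical and the code is not taken from TensorStore.

```python
def element_strides(chunk_shape, inner_order):
    """Return the stride (in elements) of each dimension within one chunk.

    Assumes inner_order lists dimensions from outermost (slowest-varying)
    to innermost (fastest-varying).
    """
    rank = len(chunk_shape)
    if sorted(inner_order) != list(range(rank)):
        raise ValueError("inner_order must be a permutation of [0, ..., rank-1]")
    strides = [0] * rank
    stride = 1
    # Walk from the innermost dimension outwards, accumulating the stride.
    for dim in reversed(inner_order):
        strides[dim] = stride
        stride *= chunk_shape[dim]
    return strides


print(element_strides([2, 3, 4], [0, 1, 2]))  # C order: last dimension is contiguous
print(element_strides([2, 3, 4], [2, 1, 0]))  # Fortran order: first dimension is contiguous
```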
- json ChunkLayout/Grid : object
Constraints on the write/read/codec chunk grids.
When creating a new TensorStore, the chunk shape can be specified directly using the shape and shape_soft_constraint members, or indirectly by specifying the aspect_ratio and target number of elements.
When opening an existing TensorStore, the preferences indicated by shape_soft_constraint, aspect_ratio, aspect_ratio_soft_constraint, elements, and elements_soft_constraint are ignored; only shape serves as a constraint.
- Optional members:
- shape : array of integer[0, +∞) | -1 | null
Hard constraints on the chunk size for each dimension.
The length must equal the rank of the index space. Each element constrains the chunk size for the corresponding dimension, and must be a non-negative integer. The special value of 0 (or, equivalently, null) for a given dimension indicates no constraint. The special value of -1 for a given dimension indicates that the chunk size should equal the full extent of the domain, and is always treated as a soft constraint.
- shape_soft_constraint : array of integer[0, +∞) | -1 | null
Preferred chunk sizes for each dimension.
If a non-zero, non-null size for a given dimension is specified in both shape and shape_soft_constraint, shape takes precedence.
- aspect_ratio : array of number[0, +∞) | null
Aspect ratio of the chunk shape.
Specifies the relative chunk size along each dimension. The special value of 0 (or, equivalently, null) indicates no preference (which results in the default aspect ratio of 1 if not otherwise specified). The aspect ratio preference is only taken into account if the chunk size along a given dimension is not specified by shape or shape_soft_constraint, or otherwise constrained. For example, an aspect_ratio of [1, 1.5, 1.5] indicates that the chunk size along dimensions 1 and 2 should be 1.5 times the chunk size along dimension 0. If the target number of elements is 486000, then the resultant chunk size will be [60, 90, 90] (assuming it is not otherwise constrained).
- aspect_ratio_soft_constraint : array of number[0, +∞) | null
Soft constraint on aspect ratio, lower precedence than aspect_ratio.
- elements : integer[1, +∞) | null
Preferred number of elements per chunk.
Used in conjunction with aspect_ratio to determine the chunk size for dimensions that are not otherwise constrained. The special value of null indicates no preference, in which case a driver-specific default may be used.
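The aspect_ratio example above ([1, 1.5, 1.5] with 486000 elements giving [60, 90, 90]) can be reproduced with a simplified calculation. This sketch assumes every aspect ratio entry is positive and that no dimension is otherwise constrained; it is not the algorithm TensorStore actually uses, and the function name is hypothetical.

```python
import math


def chunk_shape_from_aspect_ratio(aspect_ratio, elements):
    """Scale the aspect ratio so that the chunk holds roughly `elements` elements.

    Simplifying assumptions: all aspect ratio entries are positive, and no
    dimension is constrained by shape, shape_soft_constraint, or the domain.
    """
    if any(a <= 0 for a in aspect_ratio):
        raise ValueError("this sketch requires positive aspect ratio entries")
    rank = len(aspect_ratio)
    # Solve prod(a_i * scale) == elements for the common scale factor.
    scale = (elements / math.prod(aspect_ratio)) ** (1.0 / rank)
    return [max(1, round(a * scale)) for a in aspect_ratio]


print(chunk_shape_from_aspect_ratio([1, 1.5, 1.5], 486000))
```

With these inputs the common scale factor is the cube root of 486000 / 2.25 = 216000, i.e. 60, so the sizes come out as 60, 90 and 90.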
Codec
- json Codec : object
Codecs are specified by a required driver property that identifies the driver. All other properties are driver-specific. Refer to the driver documentation for the supported codec drivers and the driver-specific properties.
- Subtypes:
driver/neuroglancer_precomputed/Codec
— Neuroglancer Precomputed Codec
- Required members:
- driver : string
Driver identifier
Specifies the TensorStore driver to which this codec is applicable.
Example
{ "driver": "zarr", "compressor": {"id": "blosc", "cname": "lz4", "clevel": 5, "shuffle": 1}, "filters": null }
Example
{ "driver": "n5", "compression": {"type": "gzip", "level": 6, "useZlib": false} }
Dimension units
- json Unit : [number, string] | string | number
Specifies a physical quantity/unit.
The quantity is specified as the combination of:
A numerical multiplier, represented as a double-precision floating-point number. A multiplier of 1 may be used to indicate a quantity equal to a single base unit.
A base_unit, represented as a string. An empty string may be used to indicate a dimensionless quantity. In general, TensorStore does not interpret the base unit string; some drivers impose additional constraints on the base unit, while other drivers may store the specified unit directly. It is recommended to follow the udunits2 syntax unless there is a specific need to deviate.
Three JSON formats are supported:
The canonical format, as a two-element [multiplier, base_unit] array. This format is always used by TensorStore when returning the JSON representation of a unit.
A single string. If the string contains a leading number, it is parsed as the multiplier and the remaining portion, after stripping leading and trailing whitespace, is used as the base_unit. If there is no leading number, the multiplier is 1 and the entire string, after stripping leading and trailing whitespace, is used as the base_unit.
A single number, to indicate a dimensionless unit with the specified multiplier.
Example
"4.5e-9m", "4.5e-9 m", and [4.5e-9, "m"] are all equivalent.
"1nm", "nm", and [1, "nm"] are all equivalent.
5, "5", and [5, ""] are all equivalent.
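The three JSON formats can be normalized to the canonical (multiplier, base_unit) pair with a short sketch like the one below. The regular expression used to recognize a leading number is an assumption; the exact number grammar accepted by TensorStore is not stated here, and parse_unit is a hypothetical helper rather than a TensorStore function.

```python
import re

# Assumed grammar for a leading number: optional sign, decimal digits,
# optional fraction, optional exponent.
_LEADING_NUMBER = re.compile(r"\s*([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)")


def parse_unit(value):
    """Normalize a JSON unit value to a (multiplier, base_unit) tuple."""
    if isinstance(value, bool):
        raise TypeError("booleans are not valid units")
    if isinstance(value, (int, float)):
        # A single number: dimensionless unit with the given multiplier.
        return (float(value), "")
    if isinstance(value, str):
        match = _LEADING_NUMBER.match(value)
        if match:
            return (float(match.group(1)), value[match.end():].strip())
        # No leading number: multiplier defaults to 1.
        return (1.0, value.strip())
    if isinstance(value, (list, tuple)) and len(value) == 2:
        multiplier, base_unit = value
        if isinstance(multiplier, bool) or not isinstance(multiplier, (int, float)):
            raise TypeError("multiplier must be a number")
        if not isinstance(base_unit, str):
            raise TypeError("base_unit must be a string")
        return (float(multiplier), base_unit)
    raise TypeError(f"unsupported unit value: {value!r}")


for example in ["4.5e-9m", "4.5e-9 m", [4.5e-9, "m"], "nm", 5, "5"]:
    print(example, "->", parse_unit(example))
```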