neuroglancer_uint64_sharded Key-Value Store driver

The neuroglancer_uint64_sharded driver implements support for the Neuroglancer Precomputed sharded format on top of a base key-value store.

Within the key-value store interface, which uses strings as keys, the uint64 keys are encoded as 8-byte big-endian values.
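For illustration, a minimal Python sketch of this key encoding, using only the standard library (the function names are hypothetical):

import struct

def chunk_id_to_key(chunk_id: int) -> bytes:
    # A uint64 chunk ID maps to an 8-byte big-endian key,
    # e.g. 0x12 -> b'\x00\x00\x00\x00\x00\x00\x00\x12'.
    return struct.pack('>Q', chunk_id)

def key_to_chunk_id(key: bytes) -> int:
    # Inverse mapping: decode the 8-byte big-endian key.
    return struct.unpack('>Q', key)[0]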

json kvstore/neuroglancer_uint64_sharded : object

Read/write adapter for the Neuroglancer Precomputed sharded format.

JSON specification of the key-value store.

Extends:
  • KvStore — Key-value store specification.

Required members:
driver : "neuroglancer_uint64_sharded"
base : KvStore

Underlying key-value store.

metadata : kvstore/neuroglancer_uint64_sharded/ShardingSpec

Specifies the sharding format.

Optional members:
path : string

Key prefix within the key-value store.

If the prefix is intended to correspond to a Unix-style directory path, it should end with "/".

context : Context

Specifies context resources that augment/override the parent context.

cache_pool : ContextResource = "cache_pool"

Specifies or references a previously defined Context.cache_pool. It is normally more convenient to specify a default cache_pool in the context.

Important

Specifying a cache pool with a non-zero total_bytes_limit is strongly recommended. Otherwise, every read operation requires 2 additional reads, to read the shard index and the minishard index.

data_copy_concurrency : ContextResource = "data_copy_concurrency"

Specifies or references a previously defined Context.data_copy_concurrency. It is normally more convenient to specify a default data_copy_concurrency in the context.

json kvstore/neuroglancer_uint64_sharded/ShardingSpec : object

Sharding metadata

Specifies the sharded format within the kvstore/neuroglancer_uint64_sharded.metadata and driver/neuroglancer_precomputed.scale_metadata properties.

Required members:
@type : "neuroglancer_uint64_sharded_v1"
preshift_bits : integer[0, 64]

Number of low-order bits of the chunk ID that do not contribute to the hashed chunk ID.

hash : "identity" | "murmurhash3_x86_128"

Specifies the hash function used to map chunk IDs to shards.

minishard_bits : integer[0, 64]

Number of bits of the hashed chunk ID that determine the minishard number.

The number of minishards within each shard is equal to \(2^{\mathrm{minishard\_bits}}\). The minishard number is equal to bits [0, minishard_bits) of the hashed chunk ID.

shard_bits : integer[0, 64]

Number of bits of the hashed chunk ID that determine the shard number.

The number of shards is equal to \(2^{\mathrm{shard\_bits}}\). The shard number is equal to bits [minishard_bits, minishard_bits+shard_bits) of the hashed chunk ID.
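As an illustration of how preshift_bits, minishard_bits, and shard_bits interact, the following Python sketch computes the minishard and shard numbers for the "identity" hash (the "murmurhash3_x86_128" case additionally hashes the shifted chunk ID and is omitted here; the function names are hypothetical):

def hashed_chunk_id(chunk_id: int, preshift_bits: int) -> int:
    # With the "identity" hash, the hashed chunk ID is simply the
    # chunk ID with its low-order preshift_bits discarded.
    return chunk_id >> preshift_bits

def minishard_number(chunk_id: int, preshift_bits: int,
                     minishard_bits: int) -> int:
    # Bits [0, minishard_bits) of the hashed chunk ID.
    return hashed_chunk_id(chunk_id, preshift_bits) & ((1 << minishard_bits) - 1)

def shard_number(chunk_id: int, preshift_bits: int,
                 minishard_bits: int, shard_bits: int) -> int:
    # Bits [minishard_bits, minishard_bits + shard_bits) of the hashed chunk ID.
    h = hashed_chunk_id(chunk_id, preshift_bits)
    return (h >> minishard_bits) & ((1 << shard_bits) - 1)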

Optional members:
minishard_index_encoding : "raw" | "gzip" = "raw"

Specifies the encoding of the minishard index.

Normally "gzip" is a good choice.

data_encoding : "raw" | "gzip" = "raw"

Specifies the encoding of the data chunks.

Normally "gzip" is a good choice, unless the volume uses "jpeg" encoding.

Example JSON specifications

Example: Opening with identity hash and 1GB cache
{
  "driver": "neuroglancer_uint64_sharded",
  "kvstore": "gs://my-bucket/path/to/sharded/data/",
  "metadata": {
    "@type": "neuroglancer_uint64_sharded_v1",
    "hash": "identity",
    "preshift_bits": 1,
    "minishard_bits": 3,
    "shard_bits": 3,
    "data_encoding": "raw",
    "minishard_index_encoding": "gzip",
  },
  "context": {
    "cache_pool": {"total_bytes_limit": 1000000000}
  }
}
Example: Opening with murmurhash3_x86_128 hash and 1GB cache
{
  "driver": "neuroglancer_uint64_sharded",
  "kvstore": "gs://my-bucket/path/to/sharded/data/",
  "metadata": {
    "@type": "neuroglancer_uint64_sharded_v1",
    "hash": "murmurhash3_x86_128",
    "preshift_bits": 0,
    "minishard_bits": 3,
    "shard_bits": 3,
    "data_encoding": "raw",
    "minishard_index_encoding": "gzip",
  },
  "context": {
    "cache_pool": {"total_bytes_limit": 1000000000}
  }
}
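A specification like the above can be opened from Python roughly as follows. This is a sketch assuming the tensorstore Python package; an in-memory "memory://" base is substituted purely so the example is self-contained:

import tensorstore as ts

store = ts.KvStore.open({
    'driver': 'neuroglancer_uint64_sharded',
    'base': 'memory://',
    'metadata': {
        '@type': 'neuroglancer_uint64_sharded_v1',
        'hash': 'identity',
        'preshift_bits': 1,
        'minishard_bits': 3,
        'shard_bits': 3,
        'data_encoding': 'raw',
        'minishard_index_encoding': 'gzip'
    },
    'context': {'cache_pool': {'total_bytes_limit': 1000000000}}
}).result()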

Limitations

It is strongly recommended to use a transaction when writing, and to group writes by shard (one transaction per shard). Otherwise, there may be significant write amplification due to repeatedly rewriting the entire shard.
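A hedged sketch of this write pattern, reusing the hypothetical shard_number and chunk_id_to_key helpers from above and assuming the tensorstore Python API's Transaction and KvStore.with_transaction (the key/value plumbing here is illustrative, not a definitive recipe):

import tensorstore as ts

def write_chunks(store, chunks, preshift_bits, minishard_bits, shard_bits):
    # chunks: dict mapping uint64 chunk ID -> bytes value.
    by_shard = {}
    for chunk_id, value in chunks.items():
        shard = shard_number(chunk_id, preshift_bits, minishard_bits, shard_bits)
        by_shard.setdefault(shard, []).append((chunk_id, value))
    # One transaction per shard, so each shard is rewritten exactly once.
    for shard, items in by_shard.items():
        txn = ts.Transaction()
        txn_store = store.with_transaction(txn)
        for chunk_id, value in items:
            # Keys are the 8-byte big-endian encoding of the chunk ID.
            txn_store.write(chunk_id_to_key(chunk_id), value).result()
        txn.commit_async().result()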