s3 Key-Value Store driver¶
The s3 driver provides access to Amazon S3 and S3-compatible object stores.
Keys directly correspond to paths within an S3 bucket.
Warning
The s3 key-value store driver does not provide all the atomicity
guarantees required by tensorstore. On AWS, specifically, DELETE is not
atomic, which leads to race conditions. On other S3-compatible object
stores even PUT may not be atomic.
This non-atomicity can lead to unexpected behavior when writing to an
S3-backed TensorStore. For example, writing to a zarr array can in some
cases lead to a delete rather than a write (if the written data happens to
match the fill value), and therefore a write operation that might be atomic
and safe when writing to other key-value store implementations might be
unsafe when using s3.
- json kvstore/s3 : object¶
Read/write access to Amazon S3-compatible object stores.
JSON specification of the key-value store.
- Optional members:¶
- path : string¶
Key prefix within the key-value store.
If the prefix is intended to correspond to a Unix-style directory path, it should end with "/".
- requester_pays : boolean = false¶
Permit requester-pays requests.
This option must be enabled in order for any operations to succeed if the bucket has Requester Pays enabled and the supplied credentials are not for an owner of the bucket.
- aws_region : string¶
AWS region identifier to use in signatures.
If endpoint is not specified, the region of the bucket is determined automatically.
- endpoint : string¶
S3 server endpoint to use in place of the public Amazon S3 endpoints.
Must be an http or https URL.
Example: "http://localhost:1234"
- host_header : string¶
Override HTTP host header to send in requests.
May only be specified in conjunction with endpoint, to send a different host than the one specified in endpoint. This may be useful for testing with localstack.
Example: "mybucket.s3.af-south-1.localstack.localhost.com"
- aws_credentials : ContextResource¶
Specifies or references a previously defined Context.aws_credentials.
- s3_request_concurrency : ContextResource¶
Specifies or references a previously defined Context.s3_request_concurrency.
- s3_request_retries : ContextResource¶
Specifies or references a previously defined Context.s3_request_retries.
- experimental_s3_rate_limiter : ContextResource¶
Specifies or references a previously defined Context.experimental_s3_rate_limiter.
- data_copy_concurrency : ContextResource = "data_copy_concurrency"¶
Specifies or references a previously defined Context.data_copy_concurrency.
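As a minimal sketch of how the members above fit together, the following Python builds a JSON spec for an S3-compatible server. The bucket name, key prefix, endpoint, and region are placeholder values chosen for illustration, not defaults of the driver. Opening the spec is left as a comment so that the snippet runs without TensorStore installed.

```python
import json

# Placeholder values for illustration only; substitute a real bucket and endpoint.
spec = {
    "driver": "s3",
    "bucket": "my-bucket",
    # Ends with "/" so the prefix behaves like a Unix-style directory path.
    "path": "path/to/dataset/",
    # Must be an http or https URL.
    "endpoint": "http://localhost:1234",
    # Region used in request signatures (assumed value).
    "aws_region": "us-east-1",
    "requester_pays": False,
}

# The spec is plain JSON, so it can be serialized and stored alongside other config.
print(json.dumps(spec, indent=2))

# With TensorStore installed, the spec could then be opened, for example:
#   import tensorstore as ts
#   kvstore = ts.KvStore.open(spec).result()
```

Because host_header may only be specified together with endpoint, it would be added to a spec like this one rather than to a spec that relies on the public Amazon S3 endpoints.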
- json Context.s3_request_concurrency : object¶
Specifies a limit on the number of concurrent requests to S3.
- Optional members:¶
- limit : integer[1, +∞) | "shared" = "shared"¶
The maximum number of concurrent requests. If the special value "shared" is specified, a shared global limit is used, as specified by the environment variable TENSORSTORE_S3_REQUEST_CONCURRENCY, which defaults to 32.
- json Context.s3_request_retries : object¶
Specifies retry parameters for handling transient network errors. An exponential delay is added between consecutive retry attempts. The default values are appropriate for S3.
- json Context.experimental_s3_rate_limiter : object¶
Experimental rate limiter configuration for S3 reads and writes.
- Optional members:¶
- read_rate : number¶
The maximum rate of read and/or list calls issued per second.
- write_rate : number¶
The maximum rate of write and/or delete calls issued per second.
- doubling_time : string = "0"¶
The time interval over which the initial rates scale to 2x. The cases where this setting is useful depend on details of the storage buckets.
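The context resources described above are plain JSON objects, so they can be sketched in Python as a dictionary. The numeric values below are arbitrary examples, and check_limit is a hypothetical helper (not part of TensorStore) that only mirrors the documented constraint on limit, an integer of at least 1 or the string "shared".

```python
import json

def check_limit(limit):
    """Return True if `limit` is "shared" or an integer >= 1 (hypothetical helper)."""
    if limit == "shared":
        return True
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(limit, int) and not isinstance(limit, bool) and limit >= 1

# Example values only; appropriate limits depend on the workload and the bucket.
context_spec = {
    "s3_request_concurrency": {"limit": 8},
    "experimental_s3_rate_limiter": {
        "read_rate": 100,       # read and/or list calls per second
        "write_rate": 50,       # write and/or delete calls per second
        "doubling_time": "0",   # the documented default
    },
}

assert check_limit(context_spec["s3_request_concurrency"]["limit"])
print(json.dumps(context_spec, indent=2))

# With TensorStore installed, a context could be created from this JSON, for example:
#   import tensorstore as ts
#   context = ts.Context(context_spec)
```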
- json Context.aws_credentials : object¶
The type member identifies the credentials provider. The remaining members are specific to the credentials provider.
- Subtypes:¶
- json Context.aws_credentials/default : object¶
Sources credentials using the default AWS credentials chain.
- Extends:¶
- Optional members:¶
- profile : string = "default"¶
The profile name in the ~/.aws/credentials file.
When unset, the AWS credentials provider also examines the AWS_PROFILE environment variable.
- json Context.aws_credentials/profile : object¶
Sources credentials from the AWS config and credentials files.
- Extends:¶
- Optional members:¶
- profile : string = "default"¶
The profile name in the ~/.aws/credentials file.
When unset, the AWS credentials provider also examines the AWS_PROFILE environment variable.
- config_file : string = "${HOME}/.aws/config"¶
The path to the AWS config file.
When unset, the AWS credentials provider also examines the AWS_CONFIG_FILE environment variable.
- credentials_file : string = "${HOME}/.aws/credentials"¶
The path to the AWS credentials file.
When unset, the AWS credentials provider also examines the AWS_SHARED_CREDENTIALS_FILE environment variable.
- json Context.aws_credentials/ecs : object¶
Sources credentials from ECS container metadata.
- Extends:¶
- Optional members:¶
- endpoint : string¶
URL used to request credentials from the ECS container metadata service.
When unset, ECS credentials are sourced from the environment.
- auth_token_file : string¶
Path to a file containing the authorization token to include in an ECS credentials query.
The file is read each time the credentials are requested.
- json Context.aws_credentials/environment : object¶
Sources credentials from the environment variables:
- AWS_ACCESS_KEY_ID for the access key ID.
- AWS_SECRET_ACCESS_KEY for the secret access key.
- AWS_SESSION_TOKEN for the session token.
- Extends:¶
- json Context.aws_credentials/imds : object¶
Sources credentials from the EC2 instance metadata service (IMDS).
- Extends:¶
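To illustrate how the type member selects a credentials provider, here is a small Python sketch that builds one aws_credentials object per subtype listed above. It assumes that the value of type matches the subtype name (default, profile, ecs, environment, imds); the profile name and file path are placeholder values.

```python
import json

# One example object per provider; "type" is assumed to equal the subtype name.
credentials_examples = {
    "default": {"type": "default"},
    "profile": {
        "type": "profile",
        "profile": "default",
        # Hypothetical absolute path, shown only to illustrate the member.
        "credentials_file": "/home/user/.aws/credentials",
    },
    "ecs": {"type": "ecs"},
    "environment": {"type": "environment"},
    "imds": {"type": "imds"},
}

# A context selecting the profile-based provider.
context_spec = {"aws_credentials": credentials_examples["profile"]}
print(json.dumps(context_spec, indent=2))
```

Members other than type are specific to each provider, so an object such as {"type": "imds"} carries no profile or file settings.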
- json KvStoreUrl/s3 : object¶
s3:// KvStore URL scheme
AWS S3 key-value stores may be specified using the s3://bucket/path URL syntax, as supported by aws s3.
Examples
URL representation | JSON representation
"s3://my-bucket" | {"driver": "s3", "bucket": "my-bucket"}
"s3://my-bucket/path/to/dataset" | {"driver": "s3", "bucket": "my-bucket", "path": "path/to/dataset"}
- Extends:¶
KvStoreUrl: URL representation of a key-value store.
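The correspondence between the URL and JSON representations in the examples above can be sketched with the Python standard library. The function s3_url_to_spec is a hypothetical helper written for illustration; it is not part of TensorStore and ignores details such as percent-encoding and query strings.

```python
from urllib.parse import urlsplit

def s3_url_to_spec(url):
    """Convert an s3://bucket/path URL to the equivalent JSON spec (illustrative only)."""
    parts = urlsplit(url)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"not an s3:// URL: {url!r}")
    spec = {"driver": "s3", "bucket": parts.netloc}
    # parts.path is either empty or starts with "/"; drop that single separator.
    path = parts.path[1:]
    if path:
        spec["path"] = path
    return spec

print(s3_url_to_spec("s3://my-bucket"))
print(s3_url_to_spec("s3://my-bucket/path/to/dataset"))
```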
Authentication¶
The s3 driver can access buckets that allow public access without credentials. Otherwise, Amazon credentials are required:
- Credentials may be obtained from the environment. Set the AWS_ACCESS_KEY_ID environment variable, optionally along with the AWS_SECRET_ACCESS_KEY environment variable and the AWS_SESSION_TOKEN environment variable, as they would be used by the AWS CLI.
- Credentials may be obtained from the default user credentials file, when found at ~/.aws/credentials, or the file specified by the environment variable AWS_SHARED_CREDENTIALS_FILE, along with a profile from the schema, or as indicated by the AWS_PROFILE environment variable.
- Credentials may be retrieved from the EC2 Instance Metadata Service (IMDS) when it is available.
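When debugging authentication, it can help to see which of the settings described above are present in the current environment. The following Python sketch only inspects environment variables and reports the file path and profile that the description implies. The helper credential_hints is hypothetical and does not reproduce the driver's actual resolution logic or precedence.

```python
import os

def credential_hints(env, home="~"):
    """Summarize credential-related settings found in `env` (illustrative only)."""
    return {
        # True when an access key is supplied directly through the environment.
        "has_env_access_key": "AWS_ACCESS_KEY_ID" in env,
        "has_env_session_token": "AWS_SESSION_TOKEN" in env,
        # Credentials file: explicit override, else the default location.
        "credentials_file": env.get(
            "AWS_SHARED_CREDENTIALS_FILE", f"{home}/.aws/credentials"
        ),
        # Profile: explicit override, else the profile named "default".
        "profile": env.get("AWS_PROFILE", "default"),
    }

# Inspect the real process environment.
print(credential_hints(os.environ))
```

Passing a plain dictionary instead of os.environ makes the helper easy to exercise in isolation, for example credential_hints({"AWS_PROFILE": "dev"}).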
- AWS_ACCESS_KEY_ID¶
Specifies an AWS access key associated with an IAM account. See <https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html>
- AWS_SECRET_ACCESS_KEY¶
Specifies the secret key associated with the access key. This is essentially the “password” for the access key. See <https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html>
- AWS_SESSION_TOKEN¶
Specifies the session token value that is required if you are using temporary security credentials that you retrieved directly from AWS STS operations. See <https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html>
- AWS_PROFILE¶
Specifies the name of the AWS CLI profile with the credentials and options to use. This can be the name of a profile stored in a credentials or config file, or the value default to use the default profile.
If defined, this environment variable overrides the behavior of using the profile named [default] in the credentials file. See <https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html>
- AWS_SHARED_CREDENTIALS_FILE¶
Specifies the location of the file that the AWS CLI uses to store access keys. The default path is ~/.aws/credentials. See <https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html>
- AWS_CONFIG_FILE¶
Specifies the location of the file that the AWS CLI uses to store config. The default path is ~/.aws/config. See <https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html>
- AWS_EC2_METADATA_SERVICE_ENDPOINT¶
Overrides the default EC2 Instance Metadata Service (IMDS) endpoint of http://169.254.169.254. This must be a valid URI, and should respond to the AWS IMDS API endpoints. See <https://docs.aws.amazon.com/sdkref/latest/guide/feature-imds-credentials.html>
- TENSORSTORE_S3_REQUEST_CONCURRENCY¶
Specifies the concurrency level used by the shared Context.s3_request_concurrency resource. Defaults to 32.
- TENSORSTORE_S3_USE_CONDITIONAL_WRITE¶
Enables conditional writes for the S3 driver. This is experimental and may be changed or removed in the future.