TileDB C++ API Reference¶

Context¶

class Context¶

A TileDB context wraps a TileDB storage manager “instance.” Most objects and functions will require a Context.

Internal error handling is also defined by the Context; the default error handler throws a TileDBError with a specific message.

Example:

tiledb::Context ctx;
// Use ctx when creating other objects:
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);

// Set a custom error handler:
ctx.set_error_handler([](const std::string &msg) {
    std::cerr << msg << std::endl;
});

Public Types

enum class DataProtocol : uint8_t¶

Data protocol version enum.

Values:

enumerator v2¶: Data protocol v2 (legacy)

enumerator v3¶: Data protocol v3 (TileDB 3.0+)

Public Functions

inline Context()¶

Constructor. Creates a TileDB Context with default configuration.

Throws:: TileDBError – if construction fails

inline explicit Context(const Config &config)¶

Constructor. Creates a TileDB context with the given configuration.

Throws:: TileDBError – if construction fails

inline Context(tiledb_ctx_t *ctx, bool own = true)¶

Constructor. Creates a TileDB context from the given pointer.

Parameters:: own=true – If false, disables underlying cleanup upon destruction.
Throws:: TileDBError – if construction fails

inline void handle_error(int rc) const¶

Error handler for the TileDB C API calls. Throws an exception in case of error.

Parameters:: rc – If != TILEDB_OK, calls error handler

inline std::string get_last_error_message() const noexcept¶

Get the message of the last error that occurred.

Returns:: The last error message

inline std::shared_ptr<tiledb_ctx_t> ptr() const¶: Returns the C TileDB context object.

inline Context &set_error_handler(const std::function<void(const std::string&)> &fn)¶

Sets the error handler callback. If none is set, the default_error_handler is used. The callback accepts an error message.

Parameters:: fn – Error handler callback function
Returns:: Reference to this Context
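
As a rough illustration of how handle_error and set_error_handler relate, the following standalone sketch models the dispatch using only the standard library. The names MiniContext and kOk are hypothetical stand-ins and are not part of TileDB; the real Context obtains the error message from the C API rather than taking it as an argument.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for TILEDB_OK; the real constant comes from the C API.
constexpr int kOk = 0;

// Hypothetical model of a context that routes failures to a callback.
class MiniContext {
 public:
  // By default, errors become exceptions, mirroring the documented default.
  MiniContext() : handler_(default_error_handler) {}

  // Replaces the callback and returns *this so calls can be chained.
  MiniContext& set_error_handler(std::function<void(const std::string&)> fn) {
    assert(fn && "error handler must be callable");
    handler_ = std::move(fn);
    return *this;
  }

  // Invokes the callback only when rc signals a failure.
  void handle_error(int rc, const std::string& msg) const {
    if (rc != kOk) handler_(msg);
  }

  // Default callback: throw an exception carrying the message.
  static void default_error_handler(const std::string& msg) {
    throw std::runtime_error(msg);
  }

 private:
  std::function<void(const std::string&)> handler_;
};
```

In this model, installing a logging callback means a failing return code is reported through the callback instead of being thrown, which is the trade-off a custom handler makes.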

inline Config config() const¶: Returns a copy of the configuration of the context.

inline bool is_supported_fs(tiledb_filesystem_t fs) const¶

Return true if the given filesystem backend is supported.

Example:

tiledb::Context ctx;
bool s3_supported = ctx.is_supported_fs(TILEDB_S3);

Parameters:: fs – Filesystem to check

inline void cancel_tasks() const¶: Cancels all background or async tasks associated with this context.

inline void set_tag(const std::string &key, const std::string &value)¶: Sets a string/string KV tag on the context.

inline DataProtocol data_protocol(const std::string &uri) const¶

Returns the data protocol version for the given URI.

Example:

tiledb::Context ctx;
auto data_protocol = ctx.data_protocol("tiledb://namespace/array");

Parameters:: uri – The URI to check.
Returns:: The data protocol version.

inline std::string stats()¶: Returns a JSON-formatted string of the stats.

Public Static Functions

static inline void default_error_handler(const std::string &msg)¶

The default error handler callback.

Throws:: TileDBError – with the error message

Config¶

class Config¶

Carries configuration parameters for a context.

Example:

Config conf;
conf["vfs.s3.region"] = "us-east-1";
conf["vfs.s3.use_virtual_addressing"] = "true";
Context ctx(conf);
// array operations with ctx

Public Functions

inline explicit Config(const std::string &filename)¶

Constructor that takes as input a filename (URI) that stores the config parameters. The file must have the following (text) format:

{parameter} {value}

Anything following a # character is considered a comment and, thus, is ignored.

See Config::set for the various TileDB config parameters and allowed values.

Parameters:: filename – The name of the file where the parameters will be read from.
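
To make the file format concrete, here is a minimal standard-library sketch of a reader for "{parameter} {value}" lines with # comments. The function parse_config_text is hypothetical and is not how TileDB parses config files; it also assumes that parameter and value are separated by whitespace and that the value is a single token.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Parses "{parameter} {value}" lines; text after '#' is treated as a comment.
// Assumption: the value is a single whitespace-free token.
std::map<std::string, std::string> parse_config_text(std::istream& in) {
  std::map<std::string, std::string> params;
  std::string line;
  while (std::getline(in, line)) {
    // Drop the comment part, if any.
    const auto hash = line.find('#');
    if (hash != std::string::npos) line.erase(hash);
    assert(line.find('#') == std::string::npos);

    std::istringstream fields(line);
    std::string key, value;
    if (!(fields >> key)) continue;  // blank or comment-only line
    if (!(fields >> value)) {
      throw std::invalid_argument("missing value for parameter: " + key);
    }
    params[key] = value;
  }
  return params;
}
```

Feeding it the text "vfs.s3.scheme https # use TLS" yields a single entry mapping vfs.s3.scheme to https.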

inline explicit Config(tiledb_config_t **config)¶: Constructor from a C config object.

inline explicit Config(const std::map<std::string, std::string> &config)¶

Constructor that takes as input an STL map that stores the config parameters.

Parameters:: config – Map from parameter name to parameter value.

inline explicit Config(const std::unordered_map<std::string, std::string> &config)¶

Constructor that takes as input an STL unordered_map that stores the config parameters.

Parameters:: config – Unordered map from parameter name to parameter value.

inline void save_to_file(const std::string filename)¶: Saves the config parameters to a (local) text file.

inline bool operator==(const Config &rhs) const¶: Compares configs for equality.

inline bool operator!=(const Config &rhs) const¶: Compares configs for inequality.

inline std::shared_ptr<tiledb_config_t> ptr() const¶: Returns the pointer to the TileDB C config object.

inline Config &set(const std::string &param, const std::string &value)¶

Sets a config parameter.

sm.allow_separate_attribute_writes Experimental Allow separate attribute write queries. Default: false
sm.allow_updates_experimental Experimental Allow update queries. Experimental for testing purposes, do not use. Default: false
sm.dedup_coords If true, cells with duplicate coordinates will be removed during sparse fragment writes. Note that ties during deduplication are broken arbitrarily. Also note that this check means that it will take longer to perform the write operation. Default: false
sm.check_coord_dups This is applicable only if sm.dedup_coords is false. If true, an error will be thrown if there are cells with duplicate coordinates during sparse fragment writes. If false and there are duplicates, the duplicates will be written without errors. Note that this check is much lighter weight than the coordinate deduplication check enabled by sm.dedup_coords. Default: true
sm.check_coord_oob If true, an error will be thrown if there are cells with coordinates lying outside the domain during sparse fragment writes. Default: true
sm.read_range_oob If error, read ranges are checked against the dimension’s domain and an out-of-bounds range raises an error. If warn, the ranges will be capped at the dimension’s domain and a warning logged. Default: warn
sm.check_global_order Checks if the coordinates obey the global array order. Applicable only to sparse writes in global order. Default: true
sm.merge_overlapping_ranges_experimental If true, merge overlapping Subarray ranges. Else, overlapping ranges will not be merged and multiplicities will be returned. Experimental for testing purposes, do not use. Default: true
sm.enable_signal_handlers Determines whether or not TileDB will install signal handlers. Default: true
sm.compute_concurrency_level Upper-bound on number of threads to allocate for compute-bound tasks. Default: # cores
sm.io_concurrency_level Upper-bound on number of threads to allocate for IO-bound tasks. Default: # cores
sm.vacuum.mode The vacuuming mode, one of commits (remove only consolidated commit files), fragments (remove only consolidated fragments), fragment_meta (remove only consolidated fragment metadata), array_meta (remove only consolidated array metadata files), or group_meta (remove only consolidated group metadata files). Default: fragments
sm.consolidation.mode The consolidation mode, one of commits (consolidate all commit files), fragments (consolidate all fragments), fragment_meta (consolidate only fragment metadata footers to a single file), array_meta (consolidate array metadata only), or group_meta (consolidate group metadata only). Default: “fragments”
sm.consolidation.amplification The factor by which the size of the dense fragment resulting from consolidating a set of fragments (containing at least one dense fragment) can be amplified. This is important when the union of the non-empty domains of the fragments to be consolidated have a lot of empty cells, which the consolidated fragment will have to fill with the special fill value (since the resulting fragment is dense). Default: 1.0
sm.consolidation.buffer_size Deprecated The size (in bytes) of the attribute buffers used during consolidation. Default: 50,000,000
sm.consolidation.max_fragment_size Experimental The size (in bytes) of the maximum on-disk fragment size that will be created by consolidation. When it is reached, consolidation will continue the operation in a new fragment. The result will be multiple fragments, but with separate MBRs.
sm.consolidation.steps The number of consolidation steps to be performed when executing the consolidation algorithm. Default: UINT32_MAX
sm.consolidation.purge_deleted_cells Experimental Purge deleted cells from the consolidated fragment or not. Default: false
sm.consolidation.step_min_frags The minimum number of fragments to consolidate in a single step. Default: UINT32_MAX
sm.consolidation.step_max_frags The maximum number of fragments to consolidate in a single step. Default: UINT32_MAX
sm.consolidation.step_size_ratio The size ratio that two (“adjacent”) fragments must satisfy to be considered for consolidation in a single step. Default: 0.0
sm.consolidation.timestamp_start Experimental When set, an array will be consolidated between this value and sm.consolidation.timestamp_end (inclusive). Only for fragments and array_meta consolidation mode. Default: 0
sm.consolidation.timestamp_end Experimental When set, an array will be consolidated between sm.consolidation.timestamp_start and this value (inclusive). Only for fragments and array_meta consolidation mode. Default: UINT64_MAX
sm.encryption_key The key for encrypted arrays. Default: “”
sm.encryption_type The type of encryption used for encrypted arrays. Default: “NO_ENCRYPTION”
sm.enumerations_max_size Maximum in memory size for an enumeration. If the enumeration is var sized, the size will include the data and the offsets. Default: 10MB
sm.enumerations_max_total_size Maximum in memory size for all enumerations. If the enumeration is var sized, the size will include the data and the offsets. Default: 50MB
sm.max_tile_overlap_size Maximum size for the tile overlap structure which holds information about which tiles are covered by ranges. Only used in dense reads and legacy reads. Default: 300MB
sm.memory_budget The memory budget for tiles of fixed-sized attributes (or offsets for var-sized attributes) to be fetched during reads. Default: 5GB
sm.memory_budget_var The memory budget for tiles of var-sized attributes to be fetched during reads. Default: 10GB
sm.var_offsets.bitsize The size of offsets in bits to be used for offset buffers of var-sized attributes. Default: 64
sm.var_offsets.extra_element Add an extra element to the end of the offsets buffer of var-sized attributes which will point to the end of the values buffer. Default: false
sm.var_offsets.mode The offsets format (bytes or elements) to be used for var-sized attributes. Default: bytes
sm.query.dense.reader Which reader to use for dense queries. “refactored” or “legacy”. Default: refactored
sm.query.dense.qc_coords_mode Dense configuration that allows returning only the coordinates of the cells that match a query condition, without any attribute data. Default: “false”
sm.query.sparse_global_order.reader Which reader to use for sparse global order queries. “refactored” or “legacy”. Default: refactored
sm.query.sparse_global_order.preprocess_tile_merge Experimental for testing purposes, do not use. Performance configuration for sparse global order read queries. If nonzero, prior to loading the first tiles, the reader will run a preprocessing step to arrange tiles from all fragments in a single globally ordered list. This is expected to improve performance when there are many fragments or when the distribution in space of the tiles amongst the fragments is skewed. The value of the parameter specifies the amount of work per parallel task. Default: “32768”
sm.query.sparse_unordered_with_dups.reader Which reader to use for sparse unordered with dups queries. “refactored” or “legacy”. Default: refactored
sm.skip_checksum_validation Skip checksum validation on reads for the md5 and sha256 filters. Default: “false”
sm.mem.malloc_trim Should malloc_trim be called on context and query destruction? This might reduce residual memory usage. Default: true
sm.mem.tile_upper_memory_limit Experimental This is the upper memory limit that is used when loading tiles. For now it is only used in the dense reader but will eventually be used by all readers. The readers using this value will use it to limit the amount of tile data that is brought into memory at once, so that no performance penalties are incurred during memory movement operations. It is a soft limit: if a single tile doesn’t fit within it, that tile may still be loaded as long as it fits within sm.mem.total_budget. Default: 1GB
sm.mem.total_budget Memory budget for readers and writers. Default: 10GB
sm.mem.consolidation.buffers_weight Weight used to split sm.mem.total_budget and assign to the consolidation buffers. The budget is split across 3 values, sm.mem.consolidation.buffers_weight, sm.mem.consolidation.reader_weight and sm.mem.consolidation.writer_weight. Default: 1
sm.mem.consolidation.reader_weight Weight used to split sm.mem.total_budget and assign to the reader query. The budget is split across 3 values, sm.mem.consolidation.buffers_weight, sm.mem.consolidation.reader_weight and sm.mem.consolidation.writer_weight. Default: 3
sm.mem.consolidation.writer_weight Weight used to split sm.mem.total_budget and assign to the writer query. The budget is split across 3 values, sm.mem.consolidation.buffers_weight, sm.mem.consolidation.reader_weight and sm.mem.consolidation.writer_weight. Default: 2
sm.mem.reader.sparse_global_order.ratio_coords Ratio of the budget allocated for coordinates in the sparse global order reader. Default: 0.5
sm.mem.reader.sparse_global_order.ratio_tile_ranges Ratio of the budget allocated for tile ranges in the sparse global order reader. Default: 0.1
sm.mem.reader.sparse_global_order.ratio_array_data Ratio of the budget allocated for array data in the sparse global order reader. Default: 0.1
sm.mem.reader.sparse_unordered_with_dups.ratio_coords Ratio of the budget allocated for coordinates in the sparse unordered with duplicates reader. Default: 0.5
sm.mem.reader.sparse_unordered_with_dups.ratio_tile_ranges Ratio of the budget allocated for tile ranges in the sparse unordered with duplicates reader. Default: 0.1
sm.mem.reader.sparse_unordered_with_dups.ratio_array_data Ratio of the budget allocated for array data in the sparse unordered with duplicates reader. Default: 0.1
vfs.read_ahead_size The maximum byte size to read-ahead from the backend. Default: 102400
sm.group.timestamp_start The start timestamp used for opening the group. Default: 0
sm.group.timestamp_end The end timestamp used for opening the group. Also used for the write timestamp if set. Default: UINT64_MAX
sm.partial_tile_offsets_loading Experimental If true tile offsets can be partially loaded and unloaded by the readers. Default: false
sm.fragment_info.preload_mbrs If true MBRs will be loaded at the same time as the rest of fragment info, otherwise they will be loaded lazily when some info related to MBRs is requested by the user. Default: false
sm.partial_tile_offset_loading Experimental If true tile offsets can be partially loaded and unloaded by the readers. Default: false
ssl.ca_file The path to CA certificate to use when validating server certificates. Applies to all SSL/TLS connections. This option might be ignored on platforms that have native certificate stores like Windows. Default: “”
ssl.ca_path The path to a directory with CA certificates to use when validating server certificates. Applies to all SSL/TLS connections. This option might be ignored on platforms that have native certificate stores like Windows. Default: “”
ssl.verify Whether to verify the server’s certificate. Applies to all SSL/TLS connections. Disabling verification is insecure and should only be used for testing purposes. Default: true
vfs.read_ahead_cache_size The total maximum size of the read-ahead cache, which is an LRU. Default: 10485760
vfs.log_operations Enables logging all VFS operations in trace mode. Default: false
vfs.min_parallel_size The minimum number of bytes in a parallel VFS operation (except parallel S3 writes, which are controlled by vfs.s3.multipart_part_size). Default: 10MB
vfs.max_batch_size The maximum number of bytes in a VFS read operation. Default: 100MB
vfs.min_batch_size The minimum number of bytes in a VFS read operation. Default: 20MB
vfs.min_batch_gap The minimum number of bytes between two VFS read batches. Default: 500KB
vfs.read_logging_mode Log read operations at varying levels of verbosity. Default: “”. Possible values:
- An empty string disables read logging.
- Log each fragment read.
- Log each individual fragment file read.
- Log all files read.
- Log all files with offset and length parameters.
- Log all files with offset and length parameters on every read, not just the first read. On large arrays the read cache may get large, so this trades off RAM usage against increased log verbosity.
vfs.file.posix_file_permissions Permissions to use for posix file system with file creation. Default: 644
vfs.file.posix_directory_permissions Permissions to use for posix file system with directory creation. Default: 755
vfs.azure.storage_account_name Set the name of the Azure Storage account to use. Default: “”
vfs.azure.storage_account_key Set the Shared Key to authenticate to Azure Storage. Default: “”
vfs.azure.storage_sas_token Set the Azure Storage SAS (shared access signature) token to use. If this option is set along with vfs.azure.blob_endpoint, the latter must not include a SAS token. Default: “”
vfs.azure.blob_endpoint Set the default Azure Storage Blob endpoint. If not specified, it will take a value of <account-name>.blob.core.windows.net, where <account-name> is the value of the vfs.azure.storage_account_name option. This means that at least one of these two options must be set (or both if shared key authentication is used). Default: “”
vfs.azure.is_data_lake_endpoint Sets whether the Azure Storage account is known to have hierarchical namespace enabled or disabled. This option can be used to reduce latency when performing the first Azure request. If not specified, the account’s capabilities will be automatically detected. Default: <unset>
vfs.azure.block_list_block_size The block size (in bytes) used in Azure blob block list writes. Any uint64_t value is acceptable. Note: vfs.azure.block_list_block_size * vfs.azure.max_parallel_ops bytes will be buffered before issuing block uploads in parallel. Default: “5242880”
vfs.azure.max_parallel_ops The maximum number of Azure backend parallel operations. Default: sm.io_concurrency_level
vfs.azure.use_block_list_upload Determines if the Azure backend can use chunked block uploads. Default: “true”
vfs.azure.max_retries The maximum number of times to retry an Azure network request. Default: 5
vfs.azure.retry_delay_ms The minimum permissible delay between Azure network request retry attempts, in milliseconds. Default: 800
vfs.azure.max_retry_delay_ms The maximum permissible delay between Azure network request retry attempts, in milliseconds. Default: 60000
vfs.gcs.endpoint The GCS endpoint. Default: “”
vfs.gcs.project_id Set the GCS project ID to create new buckets to. Not required unless you are going to use the VFS to create buckets. Default: “”
vfs.gcs.service_account_key Experimental Set the JSON string with GCS service account key. Takes precedence over vfs.gcs.workload_identity_configuration if both are specified. If neither is specified, Application Default Credentials will be used. Default: “”
vfs.gcs.workload_identity_configuration Experimental Set the JSON string with Workload Identity Federation configuration. vfs.gcs.service_account_key takes precedence over this if both are specified. If neither is specified, Application Default Credentials will be used. Default: “”
vfs.gcs.impersonate_service_account Experimental Set the GCS service account to impersonate. A chain of impersonated accounts can be formed by specifying many service accounts, separated by a comma. Default: “”
vfs.gcs.multi_part_size The part size (in bytes) used in GCS multi part writes. Any uint64_t value is acceptable. Note: vfs.gcs.multi_part_size * vfs.gcs.max_parallel_ops bytes will be buffered before issuing part uploads in parallel. Default: “5242880”
vfs.gcs.max_parallel_ops The maximum number of GCS backend parallel operations. Default: sm.io_concurrency_level
vfs.gcs.use_multi_part_upload Determines if the GCS backend can use chunked part uploads. Default: “true”
vfs.gcs.request_timeout_ms The maximum amount of time to retry network requests to GCS. Default: “3000”
vfs.gcs.max_direct_upload_size The maximum size in bytes of a direct upload to GCS. Ignored if vfs.gcs.use_multi_part_upload is set to true. Default: “10737418240”
vfs.s3.region The S3 region, if S3 is enabled. If empty, the region will be determined by the AWS SDK using sources such as environment variables, profile configuration, or instance metadata. Default: “”
vfs.s3.aws_access_key_id Set the AWS_ACCESS_KEY_ID. Default: “”
vfs.s3.aws_secret_access_key Set the AWS_SECRET_ACCESS_KEY. Default: “”
vfs.s3.aws_session_token Set the AWS_SESSION_TOKEN. Default: “”
vfs.s3.aws_role_arn Determines the role that we want to assume. Set the AWS_ROLE_ARN. Default: “”
vfs.s3.aws_external_id Third party access ID to your resources when assuming a role. Set the AWS_EXTERNAL_ID. Default: “”
vfs.s3.aws_load_frequency Session time limit when assuming a role. Set the AWS_LOAD_FREQUENCY. Default: “”
vfs.s3.aws_session_name (Optional) session name when assuming a role. Can be used for tracing and bookkeeping. Set the AWS_SESSION_NAME. Default: “”
vfs.s3.scheme The S3 scheme (http or https), if S3 is enabled. Default: https
vfs.s3.endpoint_override The S3 endpoint, if S3 is enabled. Default: “”
vfs.s3.use_virtual_addressing The S3 use of virtual addressing (true or false), if S3 is enabled. Default: true
vfs.s3.skip_init Skip Aws::InitAPI for the S3 layer (true or false) Default: false
vfs.s3.use_multipart_upload The S3 use of multi-part upload requests (true or false), if S3 is enabled. Default: true
vfs.s3.max_parallel_ops The maximum number of S3 backend parallel operations. Default: sm.io_concurrency_level
vfs.s3.multipart_part_size The part size (in bytes) used in S3 multipart writes. Any uint64_t value is acceptable. Note: vfs.s3.multipart_part_size * vfs.s3.max_parallel_ops bytes will be buffered before issuing multipart uploads in parallel. Default: 5MB
vfs.s3.ca_file Path to SSL/TLS certificate file to be used by cURL for S3 HTTPS encryption. Follows cURL conventions: https://curl.haxx.se/docs/manpage.html Default: “”
vfs.s3.ca_path Path to SSL/TLS certificate directory to be used by cURL for S3 HTTPS encryption. Follows cURL conventions: https://curl.haxx.se/docs/manpage.html Default: “”
vfs.s3.connect_timeout_ms The connection timeout in ms. Any long value is acceptable. Default: 10800
vfs.s3.connect_max_tries The maximum tries for a connection. Any long value is acceptable. Default: 5
vfs.s3.connect_scale_factor The scale factor for exponential backoff when connecting to S3. Any long value is acceptable. Default: 25
vfs.s3.custom_headers.* (Optional) Prefix for custom headers on s3 requests. For each custom header, use “vfs.s3.custom_headers.header_key” = “header_value” Optional. No Default
vfs.s3.logging_level The AWS SDK logging level. This is a process-global setting. The configuration of the most recently constructed context will set process state. Log files are written to the process working directory. Default: “Off”
vfs.s3.request_timeout_ms The request timeout in ms. Any long value is acceptable. Default: 3000
vfs.s3.requester_pays The requester pays for the S3 access charges. Default: false
vfs.s3.proxy_host The S3 proxy host. Default: “”
vfs.s3.proxy_port The S3 proxy port. Default: 0
vfs.s3.proxy_scheme The S3 proxy scheme. Default: “http”
vfs.s3.proxy_username The S3 proxy username. Note: this parameter is not serialized by tiledb_config_save_to_file. Default: “”
vfs.s3.proxy_password The S3 proxy password. Note: this parameter is not serialized by tiledb_config_save_to_file. Default: “”
vfs.s3.verify_ssl Enable HTTPS certificate verification. Default: true
vfs.s3.no_sign_request Make unauthenticated requests to s3. Default: false
vfs.s3.sse The server-side encryption algorithm to use. Supported non-empty values are “aes256” and “kms” (AWS key management service). Default: “”
vfs.s3.sse_kms_key_id The server-side encryption key to use if vfs.s3.sse == “kms” (AWS key management service). Default: “”
vfs.s3.storage_class The storage class to use for the newly uploaded S3 objects. The set of accepted values is found in the Aws::S3::Model::StorageClass enumeration. “NOT_SET” “STANDARD” “REDUCED_REDUNDANCY” “STANDARD_IA” “ONEZONE_IA” “INTELLIGENT_TIERING” “GLACIER” “DEEP_ARCHIVE” “OUTPOSTS” “GLACIER_IR” “SNOW” “EXPRESS_ONEZONE” Default: “NOT_SET”
vfs.s3.bucket_canned_acl Names of values found in Aws::S3::Model::BucketCannedACL enumeration. “NOT_SET” “private_” “public_read” “public_read_write” “authenticated_read” Default: “NOT_SET”
vfs.s3.object_canned_acl Names of values found in Aws::S3::Model::ObjectCannedACL enumeration. (The first 5 are the same as for “vfs.s3.bucket_canned_acl”.) “NOT_SET” “private_” “public_read” “public_read_write” “authenticated_read” (The following three items are found only in Aws::S3::Model::ObjectCannedACL.) “aws_exec_read” “owner_read” “bucket_owner_full_control” Default: “NOT_SET”
vfs.s3.config_source Force S3 SDK to only load config options from a set source. The supported options are auto (TileDB config options are considered first, then SDK-defined precedence: env vars, config files, ec2 metadata), config_files (forces SDK to only consider options found in aws config files), sts_profile_with_web_identity (force SDK to consider assume roles/sts from config files with support for web tokens, commonly used by EKS/ECS). Default: auto
vfs.s3.install_sigpipe_handler When set to true, the S3 SDK uses a handler that ignores SIGPIPE signals. Default: “true”
config.env_var_prefix Prefix of environment variables for reading configuration parameters. Default: “TILEDB_”
config.logging_level The logging level configured, possible values: “0”: fatal, “1”: error, “2”: warn, “3”: info, “4”: debug, “5”: trace. Default: “1” if the --enable-verbose bootstrap flag is provided, “0” otherwise
config.logging_format The logging format configured (DEFAULT or JSON) Default: “DEFAULT”
profile_name The name of the Profile to be used for REST configuration. Default: “”
profile_dir The directory where the user profiles are stored. Default: “”
rest.server_address URL for REST server to use for remote arrays. Default: “https://api.tiledb.com”
rest.server_serialization_format Serialization format to use for remote array requests (CAPNP or JSON). Default: “CAPNP”
rest.username Username for login to REST server. Default: “”
rest.password Password for login to REST server. Default: “”
rest.token Authentication token for REST server (used instead of username/password). Default: “”
rest.resubmit_incomplete If true, incomplete queries received from server are automatically resubmitted before returning to user control. Default: “true”
rest.ignore_ssl_validation Have curl ignore ssl peer and host validation for REST server. Default: false
rest.creation_access_credentials_name The name of the registered access key to use for creation of the REST server. Default: no default set
rest.retry_http_codes CSV list of HTTP status codes for which a REST request is automatically retried. Default: “503”
rest.retry_count Number of times to retry failed REST requests. Default: 25
rest.retry_initial_delay_ms Initial delay in milliseconds to wait until retrying a REST request. Default: 500
rest.retry_delay_factor The delay factor to exponentially wait until further retries of a failed REST request. Default: 1.25
rest.curl.retry_errors If true, any curl requests that returned an error will be retried. Default: true
rest.curl.verbose Set curl to run in verbose mode for REST requests. curl will print to stdout with this option. Default: false
rest.curl.tcp_keepalive Set curl to use TCP keepalive for REST requests. Default: true
rest.load_metadata_on_array_open If true, array metadata will be loaded and sent to server together with the open array. Default: true
rest.load_non_empty_domain_on_array_open If true, array non empty domain will be loaded and sent to server together with the open array. Default: true
rest.load_enumerations_on_array_open If true, enumerations will be loaded for the latest array schema and sent to server together with the open array. Default: false
rest.load_enumerations_on_array_open_all_schemas If true, enumerations will be loaded for all array schemas and sent to server together with the open array. Default: false
rest.use_refactored_array_open If true, the new REST routes and APIs for opening an array will be used. Default: true
rest.use_refactored_array_open_and_query_submit If true, the new REST routes and APIs for opening an array and submitting a query will be used. Default: true
rest.curl.buffer_size Set curl buffer size for REST requests. Default: 524288 (512KB)
rest.capnp_traversal_limit CAPNP traversal limit used in the deserialization of messages (bytes). Default: 2147483648 (2GB)
rest.custom_headers.* (Optional) Prefix for custom headers on REST requests. For each custom header, use “rest.custom_headers.header_key” = “header_value” Optional. No Default
rest.payer_namespace The namespace that should be charged for the request. Default: no default set
filestore.buffer_size Specifies the size in bytes of the internal buffers used in the filestore API. The size should be bigger than the minimum tile size filestore currently supports, which is 1024 bytes. Default: 100MB
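
As a worked example of the sm.mem.consolidation.*_weight parameters listed above, the sketch below splits a total budget in proportion to three weights. The proportional rule, the ConsolidationBudget struct, and the split_budget function are assumptions made for illustration; the parameter descriptions only state that the budget is split across the three values, so the library's exact arithmetic may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the three shares of the total budget.
struct ConsolidationBudget {
  std::uint64_t buffers;
  std::uint64_t reader;
  std::uint64_t writer;
};

// Splits total_budget in proportion to the three weights. Integer division is
// used, so any remainder is left unassigned in this sketch.
ConsolidationBudget split_budget(std::uint64_t total_budget,
                                 std::uint64_t buffers_weight,
                                 std::uint64_t reader_weight,
                                 std::uint64_t writer_weight) {
  const std::uint64_t total_weight =
      buffers_weight + reader_weight + writer_weight;
  assert(total_weight > 0 && "at least one weight must be nonzero");
  const std::uint64_t unit = total_budget / total_weight;
  return {unit * buffers_weight, unit * reader_weight, unit * writer_weight};
}
```

Under this assumption, the default weights of 1, 3, and 2 would give the consolidation buffers one sixth of sm.mem.total_budget, the reader query one half, and the writer query one third.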

inline std::string get(const std::string &param) const¶

Get a parameter from the configuration by key.

Parameters:: param – Name of configuration parameter
Throws:: TileDBError – if the parameter does not exist
Returns:: Value of configuration parameter

inline bool contains(const std::string_view &param) const¶

Check if a configuration parameter exists.

Parameters:: param – Name of configuration parameter
Returns:: true if the parameter exists, false otherwise

inline impl::ConfigProxy operator[](const std::string &param)¶

Operator that enables setting parameters with [].

Example:

Config conf;
conf["vfs.s3.region"] = "us-east-1";
conf["vfs.s3.use_virtual_addressing"] = "true";
Context ctx(conf);

Parameters:: param – Name of parameter to set
Returns:: “Proxy” object supporting assignment.

inline Config &unset(const std::string &param)¶

Resets a config parameter to its default value.

Parameters:: param – Name of parameter
Returns:: Reference to this Config instance
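
The interplay between set, get, contains, and unset can be pictured with a small standard-library model in which user-set values are layered over a table of defaults. MiniConfig is a hypothetical class written for this illustration, and std::out_of_range stands in for TileDBError; it is not a description of how Config is implemented.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical model: user-set values layered over a table of defaults.
class MiniConfig {
 public:
  explicit MiniConfig(std::map<std::string, std::string> defaults)
      : defaults_(std::move(defaults)) {}

  // Stores an override and returns *this so calls can be chained.
  MiniConfig& set(const std::string& param, const std::string& value) {
    assert(!param.empty() && "parameter name must not be empty");
    overrides_[param] = value;
    return *this;
  }

  // Dropping the override makes the default (if any) visible again.
  MiniConfig& unset(const std::string& param) {
    overrides_.erase(param);
    return *this;
  }

  bool contains(const std::string& param) const {
    return overrides_.count(param) > 0 || defaults_.count(param) > 0;
  }

  // Throws when the parameter is unknown, like the documented get().
  std::string get(const std::string& param) const {
    auto it = overrides_.find(param);
    if (it != overrides_.end()) return it->second;
    it = defaults_.find(param);
    if (it == defaults_.end()) {
      throw std::out_of_range("no such parameter: " + param);
    }
    return it->second;
  }

 private:
  std::map<std::string, std::string> defaults_;
  std::map<std::string, std::string> overrides_;
};
```

Checking contains before calling get avoids the exception path when a parameter may be absent.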

inline iterator begin(const std::string &prefix)¶

Iterate over params starting with a prefix.

Example:

tiledb::Config config;
for (auto it = config.begin("vfs"), ite = config.end(); it != ite; ++it) {
  std::string name = it->first, value = it->second;
}

Parameters:: prefix – Prefix to iterate over
Returns:: iterator

inline iterator begin()¶

Iterate over all params.

Example:

tiledb::Config config;
for (auto it = config.begin(), ite = config.end(); it != ite; ++it) {
  std::string name = it->first, value = it->second;
}

Returns:: iterator

inline iterator end()¶: End iterator.

Public Static Functions

static inline void free(tiledb_config_t *config)¶: Wrapper function for freeing a config C object.

Exceptions¶

struct TileDBError : public std::runtime_error¶

Exception indicating a TileDB error.

Subclassed by tiledb::AttributeError, tiledb::ProfileException, tiledb::SchemaMismatch, tiledb::TypeError

struct TypeError : public tiledb::TileDBError ¶

Exception indicating a mismatch between a static and runtime type.

Subclassed by tiledb::FilterOptionTypeError< Expected, Actual >

struct SchemaMismatch : public tiledb::TileDBError ¶: Exception indicating the requested operation does not match array schema

struct AttributeError : public tiledb::TileDBError ¶: Error related to attributes

Dimension¶

class Dimension¶

Describes one dimension of an Array. The dimension consists of a type, lower and upper bound, and tile-extent describing the memory ordering. Dimensions are added to a Domain.

Example:

tiledb::Context ctx;
tiledb::Domain domain(ctx);
// Create a dimension with inclusive domain [0,1000] and tile extent 100.
domain.add_dimension(Dimension::create<int32_t>(ctx, "d", {{0, 1000}}, 100));

Note: as laid out in the Storage Format, the following Datatypes are not valid for Dimension: TILEDB_CHAR, TILEDB_BLOB, TILEDB_GEOM_WKB, TILEDB_GEOM_WKT, TILEDB_BOOL, TILEDB_STRING_UTF8, TILEDB_STRING_UTF16, TILEDB_STRING_UTF32, TILEDB_STRING_UCS2, TILEDB_STRING_UCS4, TILEDB_ANY

Public Functions

inline unsigned cell_val_num() const¶: Returns number of values of one cell on this dimension. For variable-sized dimensions returns TILEDB_VAR_NUM.

inline Dimension &set_cell_val_num(unsigned num)¶: Sets the number of values per coordinate.

inline FilterList filter_list() const¶: Returns a copy of the FilterList of the dimension. To change the filter list, use set_filter_list().

inline Dimension &set_filter_list(const FilterList &filter_list)¶: Sets the dimension filter list, which is an ordered list of filters that will be used to process and/or transform the coordinate data (such as compression).

inline const std::string name() const¶: Returns the name of the dimension.

inline tiledb_datatype_t type() const¶: Returns the dimension datatype.

template<typename T> inline std::pair<T, T> domain() const¶

Returns the domain of the dimension.

Template Parameters:: T – Domain datatype
Returns:: Pair of [lower, upper] inclusive bounds.

inline std::string domain_to_str() const¶

Returns a string representation of the domain.

Throws:: TileDBError – if the domain cannot be stringified (TILEDB_ANY)

template<typename T> inline T tile_extent() const¶: Returns the tile extent of the dimension.

inline std::string tile_extent_to_str() const¶

Returns a string representation of the extent.

Throws:: TileDBError – if the tile extent cannot be stringified (TILEDB_ANY)

inline std::shared_ptr<tiledb_dimension_t> ptr() const¶: Returns a shared pointer to the C TileDB dimension object.

Public Static Functions

template<typename T> static inline Dimension create(const Context &ctx, const std::string &name, const std::array<T, 2> &domain, T extent)¶

Factory function for creating a new dimension with datatype T.

Example:

tiledb::Context ctx;
// Create a dimension with inclusive domain [0,1000] and tile extent 100.
auto dim = Dimension::create<int32_t>(ctx, "d", {{0, 1000}}, 100);

Template Parameters:

T – int, char, etc…

Parameters:

ctx – The TileDB context.
name – The dimension name.
domain – The dimension domain. A pair [lower,upper] of inclusive bounds.
extent – The tile extent on the dimension.

Returns:

A new Dimension object.
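
As an illustration of how a tile extent partitions an inclusive domain, the following is a minimal sketch using only the standard library. The helpers tile_count and tile_index are hypothetical names introduced here and are not part of the TileDB API; the sketch assumes an integer dimension with a positive extent.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of the TileDB API.

// Number of tiles needed to cover the inclusive domain [lower, upper].
inline std::uint64_t tile_count(std::int64_t lower, std::int64_t upper,
                                std::int64_t extent) {
  const std::uint64_t cells = static_cast<std::uint64_t>(upper - lower) + 1;
  const std::uint64_t ext = static_cast<std::uint64_t>(extent);
  return (cells + ext - 1) / ext;  // ceiling division
}

// Zero-based index of the tile that contains `coord` along the dimension.
inline std::uint64_t tile_index(std::int64_t coord, std::int64_t lower,
                                std::int64_t extent) {
  return static_cast<std::uint64_t>(coord - lower) /
         static_cast<std::uint64_t>(extent);
}
```

Under these assumptions, the domain [0, 1000] with tile extent 100 holds 1001 cells and is covered by 11 tiles, the last of which is only partially filled.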

template<typename T> static inline Dimension create(const Context &ctx, const std::string &name, const std::array<T, 2> &domain)¶

Factory function for creating a new dimension with datatype T and without specifying a tile extent.

Example:

tiledb::Context ctx;
// Create a dimension with inclusive domain [0,1000] and no tile extent.
auto dim = Dimension::create<int32_t>(ctx, "d", {{0, 1000}});

Template Parameters:

T – int, char, etc…

Parameters:

ctx – The TileDB context.
name – The dimension name.
domain – The dimension domain. A pair [lower,upper] of inclusive bounds.

Returns:

A new Dimension object.

static inline Dimension create(const Context &ctx, const std::string &name, tiledb_datatype_t datatype, const void *domain, const void *extent)¶

Factory function for creating a new dimension (non typechecked).

Parameters:

ctx – The TileDB context.
name – The dimension name.
datatype – The dimension datatype.
domain – The dimension domain. A pair [lower,upper] of inclusive bounds.
extent – The tile extent on the dimension.

Returns:

A new Dimension object.

Domain¶

class Domain¶

Represents the domain of an array.

A Domain defines the set of Dimension objects for a given array. The properties of a Domain derive from the underlying dimensions. A Domain is a component of an ArraySchema.

Example:

tiledb::Context ctx;
tiledb::Domain domain(ctx);

// Note the dimension bounds are inclusive.
auto d1 = tiledb::Dimension::create<int>(ctx, "d1", {-10, 10});
auto d2 = tiledb::Dimension::create<uint64_t>(ctx, "d2", {1, 10});
auto d3 = tiledb::Dimension::create<int>(ctx, "d3", {-100, 100});

domain.add_dimension(d1);
domain.add_dimension(d2); // Throws error, all dims must be same type
domain.add_dimension(d3);

domain.cell_num(); // (10 - -10 + 1) * (100 - -100 + 1) = 4221 max cells
domain.type(); // TILEDB_INT32, determined from the dimensions
domain.ndim(); // 2, d1 and d3

tiledb::ArraySchema schema(ctx, TILEDB_DENSE);
schema.set_domain(domain); // Set the array's domain

Note

Dimensions can only have signed or unsigned integral types; floating-point types are also allowed for sparse array domains.

Public Functions

inline const Context &context() const¶: Returns the context that the domain belongs to.

inline uint64_t cell_num() const¶

Returns the total number of cells in the domain. Throws an exception if the domain type is float32 or float64.

Throws:: TileDBError – if cell_num cannot be computed.
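
The cell count of an integer domain is the product of the sizes of its inclusive dimension ranges. The following is a minimal sketch of that arithmetic using only the standard library; cell_count is a hypothetical helper, not a TileDB function, and it does not check for overflow.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper for illustration only; not part of the TileDB API.
// Each pair is an inclusive [lower, upper] bound of one integer dimension.
inline std::uint64_t cell_count(
    const std::vector<std::pair<std::int64_t, std::int64_t>>& bounds) {
  std::uint64_t total = 1;
  for (const auto& b : bounds) {
    // An inclusive range [lower, upper] holds upper - lower + 1 cells.
    total *= static_cast<std::uint64_t>(b.second - b.first) + 1;
  }
  return total;
}
```

For example, bounds [-10, 10] and [-100, 100] give 21 * 201 = 4221 cells.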

inline tiledb_datatype_t type() const¶: Returns the domain type.

inline unsigned ndim() const¶: Returns the number of dimensions.

inline std::vector<Dimension> dimensions() const¶: Returns the current set of dimensions in the domain.

inline Dimension dimension(unsigned idx) const¶: Returns the dimension with the given index.

inline Dimension dimension(const std::string &name) const¶: Returns the dimension with the given name.

inline Domain &add_dimension(const Dimension &d)¶

Adds a new dimension to the domain.

Example:

tiledb::Context ctx;
tiledb::Domain domain(ctx);
auto d1 = tiledb::Dimension::create<int>(ctx, "d1", {-10, 10});
domain.add_dimension(d1);

Parameters:: d – Dimension to add
Returns:: Reference to this Domain

template<typename ...Args> inline Domain &add_dimensions(Args... dims)¶

Adds multiple dimensions to the domain.

Example:

tiledb::Context ctx;
tiledb::Domain domain(ctx);
auto d1 = tiledb::Dimension::create<int>(ctx, "d1", {-10, 10});
auto d2 = tiledb::Dimension::create<int>(ctx, "d2", {1, 10});
auto d3 = tiledb::Dimension::create<int>(ctx, "d3", {-100, 100});
domain.add_dimensions(d1, d2, d3);

Template Parameters:: Args – Variadic dimension datatype
Parameters:: dims – Dimensions to add
Returns:: Reference to this Domain.

inline bool has_dimension(const std::string &name) const¶

Checks if the domain has a dimension of the given name.

Parameters:: name – Name of dimension to check for
Returns:: True if the domain has a dimension of the given name.

inline std::shared_ptr<tiledb_domain_t> ptr() const¶: Returns a shared pointer to the C TileDB domain object.

Attribute¶

class Attribute¶

Describes an attribute of an Array cell.

An attribute specifies a name and datatype for a particular value in each array cell. There are 3 supported attribute types:

Fundamental types, such as char, int, double, uint64_t, etc.
Fixed sized arrays: T[N] or std::array<T, N>, where T is a fundamental type
Variable length data: std::string, std::vector<T> where T is a fundamental type

Fixed-size array types using POD types like std::array<T, N> are internally converted to byte-array attributes. E.g. an attribute of type std::array<float, 3> will be created as an attribute of type TILEDB_CHAR with cell_val_num sizeof(std::array<float, 3>).

Therefore, for fixed-length attributes it is recommended to use C-style arrays instead, e.g. float[3] instead of std::array<float, 3>.

Example:

tiledb::Context ctx;
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
auto a3 = tiledb::Attribute::create<float[3]>(ctx, "a3");

// Change compression scheme
tiledb::FilterList filters(ctx);
filters.add_filter({ctx, TILEDB_FILTER_BZIP2});
a1.set_filter_list(filters);

// Add attributes to a schema
tiledb::ArraySchema schema(ctx, TILEDB_DENSE);
schema.add_attributes(a1, a2, a3);

Public Functions

inline Attribute(const Context &ctx, const std::string &name, tiledb_datatype_t type)¶

Construct an attribute with a name and enumerated type. cell_val_num will be set to 1.

Parameters:

ctx – TileDB context
name – Name of attribute
type – Enumerated type of attribute

inline Attribute(const Context &ctx, const std::string &name, tiledb_datatype_t type, const FilterList &filter_list)¶: Construct an attribute with an enumerated type and given filter list.

inline std::string name() const¶: Returns the name of the attribute.

inline const Context &context() const¶: Returns the context that the attribute belongs to.

inline tiledb_datatype_t type() const¶: Returns the attribute datatype.

inline uint64_t cell_size() const¶

Returns the size (in bytes) of one cell on this attribute. For variable-sized attributes returns TILEDB_VAR_NUM.

Example:

tiledb::Context ctx;
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
auto a3 = tiledb::Attribute::create<float[3]>(ctx, "a3");
auto a4 = tiledb::Attribute::create<std::array<float, 3>>(ctx, "a4");
a1.cell_size();    // Returns sizeof(int)
a2.cell_size();    // Variable sized attribute, returns TILEDB_VAR_NUM
a3.cell_size();    // Returns 3 * sizeof(float)
a4.cell_size();    // Stored as byte array, returns sizeof(std::array<float, 3>)

inline unsigned cell_val_num() const¶

Returns number of values of one cell on this attribute. For variable-sized attributes returns TILEDB_VAR_NUM.

Example:

tiledb::Context ctx;
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
auto a3 = tiledb::Attribute::create<float[3]>(ctx, "a3");
auto a4 = tiledb::Attribute::create<std::array<float, 3>>(ctx, "a4");
a1.cell_val_num();   // Returns 1
a2.cell_val_num();   // Variable sized attribute, returns TILEDB_VAR_NUM
a3.cell_val_num();   // Returns 3
a4.cell_val_num();   // Stored as byte array, returns
                     // sizeof(std::array<float, 3>).
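
The mapping from a C++ type to a number of values per cell, as described above, can be sketched with a small type trait. This is an illustration under stated assumptions, not TileDB's implementation: values_per_cell and kVarNum are hypothetical names, and kVarNum merely stands in for TILEDB_VAR_NUM without assuming its actual value.

```cpp
#include <array>
#include <cstddef>
#include <limits>
#include <string>
#include <type_traits>
#include <vector>

// Stand-in for TILEDB_VAR_NUM; the real constant's value is not assumed here.
constexpr unsigned kVarNum = std::numeric_limits<unsigned>::max();

// Hypothetical trait: number of values per cell for an attribute of type T.
// Arithmetic types hold one value; any other trivially copyable type
// (such as std::array<T, N>) is treated as sizeof(T) bytes.
template <typename T>
struct values_per_cell {
  static constexpr unsigned value =
      std::is_arithmetic<T>::value ? 1u : static_cast<unsigned>(sizeof(T));
};

// C-style arrays keep their element type and hold N values per cell.
template <typename T, std::size_t N>
struct values_per_cell<T[N]> {
  static constexpr unsigned value = static_cast<unsigned>(N);
};

// Variable-length types hold a variable number of values per cell.
template <>
struct values_per_cell<std::string> {
  static constexpr unsigned value = kVarNum;
};

template <typename T>
struct values_per_cell<std::vector<T>> {
  static constexpr unsigned value = kVarNum;
};
```

The sketch shows why float[3] and std::array<float, 3> differ: the former yields 3 values of type float, while the latter yields sizeof(std::array<float, 3>) byte values.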

inline Attribute &set_cell_val_num(unsigned num)¶

Sets the number of attribute values per cell. This is inferred from the type parameter of the Attribute::create<T>() function, but can also be set manually.

Example:

// a1 and a2 are equivalent:
auto a1 = Attribute::create<std::vector<int>>(...);
auto a2 = Attribute::create<int>(...);
a2.set_cell_val_num(TILEDB_VAR_NUM);

Parameters:: num – Cell val number to set.
Returns:: Reference to this Attribute

inline Attribute &set_fill_value(const void *value, uint64_t size)¶

Sets the default fill value for the input attribute. This value will be used for the input attribute whenever querying (1) an empty cell in a dense array, or (2) a non-empty cell (in either dense or sparse array) when values on the input attribute are missing (e.g., if the user writes a subset of the attributes in a write operation).

Applicable to both fixed-sized and var-sized attributes.

Example:

tiledb::Context ctx;

// Fixed-sized attribute
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
int32_t a1_value = 0;
uint64_t a1_size = sizeof(a1_value);
a1.set_fill_value(&a1_value, a1_size);

// Var-sized attribute
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
std::string a2_value("null");
a2.set_fill_value(a2_value.c_str(), a2_value.size());

Note

A call to set_cell_val_num resets the fill value of the attribute to its default. Therefore, make sure you invoke set_fill_value after deciding on the number of values this attribute will hold in each cell.

Note

For fixed-sized attributes, the input size should be equal to the cell size.

Parameters:

value – The fill value to set.
size – The fill value size in bytes.
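
To illustrate the first case described above (reading an empty cell in a dense array), the following is a minimal standard-library model. It is a sketch under stated assumptions, not TileDB code: read_with_fill is a hypothetical function, and cells that were never written are modeled as empty std::optional values.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model for illustration only; not part of the TileDB API.
// `stored` models the cells of a dense array: an empty optional is a cell
// that was never written. Such cells are reported with the fill value.
inline std::vector<std::int32_t> read_with_fill(
    const std::vector<std::optional<std::int32_t>>& stored,
    std::int32_t fill_value) {
  std::vector<std::int32_t> result;
  result.reserve(stored.size());
  for (const auto& cell : stored) {
    result.push_back(cell.value_or(fill_value));
  }
  return result;
}
```

In this model, written cells are returned unchanged and only the unwritten ones take the fill value.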

inline void get_fill_value(const void **value, uint64_t *size) const¶

Gets the default fill value for the input attribute. This value will be used for the input attribute whenever querying (1) an empty cell in a dense array, or (2) a non-empty cell (in either dense or sparse array) when values on the input attribute are missing (e.g., if the user writes a subset of the attributes in a write operation).

Applicable to both fixed-sized and var-sized attributes.

Example:

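tiledb::Context ctx;

// Fixed-sized attribute
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
const void* a1_value;
uint64_t a1_size;
a1.get_fill_value(&a1_value, &a1_size);
auto a1_fill = *static_cast<const int32_t*>(a1_value);

// Var-sized attribute
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
const void* a2_value;
uint64_t a2_size;
a2.get_fill_value(&a2_value, &a2_size);
std::string a2_fill(static_cast<const char*>(a2_value), a2_size);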

Parameters:

value – A pointer to the fill value to get.
size – The size of the fill value to get.

inline Attribute &set_fill_value(const void *value, uint64_t size, uint8_t valid)¶

Sets the default fill value for the input, nullable attribute. This value will be used for the input attribute whenever querying (1) an empty cell in a dense array, or (2) a non-empty cell (in either dense or sparse array) when values on the input attribute are missing (e.g., if the user writes a subset of the attributes in a write operation).

Applicable to both fixed-sized and var-sized attributes.

Example:

tiledb::Context ctx;

// Fixed-sized attribute
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
a1.set_nullable(true);
int32_t a1_value = 0;
uint64_t a1_size = sizeof(a1_value);
uint8_t a1_valid = 0;
a1.set_fill_value(&a1_value, a1_size, a1_valid);

// Var-sized attribute
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
a2.set_nullable(true);
std::string a2_value("null");
uint8_t a2_valid = 0;
a2.set_fill_value(a2_value.c_str(), a2_value.size(), a2_valid);

Note

A call to set_cell_val_num resets the fill value of the attribute to its default. Therefore, make sure you invoke set_fill_value after deciding on the number of values this attribute will hold in each cell.

Note

For fixed-sized attributes, the input size should be equal to the cell size.

Parameters:

value – The fill value to set.
size – The fill value size in bytes.
valid – The validity fill value, zero for a null value and non-zero for a valid attribute.

inline void get_fill_value(const void **value, uint64_t *size, uint8_t *valid) const¶

Gets the default fill value and its validity for the input, nullable attribute. This value will be used for the input attribute whenever querying (1) an empty cell in a dense array, or (2) a non-empty cell (in either dense or sparse array) when values on the input attribute are missing (e.g., if the user writes a subset of the attributes in a write operation).

Applicable to both fixed-sized and var-sized attributes.

Example:

tiledb::Context ctx;

// Fixed-sized attribute
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
a1.set_nullable(true);
const void* a1_value;
uint64_t a1_size;
uint8_t a1_valid;
a1.get_fill_value(&a1_value, &a1_size, &a1_valid);

// Var-sized attribute
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
a2.set_nullable(true);
const void* a2_value;
uint64_t a2_size;
uint8_t a2_valid;
a2.get_fill_value(&a2_value, &a2_size, &a2_valid);

Parameters:

value – A pointer to the fill value to get.
size – The size of the fill value to get.
valid – The fill value validity to get.

inline bool variable_sized() const¶: Check if attribute is variable sized.

inline FilterList filter_list() const¶

Returns a copy of the FilterList of the attribute. To change the filter list, use set_filter_list().

Returns:: Copy of the attribute FilterList.

inline Attribute &set_filter_list(const FilterList &filter_list)¶

Sets the attribute filter list, which is an ordered list of filters that will be used to process and/or transform the attribute data (such as compression).

Parameters:: filter_list – Filter list to set
Returns:: Reference to this Attribute

inline Attribute &set_nullable(bool nullable)¶

Sets the nullability of an attribute.

Example:

auto a1 = Attribute::create<int>(...);
a1.set_nullable(true);

Parameters:: nullable – Whether the attribute is nullable.
Returns:: Reference to this Attribute

inline bool nullable() const¶

Gets the nullability of an attribute.

Example:

auto a1 = Attribute::create<int>(...);
auto nullable = a1.nullable();

Returns:: Whether the attribute is nullable.

inline std::shared_ptr<tiledb_attribute_t> ptr() const¶: Returns the C TileDB attribute object pointer.

Public Static Functions

template<typename T> static inline Attribute create(const Context &ctx, const std::string &name)¶

Factory function for creating a new attribute with datatype T.

Example:

tiledb::Context ctx;
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
auto a3 = tiledb::Attribute::create<std::array<float, 3>>(ctx, "a3");
auto a4 = tiledb::Attribute::create<std::vector<double>>(ctx, "a4");
auto a5 = tiledb::Attribute::create<char[8]>(ctx, "a5");

Template Parameters:

T – Datatype of the attribute. Can be an arithmetic type, a C-style array, std::string, std::vector, or any trivially copyable class (as defined by std::is_trivially_copyable).

Parameters:

ctx – The TileDB context.
name – The attribute name.

Returns:

A new Attribute object.

static inline Attribute create(const Context &ctx, const std::string &name, tiledb_datatype_t type)¶: Factory function taking the type as a tiledb_datatype_t variable.

template<typename T> static inline Attribute create(const Context &ctx, const std::string &name, const FilterList &filter_list)¶

Factory function for creating a new attribute with datatype T and a FilterList.

Example:

tiledb::Context ctx;
tiledb::FilterList filter_list(ctx);
filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE})
    .add_filter({ctx, TILEDB_FILTER_BZIP2});
auto a1 = tiledb::Attribute::create<int>(ctx, "a1", filter_list);

Template Parameters:

T – Datatype of the attribute. Can be an arithmetic type, a C-style array, std::string, std::vector, or any trivially copyable class (as defined by std::is_trivially_copyable).

Parameters:

ctx – The TileDB context.
name – The attribute name.
filter_list – FilterList to use for attribute

Returns:

A new Attribute object.

Array Schema¶

class ArraySchema : public tiledb::Schema¶

Schema describing an array.

The schema is an independent description of an array. A schema can be used to create multiple arrays, and stores information about its domain, cell types, and compression details. An array schema is composed of:

A Domain
A set of Attributes
Memory layout definitions: tile and cell
Compression details for Array level factors like offsets and coordinates

Example:

tiledb::Context ctx;
// The array type is either TILEDB_SPARSE or TILEDB_DENSE.
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);

// Create a Domain
tiledb::Domain domain(...);

// Create Attributes
auto a1 = tiledb::Attribute::create(...);

schema.set_domain(domain);
schema.add_attribute(a1);

// Specify tile memory layout
schema.set_tile_order(TILEDB_ROW_MAJOR);
// Specify cell memory layout within each tile
schema.set_cell_order(TILEDB_ROW_MAJOR);
schema.set_capacity(10); // For sparse, set capacity of each tile

// Create the array on persistent storage with the schema.
tiledb::Array::create("my_array", schema);

Public Functions

inline explicit ArraySchema(const Context &ctx, tiledb_array_type_t type)¶

Creates a new array schema.

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);

Parameters:

ctx – TileDB context
type – Array type, sparse or dense.

inline ArraySchema(const Context &ctx, const std::string &uri)¶

Loads the schema of an existing array.

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, "s3://bucket-name/array-name");

Parameters:

ctx – TileDB context
uri – URI of array

inline ArraySchema(const Context &ctx, tiledb_array_schema_t *schema)¶

Loads the schema of an existing array with the input C array schema object.

Parameters:

ctx – TileDB context
schema – C API array schema object

inline tiledb_array_type_t array_type() const¶: Returns the array type.

inline uint64_t capacity() const¶: Returns the tile capacity.

inline ArraySchema &set_capacity(uint64_t capacity)¶

Sets the tile capacity.

Parameters:: capacity – The capacity of a sparse data tile. Note that sparse data tiles exist in sparse fragments, which can be created in sparse arrays only. For more details, see tutorials/tiling-sparse.html.
Returns:: Reference to this ArraySchema instance.

inline bool allows_dups() const¶: Returns true if the array allows coordinate duplicates.

inline ArraySchema &set_allows_dups(bool allows_dups)¶: Sets whether the array allows coordinate duplicates. Throws an exception if set to true on a dense array.

inline uint32_t version() const¶: Returns the version of the array schema object.

inline tiledb_layout_t tile_order() const¶: Returns the tile order.

inline ArraySchema &set_tile_order(tiledb_layout_t layout)¶

Sets the tile order.

Parameters:: layout – Tile order to set.
Returns:: Reference to this ArraySchema instance.

inline ArraySchema &set_order(const std::array<tiledb_layout_t, 2> &p)¶

Sets both the tile and cell orders.

Parameters:: p – Pair of {tile order, cell order}
Returns:: Reference to this ArraySchema instance.

inline tiledb_layout_t cell_order() const¶: Returns the cell order.

inline ArraySchema &set_cell_order(tiledb_layout_t layout)¶

Sets the cell order.

Parameters:: layout – Cell order to set.
Returns:: Reference to this ArraySchema instance.
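
As a rough illustration of what a row-major versus column-major order means for cell placement, the following sketch computes the position of a 2D cell in a flattened tile. It uses only the standard library; Layout and cell_position are hypothetical names for this example and do not correspond to TileDB types or functions.

```cpp
#include <cstdint>

// Hypothetical stand-in for row-major and column-major layouts;
// not the tiledb_layout_t enum.
enum class Layout { RowMajor, ColMajor };

// Position of cell (row, col) in a flattened tile of rows x cols cells.
inline std::uint64_t cell_position(std::uint64_t row, std::uint64_t col,
                                   std::uint64_t rows, std::uint64_t cols,
                                   Layout layout) {
  // Row-major stores each row contiguously; column-major stores each column
  // contiguously.
  return layout == Layout::RowMajor ? row * cols + col : col * rows + row;
}
```

The same idea applies one level up for the tile order, where the elements being ordered are whole tiles rather than cells.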

inline FilterList coords_filter_list() const¶

Returns a copy of the FilterList of the coordinates. To change the coordinate compressor, use set_coords_filter_list().

Returns:: Copy of the coordinates FilterList.

inline ArraySchema &set_coords_filter_list(const FilterList &filter_list)¶

Sets the FilterList for the coordinates, which is an ordered list of filters that will be used to process and/or transform the coordinate data (such as compression).

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
tiledb::FilterList filter_list(ctx);
filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE})
    .add_filter({ctx, TILEDB_FILTER_BZIP2});
schema.set_coords_filter_list(filter_list);

Parameters:: filter_list – FilterList to use
Returns:: Reference to this ArraySchema instance.

inline FilterList offsets_filter_list() const¶

Returns a copy of the FilterList of the offsets. To change the offsets compressor, use set_offsets_filter_list().

Returns:: Copy of the offsets FilterList.

inline FilterList validity_filter_list() const¶

Returns a copy of the FilterList of the validity arrays. To change the validity compressor, use set_validity_filter_list().

Returns:: Copy of the validity FilterList.

inline ArraySchema &set_offsets_filter_list(const FilterList &filter_list)¶

Sets the FilterList for the offsets, which is an ordered list of filters that will be used to process and/or transform the offsets data (such as compression).

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
tiledb::FilterList filter_list(ctx);
filter_list.add_filter({ctx, TILEDB_FILTER_POSITIVE_DELTA})
    .add_filter({ctx, TILEDB_FILTER_LZ4});
schema.set_offsets_filter_list(filter_list);

Parameters:: filter_list – FilterList to use
Returns:: Reference to this ArraySchema instance.

inline ArraySchema &set_validity_filter_list(const FilterList &filter_list)¶

Sets the FilterList for the validity arrays, which is an ordered list of filters that will be used to process and/or transform the validity data (such as compression).

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
tiledb::FilterList filter_list(ctx);
filter_list.add_filter({ctx, TILEDB_FILTER_POSITIVE_DELTA})
    .add_filter({ctx, TILEDB_FILTER_LZ4});
schema.set_validity_filter_list(filter_list);

Parameters:: filter_list – FilterList to use
Returns:: Reference to this ArraySchema instance.

inline Domain domain() const¶

Returns a copy of the schema’s array Domain. To change the domain, use set_domain().

Returns:: Copy of the array Domain

inline ArraySchema &set_domain(const Domain &domain)¶

Sets the array domain.

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
// Create a Domain
tiledb::Domain domain(...);
schema.set_domain(domain);

Parameters:: domain – Domain to use
Returns:: Reference to this ArraySchema instance.

inline std::pair<uint64_t, uint64_t> timestamp_range()¶

Get timestamp range of schema.

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
std::pair<uint64_t, uint64_t> timestamp_range = schema.timestamp_range();

Returns:: Timestamp range of this ArraySchema instance.

inline virtual ArraySchema &add_attribute(const Attribute &attr) override¶

Adds an Attribute to the array.

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
schema.add_attribute(tiledb::Attribute::create<int32_t>(ctx, "attr_name"));

Parameters:: attr – The Attribute to add
Returns:: Reference to this ArraySchema instance.

inline std::shared_ptr<tiledb_array_schema_t> ptr() const¶: Returns a shared pointer to the C TileDB array schema object.

inline virtual void check() const override¶

Validates the schema.

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
// Add domain, attributes, etc...

try {
  schema.check();
} catch (const tiledb::TileDBError& e) {
  std::cout << e.what() << "\n";
  exit(1);
}

Throws:: TileDBError – if the schema is incorrect or invalid.

inline virtual std::unordered_map<std::string, Attribute> attributes() const override¶

Gets all attributes in the array.

Returns:: Map of attribute name to copy of Attribute instance.

inline virtual Attribute attribute(const std::string &name) const override¶

Get a copy of an Attribute in the schema by name.

Parameters:: name – Name of attribute
Returns:: Attribute

inline virtual unsigned attribute_num() const override¶: Returns the number of attributes in the schema.

inline virtual Attribute attribute(unsigned int i) const override¶

Get a copy of an Attribute in the schema by index. Attributes are ordered the same way they were defined when constructing the array schema.

Parameters:: i – Index of attribute
Returns:: Attribute

inline bool has_attribute(const std::string &name) const¶

Checks if the schema has an attribute of the given name.

Parameters:: name – Name of attribute to check for
Returns:: True if the schema has an attribute of the given name.

Public Static Functions

static inline std::string to_str(tiledb_array_type_t type)¶: Returns the input array type in string format.

static inline std::string to_str(tiledb_layout_t layout)¶: Returns the input layout in string format.

Array¶

class Array¶

Class representing a TileDB array object.

An Array object represents array data in TileDB at some persisted location, e.g. on disk, in an S3 bucket, etc. Once an array has been opened for reading or writing, interact with the data through Query objects.

Example:

tiledb::Context ctx;

// Create an ArraySchema, add attributes, domain, etc.
tiledb::ArraySchema schema(...);

// Create empty array named "my_array" on persistent storage.
tiledb::Array::create("my_array", schema);

Public Functions

inline Array(const Context &ctx, const std::string &array_uri, tiledb_query_type_t query_type, const TemporalPolicy temporal_policy = {}, const EncryptionAlgorithm encryption_algorithm = {})¶

Constructor. This opens the array for the given query type. The destructor calls the close() method.

Example:

// Open the array for reading
tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);

Parameters:

ctx – TileDB context.
array_uri – The array URI.
query_type – Query type to open the array for.
temporal_policy – The TemporalPolicy with which to open the array.
encryption_algorithm – The EncryptionAlgorithm to set on the array.

inline Array(const Context &ctx, tiledb_array_t *carray, tiledb_config_t *config)¶

Constructor. This sets the array config.

Example:

tiledb::Context ctx;
tiledb_config_t* config;

Parameters:

ctx – TileDB context.
carray – The array.
config – The array’s config.

inline Array(const Context &ctx, tiledb_array_t *carray, bool own = true)¶

Constructor. Creates a TileDB Array instance wrapping the given pointer.

Parameters:

ctx – tiledb::Context
own=true – If false, disables underlying cleanup upon destruction.

Throws:

TileDBError – if construction fails

inline ~Array()¶: Destructor; calls close().

inline bool is_open() const¶: Checks if the array is open.

inline std::string uri() const¶: Returns the array URI.

inline const Context &context() const¶: Get the Context for the array.

inline ArraySchema schema() const¶: Get the ArraySchema for the array.

inline std::shared_ptr<tiledb_array_t> ptr() const¶: Returns a shared pointer to the C TileDB array object.

inline void open(tiledb_query_type_t query_type)¶

Opens the array. The array is opened using a query type as input.

Queries created for this Array object inherit the query type. In other words, an Array object is opened to receive only one type of query. It can always be closed and reopened with another query type. Many different Array objects may also be created and opened with different query types. For instance, one may create and open an array object array_read for reads and another one array_write for writes, and interleave creation and submission of queries for both array objects.

Example:

// Open the array for writing
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_WRITE);
// Close and open again for reading.
array.close();
array.open(TILEDB_READ);

Parameters:: query_type – The type of queries the array object will be receiving.
Throws:: TileDBError – if the array is already open or other error occurred.

inline void open(tiledb_query_type_t query_type, uint64_t timestamp)¶

Opens the array. The array is opened using a query type as input.

See Array::open

inline void open(tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const std::string &encryption_key)¶

Opens the array. The array is opened using a query type as input.

See Array::open

inline void open(tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const std::string &encryption_key, uint64_t timestamp)¶

Opens the array. The array is opened using a query type as input.

See Array::open

inline void reopen()¶

Reopens the array (the array must already be open). This is useful when the array was updated after it was opened and the Array object was created. To sync up with the updates, the user must either close the array and open it again with open(), or just use reopen() without closing. Using reopen() is generally faster than the former alternative.

Note: reopening encrypted arrays does not require the encryption key.

Example:

// Open the array for reading
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
array.reopen();

Throws:: TileDBError – if the array was not already open or other error occurred.

inline void set_open_timestamp_start(uint64_t timestamp_start) const¶: Sets the inclusive starting timestamp when opening this array.

inline void set_open_timestamp_end(uint64_t timestamp_end) const¶: Sets the inclusive ending timestamp when opening this array.

inline uint64_t open_timestamp_start() const¶: Retrieves the inclusive starting timestamp.

inline uint64_t open_timestamp_end() const¶: Retrieves the inclusive ending timestamp.

inline void set_config(const Config &config) const¶

Sets the array config.

Pre:: The array must be closed.

inline Config config() const¶: Retrieves the array config.

inline void close()¶

Closes the array. The destructor calls this automatically if the underlying pointer is owned.

Example:

tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
array.close();

template<typename T> inline std::vector<std::pair<std::string, std::pair<T, T>>> non_empty_domain()¶

Retrieves the non-empty domain from the array. This is the union of the non-empty domains of the array fragments.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
// Specify the domain type (example uint32_t)
auto non_empty = array.non_empty_domain<uint32_t>();
std::cout << "Dimension named " << non_empty[0].first << " has cells in ["
          << non_empty[0].second.first << ", " << non_empty[0].second.second
          << "]" << std::endl;

Template Parameters:: T – Domain datatype
Returns:: Vector of dim names with a {lower, upper} pair. Inclusive. Empty vector if the array has no data.
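
To make "the union of the non-empty domains of the array fragments" concrete, here is a minimal sketch that computes the tightest inclusive range covering a set of per-fragment ranges on one dimension. It uses only the standard library. `FragmentRange` and `union_of_ranges` are hypothetical names for illustration, not part of the TileDB API, and reading the union as a bounding range is an assumption about the semantics.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for one fragment's inclusive non-empty range
// on a single dimension: {lower, upper}.
using FragmentRange = std::pair<uint32_t, uint32_t>;

// Returns the smallest inclusive range covering every fragment range,
// or std::nullopt when there are no fragments (the array has no data).
inline std::optional<FragmentRange> union_of_ranges(
    const std::vector<FragmentRange>& fragments) {
  if (fragments.empty())
    return std::nullopt;
  FragmentRange result = fragments.front();
  for (const auto& f : fragments) {
    result.first = std::min(result.first, f.first);
    result.second = std::max(result.second, f.second);
  }
  return result;
}
```

Note that the result can include cells that no fragment actually wrote (for example, the gap between two disjoint fragment ranges).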

template<typename T> inline std::pair<T, T> non_empty_domain(unsigned idx)¶

Retrieves the non-empty domain from the array on the given dimension. This is the union of the non-empty domains of the array fragments.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
// Specify the dimension type (example uint32_t)
auto non_empty = array.non_empty_domain<uint32_t>(0);

Template Parameters:: T – Dimension datatype
Parameters:: idx – The dimension index.
Returns:: The {lower, upper} pair of the non-empty domain (inclusive) on the input dimension.

template<typename T> inline std::pair<T, T> non_empty_domain(const std::string &name)¶

Retrieves the non-empty domain from the array on the given dimension. This is the union of the non-empty domains of the array fragments.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
// Specify the dimension type (example uint32_t)
auto non_empty = array.non_empty_domain<uint32_t>("d1");

Template Parameters:: T – Dimension datatype
Parameters:: name – The dimension name.
Returns:: The {lower, upper} pair of the non-empty domain (inclusive) on the input dimension.

inline std::pair<std::string, std::string> non_empty_domain_var(unsigned idx)¶

Retrieves the non-empty domain from the array on the given dimension. This is the union of the non-empty domains of the array fragments. Applicable only to var-sized dimensions.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
// No type parameter is needed: var-sized bounds are returned as strings
auto non_empty = array.non_empty_domain_var(0);

Parameters:: idx – The dimension index.
Returns:: The {lower, upper} pair of the non-empty domain (inclusive) on the input dimension.

inline std::pair<std::string, std::string> non_empty_domain_var(const std::string &name)¶

Retrieves the non-empty domain from the array on the given dimension. This is the union of the non-empty domains of the array fragments. Applicable only to var-sized dimensions.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
// No type parameter is needed: var-sized bounds are returned as strings
auto non_empty = array.non_empty_domain_var("d1");

Parameters:: name – The dimension name.
Returns:: The {lower, upper} pair of the non-empty domain (inclusive) on the input dimension.

inline tiledb_query_type_t query_type() const¶: Returns the query type the array was opened with.

inline void put_metadata(const std::string &key, tiledb_datatype_t value_type, uint32_t value_num, const void *value)¶

Puts a metadata key-value item into an open array. The array must be opened in WRITE mode, otherwise the function will error out.

Note

The writes will take effect only upon closing the array.

Parameters:

key – The key of the metadata item to be added. UTF-8 encodings are acceptable.
value_type – The datatype of the value.
value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata.
value – The metadata value in binary form.

inline void delete_metadata(const std::string &key)¶

Deletes a metadata key-value item from an open array. The array must be opened in WRITE mode, otherwise the function will error out.

Note

The writes will take effect only upon closing the array.

Note

If the key does not exist, this has no effect (i.e., the function will not error out).

Parameters:: key – The key of the metadata item to be deleted.

inline void get_metadata(const std::string &key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)¶

Gets a metadata key-value item from an open array. The array must be opened in READ mode, otherwise the function will error out.

Note

If the key does not exist, then value will be NULL.

Parameters:

key – The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.
value_type – The datatype of the value.
value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.
value – The metadata value in binary form.
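
As an illustration of how `value_num` and the untyped `value` pointer fit together, the sketch below copies the items into a typed vector. It uses only the standard library. `copy_metadata_items` is a hypothetical helper, not a TileDB function, and it assumes the caller has already checked that `value_type` matches `T` and that `T` is trivially copyable.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Copies `value_num` items of type T out of the untyped `value` pointer
// produced by a metadata lookup. A NULL `value` (missing key or empty
// value) yields an empty vector.
template <typename T>
std::vector<T> copy_metadata_items(const void* value, uint32_t value_num) {
  std::vector<T> items;
  if (value == nullptr || value_num == 0)
    return items;
  items.resize(value_num);
  std::memcpy(items.data(), value, sizeof(T) * value_num);
  return items;
}
```

Copying is a design choice here: it keeps the data usable independently of the lifetime of the memory that `value` points to.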

inline bool has_metadata(const std::string &key, tiledb_datatype_t *value_type)¶

Checks if key exists in metadata from an open array. The array must be opened in READ mode, otherwise the function will error out.

Note

If the key does not exist, then value_type will not be modified.

Parameters:

key – The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.
value_type – The datatype of the value associated with the key (if any).

Returns:

true if the key exists, else false.

inline uint64_t metadata_num() const¶: Returns the number of metadata items in an open array. The array must be opened in READ mode, otherwise the function will error out.

inline void get_metadata_from_index(uint64_t index, std::string *key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)¶

Gets a metadata item from an open array using an index. The array must be opened in READ mode, otherwise the function will error out.

Parameters:

index – The index used to get the metadata.
key – The metadata key.
value_type – The datatype of the value.
value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.
value – The metadata value in binary form.

Public Static Functions

static inline void delete_array(const Context &ctx, const std::string &uri)¶

Deletes all data written to the array with the input uri.

Parameters:

ctx – TileDB context
uri – The Array’s URI

Post:

This is destructive; the array cannot be reopened after deletion.

static inline void delete_fragments(const Context &ctx, const std::string &uri, uint64_t timestamp_start, uint64_t timestamp_end)¶

Deletes the fragments written between the input timestamps of an array with the input uri.

Parameters:

ctx – TileDB context
uri – The URI of the fragments’ parent Array.
timestamp_start – The epoch start timestamp in milliseconds.
timestamp_end – The epoch end timestamp in milliseconds. Use UINT64_MAX for the current timestamp.
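
Since both timestamps are expressed as epoch milliseconds, a small conversion helper can be useful. The following is a minimal sketch built on `std::chrono`. `to_epoch_ms`, `TimestampRange`, and `range_ending_at` are hypothetical names, not part of TileDB, and the code assumes `std::chrono::system_clock` counts from the Unix epoch.

```cpp
#include <chrono>
#include <cstdint>

// Converts a system_clock time point to milliseconds since the Unix epoch.
// Assumption: system_clock counts from the Unix epoch (guaranteed since
// C++20 and the case on mainstream platforms before that).
inline uint64_t to_epoch_ms(std::chrono::system_clock::time_point tp) {
  using std::chrono::duration_cast;
  using std::chrono::milliseconds;
  return static_cast<uint64_t>(
      duration_cast<milliseconds>(tp.time_since_epoch()).count());
}

// Hypothetical helper type: a {start, end} range in epoch milliseconds.
struct TimestampRange {
  uint64_t start;
  uint64_t end;
};

// Builds the range covering the `window` that ends at `end`.
// The start is clamped at 0 if the window reaches before the epoch.
inline TimestampRange range_ending_at(
    std::chrono::system_clock::time_point end,
    std::chrono::milliseconds window) {
  const uint64_t end_ms = to_epoch_ms(end);
  const uint64_t window_ms = static_cast<uint64_t>(window.count());
  const uint64_t start_ms = end_ms > window_ms ? end_ms - window_ms : 0;
  return {start_ms, end_ms};
}
```

The resulting `start` and `end` values could then be passed as `timestamp_start` and `timestamp_end`.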

static inline void delete_fragments_list(const Context &ctx, const std::string &uri, const char *fragment_uris[], const size_t num_fragments)¶: Deletes the fragments with the input uris on an array with the input uri.

static inline void consolidate(const Context &ctx, const std::string &uri, Config *const config = nullptr)¶

Consolidates the fragments of an array into a single fragment.

You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).

Example:

tiledb::Array::consolidate(ctx, "s3://bucket-name/array-name");

Parameters:

ctx – TileDB context
uri – The URI of the TileDB array to be consolidated.
config – Configuration parameters for the consolidation.

static inline void consolidate(const Context &ctx, const std::string &array_uri, const char *fragment_uris[], const size_t num_fragments, Config *const config = nullptr)¶

Consolidates the fragments with the input uris into a single fragment.

You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).

Example:

const char* fragment_uris[2] = {
"__1712657401931_1712657401931_285cf8a0eff4df875a04cfbea96d5c00_21",
"__1712657401948_1712657401948_285cf8a0efdsafas6a5a04cfbesajads_21"};

tiledb::Array::consolidate(
    ctx,
    "s3://bucket-name/array-name",
     fragment_uris,
     2,
     config);

Parameters:

ctx – TileDB context
array_uri – The URI of the TileDB array to be consolidated.
fragment_uris – Fragment names of the fragments to consolidate. The names can be recovered using tiledb_fragment_info_get_fragment_name_v2.
num_fragments – The number of fragments to consolidate.
config – Configuration parameters for the consolidation.

static inline void vacuum(const Context &ctx, const std::string &uri, Config *const config = nullptr)¶

Cleans up the array, for example its consolidated fragments and array metadata. Note that this will coarsen the granularity of time traveling (see docs for more information).

Example:

tiledb::Array::vacuum(ctx, "s3://bucket-name/array-name");

Parameters:

ctx – TileDB context
uri – The URI of the TileDB array to be vacuumed.
config – Configuration parameters for the vacuuming.

static inline void create(const Context &ctx, const std::string &uri, const ArraySchema &schema)¶

Creates a new TileDB array given an input schema.

Example:

tiledb::Array::create(ctx, "s3://bucket-name/array-name", schema);

Parameters:

ctx – The TileDB context.
uri – URI where array will be created.
schema – The array schema.

static inline void create(const std::string &uri, const ArraySchema &schema)¶

Creates a new TileDB array given an input schema.

To create the array, this function uses the context that was used to instantiate the schema. It is recommended to pass the context explicitly by using the overload that takes a context.

Example:

tiledb::Array::create("s3://bucket-name/array-name", schema);

Parameters:

uri – URI where array will be created.
schema – The array schema.

static inline ArraySchema load_schema(const Context &ctx, const std::string &uri)¶

Loads the array schema from an array.

Example:

auto schema = tiledb::Array::load_schema(ctx,
"s3://bucket-name/array-name");

Parameters:

ctx – The TileDB context.
uri – The array URI.

Returns:

The loaded ArraySchema object.

static inline ArraySchema load_schema_with_config(const Context &ctx, const Config &config, const std::string &uri)¶

Loads the array schema from an array. Options to load additional features are read from the optionally-provided config. See tiledb_array_schema_load_with_config.

Example:

tiledb::Config config;
config["rest.load_enumerations_on_array_open"] = "true";
auto schema = tiledb::Array::load_schema_with_config(ctx, config,
"s3://bucket-name/array-name");

Parameters:

ctx – The TileDB context.
config – The request for additional features.
uri – The array URI.

Returns:

The loaded ArraySchema object.

static inline tiledb_encryption_type_t encryption_type(const Context &ctx, const std::string &array_uri)¶

Gets the encryption type the given array was created with.

Example:

tiledb_encryption_type_t enc_type =
    tiledb::Array::encryption_type(ctx, "s3://bucket-name/array-name");

Parameters:

ctx – TileDB context
array_uri – The URI of the TileDB array whose encryption type is retrieved.

Returns:

The encryption type of the array.

static inline void consolidate_metadata(const Context &ctx, const std::string &uri, Config *const config = nullptr)¶

Consolidates the metadata of an array.

You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).

Example:

tiledb::Array::consolidate_metadata(ctx, "s3://bucket-name/array-name");

Parameters:

ctx – TileDB context
uri – The URI of the TileDB array whose metadata will be consolidated.
config – Configuration parameters for the consolidation.

static inline void upgrade_version(const Context &ctx, const std::string &array_uri, Config *const config = nullptr)¶

Upgrades an array to the latest format version.

Example:

tiledb::Array::upgrade_version(ctx, "array_name");

Parameters:

ctx – TileDB context
array_uri – The URI of the TileDB array to be upgraded.
config – Configuration parameters for the upgrade.

Query¶

class Query¶

Construct and execute read/write queries on a tiledb::Array.

See examples for more usage details.

Example:

// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_dense_array", TILEDB_WRITE);
tiledb::Query query(ctx, array);
query.set_layout(TILEDB_GLOBAL_ORDER);
std::vector<int> a1_data = {1, 2, 3};
query.set_data_buffer("a1", a1_data);
query.submit();
query.finalize();
array.close();

Public Types

enum class Status¶

The query or query attribute status.

Values:

enumerator FAILED¶: Query failed.

enumerator COMPLETE¶: Query completed (all data has been read)

enumerator INPROGRESS¶: Query is in progress

enumerator INCOMPLETE¶: Query completed (but not all data has been read)

enumerator UNINITIALIZED¶: Query not initialized.

enumerator INITIALIZED¶: Query initialized (strategy created).

Public Functions

inline Query(const Context &ctx, const Array &array, tiledb_query_type_t type)¶

Creates a TileDB query object.

The query type (read or write) must be the same as the type used to open the array object.

The storage manager also acquires a shared lock on the array. This means multiple read and write queries to the same array can be made concurrently (in TileDB, only consolidation requires an exclusive lock for a short period of time).

Example:

// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_WRITE);
tiledb::Query query(ctx, array, TILEDB_WRITE);

Parameters:

ctx – TileDB context
array – Open Array object
type – The TileDB query type

inline Query(const Context &ctx, const Array &array)¶

Creates a TileDB query object.

The query type (read or write) is inferred from the array object, which was opened with a specific query type.

The storage manager also acquires a shared lock on the array. This means multiple read and write queries to the same array can be made concurrently (in TileDB, only consolidation requires an exclusive lock for a short period of time).

Example:

// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_WRITE);
tiledb::Query query(ctx, array);
// Equivalent to:
// tiledb::Query query(ctx, array, TILEDB_WRITE);

Parameters:

ctx – TileDB context
array – Open Array object

inline Query(const Array &array)¶

Creates a TileDB query object.

The context and query type (read or write) are inferred from the array object, which was opened with a specific query type.

The storage manager also acquires a shared lock on the array. This means multiple read and write queries to the same array can be made concurrently (in TileDB, only consolidation requires an exclusive lock for a short period of time).

Example:

// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_WRITE);
tiledb::Query query(array);
// Equivalent to:
// tiledb::Query query(ctx, array, TILEDB_WRITE);

Parameters:: array – Open Array object

inline std::shared_ptr<tiledb_query_t> ptr() const¶: Returns a shared pointer to the C TileDB query object.

inline tiledb_query_type_t query_type() const¶: Returns the query type (read or write).

inline Query &set_layout(tiledb_layout_t layout)¶

Sets the layout of the cells to be written or read.

Parameters:

layout – For a write query, this specifies the order of the cells provided by the user in the buffers. For a read query, this specifies the order of the cells that will be retrieved as results and stored in the user buffers. The layout can be one of the following:

TILEDB_COL_MAJOR: This means column-major order with respect to the subarray.
TILEDB_ROW_MAJOR: This means row-major order with respect to the subarray.
TILEDB_GLOBAL_ORDER: This means that cells are stored or retrieved in the array global cell order.
TILEDB_UNORDERED: This is applicable only to writes for sparse arrays, or for sparse writes to dense arrays. It specifies that the cells are unordered and, hence, TileDB must sort the cells in the global cell order prior to writing.

Returns:

Reference to this Query
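
To illustrate the difference between the row-major and column-major layouts, the sketch below computes where a cell of a 2D subarray lands in the user buffer under each order. The function names are hypothetical, and `row` and `col` are assumed to be zero-based positions relative to the start of the subarray.

```cpp
#include <cstdint>

// Buffer position of cell (row, col) in row-major order:
// all cells of a row are contiguous, rows follow one another.
inline uint64_t row_major_pos(uint64_t row, uint64_t col, uint64_t num_cols) {
  return row * num_cols + col;
}

// Buffer position of cell (row, col) in column-major order:
// all cells of a column are contiguous, columns follow one another.
inline uint64_t col_major_pos(uint64_t row, uint64_t col, uint64_t num_rows) {
  return col * num_rows + row;
}
```

For a subarray with 3 rows and 4 columns, cell (1, 2) is at position 6 in row-major order and at position 7 in column-major order.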

inline tiledb_layout_t query_layout() const¶: Returns the layout of the query.

inline Query &set_condition(const QueryCondition &condition)¶

Sets the read query condition.

Note that only one query condition may be set on a query at a time. This overwrites any previously set query condition. To apply more than one condition at a time, use the QueryCondition::combine API to construct a single object.

Parameters:: condition – The query condition object.
Returns:: Reference to this Query

inline const Array &array()¶: Returns the array of the query.

inline Status query_status() const¶: Returns the query status.

inline bool has_results() const¶: Returns true if the query has results. Applicable only to read queries (it returns false for write queries).

inline Status submit()¶

Submits the query. Call will block until query is complete.

Note

finalize() must be invoked after writing in global layout has finished (via repeated invocations of submit()), in order to flush any internal state. For reads, if the returned status is TILEDB_INCOMPLETE, TileDB could not fit the entire result in the user’s buffers. In this case, the user should consume the read results (if any), optionally reset the buffers with set_data_buffer(), and then resubmit the query until the status becomes TILEDB_COMPLETED. If all buffer sizes are 0 after this function returns, no useful data was read into the buffers, which implies that larger buffers are needed for the query to proceed. In this case, the user must reallocate the buffers with a larger size, reset them with set_data_buffer(), and resubmit the query.

Returns:: Query status
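
The resubmission loop described in the note can be sketched as follows. To stay self-contained, the example uses a small simulated query rather than the TileDB API: `FakeStatus`, `FakeReadQuery`, and `read_all` are hypothetical stand-ins, and doubling the buffer when no progress is made is one possible growth policy, not a TileDB requirement.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT TileDB types.
enum class FakeStatus { COMPLETE, INCOMPLETE };

// Simulates a read query that copies as many pending results as fit
// into the caller-owned buffer on each submit().
struct FakeReadQuery {
  std::vector<int> pending;    // all results the query will produce
  std::size_t next = 0;        // index of the next result to deliver
  int* buffer = nullptr;       // caller-owned result buffer
  std::size_t capacity = 0;    // buffer capacity in elements
  std::size_t last_count = 0;  // elements written by the last submit()

  void set_data_buffer(int* buf, std::size_t nelements) {
    buffer = buf;
    capacity = nelements;
  }

  FakeStatus submit() {
    last_count = std::min(capacity, pending.size() - next);
    std::copy_n(pending.begin() + static_cast<std::ptrdiff_t>(next),
                last_count, buffer);
    next += last_count;
    return next == pending.size() ? FakeStatus::COMPLETE
                                  : FakeStatus::INCOMPLETE;
  }
};

// Drains the query: consume the results of each submit, then resubmit
// until the status is COMPLETE. If an incomplete submit returned nothing,
// the buffer is too small, so it is enlarged and set again.
inline std::vector<int> read_all(FakeReadQuery& query,
                                 std::size_t initial_capacity) {
  std::vector<int> buffer(initial_capacity);
  std::vector<int> results;
  query.set_data_buffer(buffer.data(), buffer.size());
  FakeStatus status;
  do {
    status = query.submit();
    if (status == FakeStatus::INCOMPLETE && query.last_count == 0) {
      buffer.resize(std::max<std::size_t>(1, buffer.size() * 2));
      query.set_data_buffer(buffer.data(), buffer.size());
      continue;
    }
    results.insert(results.end(), buffer.begin(),
                   buffer.begin() + static_cast<std::ptrdiff_t>(query.last_count));
  } while (status != FakeStatus::COMPLETE);
  return results;
}
```

With a real query, the same structure would apply, with the number of elements read per submit taken from result_buffer_elements().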

inline void finalize()¶: Flushes all internal state of a query object and finalizes the query. This is applicable only to global layout writes. It has no effect for any other query type.

inline void submit_and_finalize()¶: Submits and finalizes the last tile of a global order write. For remote TileDB arrays, this is optimized to use only one request to perform both the submit and finalize.

inline std::unordered_map<std::string, std::pair<uint64_t, uint64_t>> result_buffer_elements() const¶

Returns the number of elements in the result buffers from a read query. This is a map from the attribute name to a pair of values.

The first is number of elements (offsets) for var size attributes, and the second is number of elements in the data buffer. For fixed sized attributes (and coordinates), the first is always 0.

For variable sized attributes: the first value is the number of cells read, i.e. the number of offsets read for the attribute. The second value is the total number of elements in the data buffer. For example, a read query on a variable-length float attribute that reads three cells would return 3 for the first number in the pair. If the total amount of floats read across the three cells was 10, then the second number in the pair would be 10.

For fixed-length attributes, the first value is always 0. The second value is the total number of elements in the data buffer. For example, a read query on a single float attribute that reads three cells would return 3 for the second value. A read query on a float attribute with cell_val_num 2 that reads three cells would return 3 * 2 = 6 for the second value.

If the query has not been submitted, an empty map is returned.

Example:

// Submit a read query.
query.submit();
auto result_el = query.result_buffer_elements();

// For fixed-sized attributes, `.second` is the number of elements
// that were read for the attribute across all cells. Note: number of
// elements and not number of bytes.
auto num_a1_elements = result_el["a1"].second;

// Coords are also fixed-sized.
auto num_coords = result_el["__coords"].second;

// Variable-sized attributes (e.g. std::string) need two buffers,
// one for offsets and one for cell data ("elements").
auto num_a2_offsets = result_el["a2"].first;
auto num_a2_elements = result_el["a2"].second;
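
As a sketch of how the two counts of a variable-sized attribute are typically used together, the function below rebuilds per-cell strings from an offsets buffer and a data buffer. `split_cells` is a hypothetical helper, not part of TileDB. It assumes the offsets are byte offsets into a `char` data buffer and that no extra trailing offset is present.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Rebuilds per-cell strings from the buffers of a variable-sized string
// attribute. `num_offsets` and `num_elements` play the role of `.first`
// and `.second` in the pair returned for that attribute.
inline std::vector<std::string> split_cells(
    const std::vector<uint64_t>& offsets,
    const std::string& data,
    uint64_t num_offsets,
    uint64_t num_elements) {
  std::vector<std::string> cells;
  cells.reserve(num_offsets);
  for (uint64_t i = 0; i < num_offsets; ++i) {
    const uint64_t begin = offsets[i];
    // The last cell ends where the valid data ends, not where the
    // (possibly larger) buffer ends.
    const uint64_t end = (i + 1 < num_offsets) ? offsets[i + 1] : num_elements;
    cells.emplace_back(data, begin, end - begin);
  }
  return cells;
}
```

Both buffers may be larger than the data actually read, which is why the counts, rather than the buffer sizes, bound the loop.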

inline std::unordered_map<std::string, std::tuple<uint64_t, uint64_t, uint64_t>> result_buffer_elements_nullable() const¶

Returns the number of elements in the result buffers from a read query. This is a map from the attribute name to a tuple of values.

The first is number of elements (offsets) for var size attributes, and the second is number of elements in the data buffer. For fixed sized attributes (and coordinates), the first is always 0. The third element is the size of the validity bytemap buffer.

For variable sized attributes: the first value is the number of cells read, i.e. the number of offsets read for the attribute. The second value is the total number of elements in the data buffer. For example, a read query on a variable-length float attribute that reads three cells would return 3 for the first number in the tuple. If the total amount of floats read across the three cells was 10, then the second number in the tuple would be 10.

For fixed-length attributes, the first value is always 0. The second value is the total number of elements in the data buffer. For example, a read query on a single float attribute that reads three cells would return 3 for the second value. A read query on a float attribute with cell_val_num 2 that reads three cells would return 3 * 2 = 6 for the second value.

If the query has not been submitted, an empty map is returned.

Example:

// Submit a read query.
query.submit();
auto result_el = query.result_buffer_elements_nullable();

// For fixed-sized attributes, the second tuple element is the number of
// elements that were read for the attribute across all cells. Note: number
// of elements and not number of bytes.
auto num_a1_elements = std::get<1>(result_el["a1"]);

// Variable-sized attributes (e.g. std::string) need two buffers,
// one for offsets and one for cell data ("elements").
auto num_a2_offsets = std::get<0>(result_el["a2"]);
auto num_a2_elements = std::get<1>(result_el["a2"]);

// For both fixed-size and variable-sized attributes, the third tuple
// element is the number of elements in the validity bytemap.
auto num_a1_validity_values = std::get<2>(result_el["a1"]);
auto num_a2_validity_values = std::get<2>(result_el["a2"]);
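
One way to consume a validity bytemap after a read is to pair each value with its validity byte. The sketch below does this for a fixed-size attribute with one value per cell. `apply_validity` is a hypothetical helper, and it assumes that a non-zero validity byte marks a valid cell and a zero byte marks a null cell.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Pairs each value with its validity byte. Only the first `num_cells`
// entries of each buffer are meaningful; the buffers may be larger.
template <typename T>
std::vector<std::optional<T>> apply_validity(
    const std::vector<T>& values,
    const std::vector<uint8_t>& validity,
    uint64_t num_cells) {
  std::vector<std::optional<T>> cells;
  cells.reserve(num_cells);
  for (uint64_t i = 0; i < num_cells; ++i) {
    if (validity[i] != 0)
      cells.emplace_back(values[i]);     // valid cell
    else
      cells.emplace_back(std::nullopt);  // null cell
  }
  return cells;
}
```

Here `num_cells` would come from the third tuple element for the attribute.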

inline uint64_t est_result_size(const std::string &attr_name) const¶

Retrieves the estimated result size for a fixed-size attribute. This is an estimate and may not be sufficient to read all results for the requested ranges, for sparse arrays or any array with var-length attributes. The query status must be checked and the query resubmitted if it is not complete.

Example:

uint64_t est_size = query.est_result_size("attr1");

Parameters:: attr_name – The attribute name.
Returns:: The estimated size in bytes.

inline std::array<uint64_t, 2> est_result_size_var(const std::string &attr_name) const¶

Retrieves the estimated result size for a variable-size attribute. This is an estimate and may not be sufficient to read all results for the requested ranges, for sparse arrays or any array with var-length attributes. Query status must be checked and resubmitted if not complete.

Example:

std::array<uint64_t, 2> est_size =
    query.est_result_size_var("attr1");

Parameters:: attr_name – The attribute name.
Returns:: An array with first element containing the estimated size of the result offsets in bytes, and second element containing the estimated size of the result values in bytes.
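
Because the estimates are in bytes while the buffer setters take element counts, a conversion step is usually needed before allocating. The following is a minimal sketch. `elements_for_bytes`, `VarBufferSizes`, and `var_buffer_sizes` are hypothetical names, and the code assumes 64-bit offsets.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Number of elements of `element_size` bytes needed to hold `bytes` bytes,
// rounded up so that a partial element still gets a slot.
inline uint64_t elements_for_bytes(uint64_t bytes, std::size_t element_size) {
  return (bytes + element_size - 1) / element_size;
}

// Hypothetical helper type: element counts for the two buffers.
struct VarBufferSizes {
  uint64_t offsets_nelements;
  uint64_t data_nelements;
};

// Converts {offsets bytes, data bytes} into element counts for an
// attribute whose values have type T. Assumes uint64_t offsets.
template <typename T>
VarBufferSizes var_buffer_sizes(const std::array<uint64_t, 2>& est_bytes) {
  return {elements_for_bytes(est_bytes[0], sizeof(uint64_t)),
          elements_for_bytes(est_bytes[1], sizeof(T))};
}
```

Since the sizes are only estimates, buffers allocated this way can still lead to an incomplete read that needs resubmission.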

inline std::array<uint64_t, 2> est_result_size_nullable(const std::string &attr_name) const¶

Retrieves the estimated result size for a fixed-size, nullable attribute. This is an estimate and may not be sufficient to read all results for the requested ranges, for sparse arrays or any array with var-length attributes. Query status must be checked and resubmitted if not complete.

Example:

std::array<uint64_t, 2> est_size =
   query.est_result_size_nullable("attr1");

Parameters:: attr_name – The attribute name.
Returns:: An array with first element containing the estimated size of the result values in bytes, and second element containing the estimated size of the result validity values in bytes.

inline std::array<uint64_t, 3> est_result_size_var_nullable(const std::string &attr_name) const¶

Retrieves the estimated result size for a variable-size, nullable attribute.

Example:

std::array<uint64_t, 3> est_size =
    query.est_result_size_var_nullable("attr1");

Parameters:: attr_name – The attribute name.
Returns:: An array with first element containing the estimated size of the offset values in bytes, second element containing the estimated size of the result values in bytes, and the third element containing the estimated size of the validity values in bytes.

inline uint32_t fragment_num() const¶: Returns the number of written fragments. Applicable only to WRITE queries.

inline std::string fragment_uri(uint32_t idx) const¶: Returns the URI of the written fragment with the input index. Applicable only to WRITE queries.

inline std::pair<uint64_t, uint64_t> fragment_timestamp_range(uint32_t idx) const¶: Returns the timestamp range of the written fragment with the input index. Applicable only to WRITE queries.

inline Query &set_subarray(const Subarray &subarray)¶

Prepare a query with the contents of a subarray.

Parameters:: subarray – The subarray to be used to prepare the query.

inline Query &set_config(const Config &config)¶

Set the query config.

Setting the query config will also set the subarray configuration in order to maintain existing behavior. If you wish the subarray to have a different configuration than the query, set it after calling Query::set_config.

Setting configuration with this function overrides the following Query-level parameters only:

sm.memory_budget
sm.memory_budget_var
sm.var_offsets.mode
sm.var_offsets.extra_element
sm.var_offsets.bitsize
sm.check_coord_dups
sm.check_coord_oob
sm.check_global_order
sm.dedup_coords

inline Config config() const¶

Get the config

Returns:: Config

template<typename T> inline Query &set_data_buffer(const std::string &name, T *buff, uint64_t nelements)¶

Sets the data for a fixed/var-sized attribute/dimension.

The caller owns the buffer provided and is responsible for freeing the memory associated with it. For writes, the buffer holds values to be written which can be freed at any time after query completion. For reads, the buffer is allocated by the caller and will contain data read by the query after completion. The freeing of this memory is up to the caller once they are done referencing the read data.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
int data_a1[] = {0, 1, 2, 3};
tiledb::Query query(ctx, array);
query.set_data_buffer("a1", data_a1, 4);

Note

set_data_buffer(std::string, std::vector) is preferred as it is safer.

Template Parameters:

T – Attribute/Dimension value type

Parameters:

name – Attribute/Dimension name
buff – Buffer array pointer with elements of the attribute/dimension type.
nelements – Number of array elements

template<typename T> inline Query &set_data_buffer(const std::string &name, std::vector<T> &buf)¶

Sets the data for a fixed/var-sized attribute/dimension.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
std::vector<int> data_a1 = {0, 1, 2, 3};
tiledb::Query query(ctx, array);
query.set_data_buffer("a1", data_a1);

Template Parameters:

T – Attribute/Dimension value type

Parameters:

name – Attribute/Dimension name
buf – Buffer vector with elements of the attribute/dimension type.

inline Query &set_data_buffer(const std::string &name, void *buff, uint64_t nelements)¶

Sets the data for a fixed/var-sized attribute/dimension.

The caller owns the buffer provided and is responsible for freeing the memory associated with it. For writes, the buffer holds values to be written which can be freed at any time after query completion. For reads, the buffer is allocated by the caller and will contain data read by the query after completion. The freeing of this memory is up to the caller once they are done referencing the read data.

Note

This unsafe version does not perform type checking; the given buffer is assumed to be the correct type, and the size of an element in the given buffer is assumed to be the size of the datatype of the attribute.

Parameters:

name – Attribute/Dimension name
buff – Buffer array pointer with elements of the attribute type.
nelements – Number of array elements in buffer

inline Query &set_data_buffer(const std::string &name, std::string &data)¶

Sets the data for a fixed/var-sized attribute/dimension.

Parameters:

name – Attribute/Dimension name
data – Pre-allocated string buffer.

inline Query &set_offsets_buffer(const std::string &attr, uint64_t *offsets, uint64_t offset_nelements)¶

Sets the offset buffer for a var-sized attribute/dimension.

The caller owns the buffer provided and is responsible for freeing the memory associated with it. For writes, the buffer holds offsets to be written which can be freed at any time after query completion. For reads, the buffer is allocated by the caller and will contain offset data read by the query after completion. The freeing of this memory is up to the caller once they are done referencing the read data.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
uint64_t offsets_a1[] = {0, 8};
tiledb::Query query(ctx, array);
query.set_offsets_buffer("a1", offsets_a1, 2);

Note

set_offsets_buffer(std::string, std::vector) is preferred as it is safer.

Parameters:

attr – Attribute/Dimension name
offsets – Offsets array pointer where a new element begins in the data buffer.
offset_nelements – Number of elements in offsets buffer.

inline Query &set_offsets_buffer(const std::string &name, std::vector<uint64_t> &offsets)¶

Sets the offset buffer for a var-sized attribute/dimension.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
std::vector<uint64_t> offsets_a1 = {0, 8};
tiledb::Query query(ctx, array);
query.set_offsets_buffer("a1", offsets_a1);

Parameters:

name – Attribute/Dimension name
offsets – Offsets where a new element begins in the data buffer.
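
To show what "offsets where a new element begins in the data buffer" means for a write, the sketch below packs a list of strings into one contiguous data buffer plus an offsets vector. `PackedStrings` and `pack_strings` are hypothetical names, not part of TileDB, and the offsets are assumed to be byte offsets.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper type: the two buffers of a var-sized string attribute.
struct PackedStrings {
  std::string data;               // all cell values, back to back
  std::vector<uint64_t> offsets;  // start of each cell within `data`
};

// Concatenates the cells and records where each one starts.
inline PackedStrings pack_strings(const std::vector<std::string>& cells) {
  PackedStrings packed;
  packed.offsets.reserve(cells.size());
  for (const auto& cell : cells) {
    packed.offsets.push_back(packed.data.size());
    packed.data += cell;
  }
  return packed;
}
```

The resulting `offsets` vector is the kind of argument this overload takes, and `data` would go to the matching set_data_buffer() call.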

inline Query &set_validity_buffer(const std::string &attr, uint8_t *validity_bytemap, uint64_t validity_bytemap_nelements)¶

Sets the validity buffer for nullable attribute/dimension.

The caller owns the buffer provided and is responsible for freeing the memory associated with it. For writes, the buffer holds validity values to be written which can be freed at any time after query completion. For reads, the buffer is allocated by the caller and will contain the validity map read by the query after completion. The freeing of this memory is up to the caller once they are done referencing the read data.

Parameters:

attr – Attribute name
validity_bytemap – The validity bytemap buffer.
validity_bytemap_nelements – The number of values within validity_bytemap.

inline Query &set_validity_buffer(const std::string &name, std::vector<uint8_t> &validity_bytemap)¶

Sets the validity buffer for nullable attribute/dimension.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
std::vector<uint8_t> validity_bytemap = {1, 1, 0, 1};
tiledb::Query query(ctx, array);
query.set_validity_buffer("a1", validity_bytemap);

Parameters:

name – Attribute name
validity_bytemap – Buffer vector with elements of the attribute validity values.
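
The bytemap holds one uint8_t per cell. As a minimal sketch, assuming 1 marks a valid cell and 0 marks a null cell (as the example above suggests), the code below splits a vector of optional values into a dense data vector and a matching validity bytemap. `split_nullable` is a hypothetical helper, not a TileDB function, and the placeholder written for null cells is an arbitrary choice.

```cpp
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper (not part of TileDB): splits nullable values into a
// dense data vector and a validity bytemap with one byte per cell.
// Assumption: 1 marks a valid cell, 0 marks a null cell.
inline std::pair<std::vector<int32_t>, std::vector<uint8_t>> split_nullable(
    const std::vector<std::optional<int32_t>>& cells) {
  std::vector<int32_t> data;
  std::vector<uint8_t> validity;
  data.reserve(cells.size());
  validity.reserve(cells.size());
  for (const auto& cell : cells) {
    data.push_back(cell.value_or(0));  // arbitrary placeholder for nulls
    validity.push_back(cell.has_value() ? 1 : 0);
  }
  return {std::move(data), std::move(validity)};
}
```

Both vectors have the same length, so the data vector and the bytemap can be set on the same attribute of a query.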

inline Query &get_data_buffer(const std::string &name, void **data, uint64_t *data_nelements, uint64_t *element_size)¶

Retrieves the data buffer of a fixed/var-sized attribute/dimension.

Parameters:

name – Attribute/dimension name
data – Buffer array pointer with elements of the attribute type.
data_nelements – Number of array elements.
element_size – Size of array elements (in bytes).

inline Query &get_offsets_buffer(const std::string &name, uint64_t **offsets, uint64_t *offsets_nelements)¶

Retrieves the offset buffer for a var-sized attribute/dimension.

Parameters:

name – Attribute/dimension name
offsets – Offsets array pointer with elements of uint64_t type.
offsets_nelements – Number of array elements.

inline Query &get_validity_buffer(const std::string &name, uint8_t **validity_bytemap, uint64_t *validity_bytemap_nelements)¶

Retrieves the validity buffer for a nullable attribute/dimension.

Parameters:

name – Attribute name
validity_bytemap – Buffer array pointer with elements of the attribute validity values.
validity_bytemap_nelements – Number of validity bytemap elements.

inline std::string stats()¶: Returns a JSON-formatted string of the stats.

inline Query &update_subarray_from_query(Subarray *subarray)¶

Updates the given subarray with the subarray data currently held by this query.

Parameters:: subarray – The output subarray to receive this query’s subarray data.

Public Static Functions

static inline Status to_status(const tiledb_query_status_t &status)¶: Converts the TileDB C query status to a C++ query status.

static inline std::string to_str(tiledb_query_type_t type)¶: Converts the TileDB C query type to a string representation.

QueryCondition¶

class QueryCondition¶

Public Functions

inline QueryCondition(const Context &ctx)¶

Creates a TileDB query condition object.

Parameters:: ctx – TileDB context.

QueryCondition(const QueryCondition&) = default¶: Copy constructor.

QueryCondition(QueryCondition&&) = default¶: Move constructor.

~QueryCondition() = default¶: Destructor.

inline QueryCondition(const Context &ctx, tiledb_query_condition_t *const qc)¶

Constructs an instance directly from a C-API query condition object.

Parameters:

ctx – The TileDB context.
qc – The C-API query condition object.

QueryCondition &operator=(const QueryCondition&) = default¶: Copy-assignment operator.

QueryCondition &operator=(QueryCondition&&) = default¶: Move-assignment operator.

inline void init(const std::string &attribute_name, const void *condition_value, uint64_t condition_value_size, tiledb_query_condition_op_t op)¶

Initialize a TileDB query condition object.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_READ);
tiledb::Query query(ctx, array, TILEDB_READ);

int cmp_value = 5;
tiledb::QueryCondition qc(ctx);
qc.init("a1", &cmp_value, sizeof(int), TILEDB_LT);
query.set_condition(qc);

Parameters:

attribute_name – The name of the attribute to compare against.
condition_value – The fixed value to compare against.
condition_value_size – The byte size of condition_value.
op – The comparison operation between each cell value and condition_value.

inline void init(const std::string &attribute_name, const std::string &condition_value, tiledb_query_condition_op_t op)¶

Initializes a TileDB query condition object.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_READ);
tiledb::Query query(ctx, array, TILEDB_READ);

std::string cmp_value = "abc";
tiledb::QueryCondition qc(ctx);
qc.init("a1", cmp_value, TILEDB_LT);
query.set_condition(qc);

Parameters:

attribute_name – The name of the attribute to compare against.
condition_value – The fixed value to compare against.
op – The comparison operation between each cell value and condition_value.

inline std::shared_ptr<tiledb_query_condition_t> ptr() const¶: Returns a shared pointer to the C TileDB query condition object.

inline QueryCondition combine(const QueryCondition &rhs, tiledb_query_condition_combination_op_t combination_op) const¶

Combines this instance with another instance to form a multi-clause condition object.

Example:

int qc1_cmp_value = 10;
tiledb::QueryCondition qc1(ctx);
qc1.init("a1", &qc1_cmp_value, sizeof(int), TILEDB_LT);
int qc2_cmp_value = 3;
tiledb::QueryCondition qc2(ctx);
qc2.init("a1", &qc2_cmp_value, sizeof(int), TILEDB_GE);

tiledb::QueryCondition qc3 = qc1.combine(qc2, TILEDB_AND);
query.set_condition(qc3);

Parameters:

rhs – The right-hand-side query condition object.
combination_op – The logical combination operator that combines this instance with rhs.

inline QueryCondition negate() const¶

Return a query condition representing a negation of this query condition. Currently this is performed by applying De Morgan’s theorem recursively to the query condition’s internal representation.

Example:

int qc1_cmp_value = 10;
tiledb::QueryCondition qc1(ctx);
qc1.init("a1", &qc1_cmp_value, sizeof(int), TILEDB_LT);
tiledb::QueryCondition qc2 = qc1.negate();
query.set_condition(qc2);
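
The recursive use of De Morgan's theorem can be illustrated on a toy condition tree. This is only a sketch under stated assumptions: `Node`, `cmp`, `combine`, `flip`, `negate` and `eval` are hypothetical names, the tree compares a single int value, and none of this reflects TileDB's actual internal representation.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-in for a query condition tree (not TileDB's internal
// representation): a node is a comparison against an int, or an AND/OR of
// two subtrees.
enum class Op { LT, LE, GT, GE, EQ, NE, AND, OR };

struct Node {
  Op op;
  int value;                       // used by comparison nodes only
  std::shared_ptr<Node> lhs, rhs;  // used by AND/OR nodes only
};
using NodePtr = std::shared_ptr<Node>;

inline NodePtr cmp(Op op, int value) {
  return std::make_shared<Node>(Node{op, value, nullptr, nullptr});
}

inline NodePtr combine(NodePtr lhs, Op op, NodePtr rhs) {
  return std::make_shared<Node>(Node{op, 0, std::move(lhs), std::move(rhs)});
}

// Maps every operator to its logical opposite.
inline Op flip(Op op) {
  switch (op) {
    case Op::LT: return Op::GE;
    case Op::LE: return Op::GT;
    case Op::GT: return Op::LE;
    case Op::GE: return Op::LT;
    case Op::EQ: return Op::NE;
    case Op::NE: return Op::EQ;
    case Op::AND: return Op::OR;
    case Op::OR: return Op::AND;
  }
  return op;  // not reached
}

// De Morgan: !(a AND b) == !a OR !b, and !(a OR b) == !a AND !b.
// Leaves are negated by flipping their comparison operator.
inline NodePtr negate(const NodePtr& n) {
  if (n->op == Op::AND || n->op == Op::OR)
    return combine(negate(n->lhs), flip(n->op), negate(n->rhs));
  return cmp(flip(n->op), n->value);
}

// Evaluates the tree against one cell value.
inline bool eval(const NodePtr& n, int cell) {
  switch (n->op) {
    case Op::LT: return cell < n->value;
    case Op::LE: return cell <= n->value;
    case Op::GT: return cell > n->value;
    case Op::GE: return cell >= n->value;
    case Op::EQ: return cell == n->value;
    case Op::NE: return cell != n->value;
    case Op::AND: return eval(n->lhs, cell) && eval(n->rhs, cell);
    case Op::OR: return eval(n->lhs, cell) || eval(n->rhs, cell);
  }
  return false;  // not reached
}
```

In this model, negating (a1 < 10 AND a1 >= 3) yields (a1 >= 10 OR a1 < 3), which matches every cell that the original condition rejects.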

Public Static Functions

static inline QueryCondition create(const Context &ctx, const std::string &attribute_name, const std::string &value, tiledb_query_condition_op_t op)¶

Factory function for creating a new query condition with a string datatype.

Example:

tiledb::Context ctx;
// Pass a std::string so that the string overload is selected.
auto a1 = tiledb::QueryCondition::create(ctx, "a1", std::string("foo"), TILEDB_LE);

Parameters:

ctx – The TileDB context.
attribute_name – The attribute name.
value – The value to compare against.
op – The comparison operator.

Returns:

A new QueryCondition object.

template<typename T> static inline QueryCondition create(const Context &ctx, const std::string &attribute_name, T value, tiledb_query_condition_op_t op)¶

Factory function for creating a new query condition with datatype T.

Example:

tiledb::Context ctx;
auto a1 = tiledb::QueryCondition::create<int>(ctx, "a1", 5, TILEDB_LE);
auto a2 = tiledb::QueryCondition::create<float>(ctx, "a3", 3.5,
  TILEDB_GT);
auto a3 = tiledb::QueryCondition::create<double>(ctx,
  "a4", 10.0, TILEDB_LT);

Template Parameters:

T – Datatype of the attribute. Can either be arithmetic type or string.

Parameters:

ctx – The TileDB context.
attribute_name – The attribute name.
value – The value to compare against.
op – The comparison operator.

Returns:

A new QueryCondition object.

Subarray¶

class Subarray¶

Construct and support manipulation of a possibly multiple-range subarray for optional use with Query object operations.

See examples for more usage details.

Example:

// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_dense_array", TILEDB_WRITE);
tiledb::Query query(ctx, array);
std::vector<int> a1_data = {1, 2, 3};
query.set_data_buffer("a1", a1_data);
tiledb::Subarray subarray(ctx, array);
query.set_layout(TILEDB_GLOBAL_ORDER);
std::vector<int32_t> subarray_indices = {1, 2};
subarray.add_range(0, subarray_indices[0], subarray_indices[1]);
query.set_subarray(subarray);
query.submit();
query.finalize();
array.close();

Public Functions

inline Subarray(const tiledb::Context &ctx, const tiledb::Array &array, bool coalesce_ranges = true)¶

Creates a TileDB Subarray object.

Example:

// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_WRITE);
tiledb::Subarray subarray(ctx, array);

Parameters:

ctx – TileDB context
array – Open Array object
coalesce_ranges – When enabled, ranges will attempt to coalesce with existing ranges as they are added.
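
The exact coalescing rules are internal to TileDB and are not specified in this reference. As a simplified model only, the sketch below merges a newly added range into the previously added one when it starts immediately after that range ends. `RangeList` is a hypothetical class, not a TileDB type.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical model (not TileDB's implementation): inclusive [start, end]
// ranges on a single int64 dimension.
class RangeList {
 public:
  explicit RangeList(bool coalesce) : coalesce_(coalesce) {}

  // Appends [start, end]. With coalescing enabled, a range that starts
  // right after the previously added range extends it instead.
  void add(int64_t start, int64_t end) {
    if (coalesce_ && !ranges_.empty()) {
      auto& last = ranges_.back();
      // Guard against overflow before computing last.second + 1.
      if (last.second != std::numeric_limits<int64_t>::max() &&
          last.second + 1 == start) {
        last.second = end;
        return;
      }
    }
    ranges_.emplace_back(start, end);
  }

  std::size_t size() const { return ranges_.size(); }

  const std::pair<int64_t, int64_t>& at(std::size_t i) const {
    return ranges_.at(i);
  }

 private:
  bool coalesce_;
  std::vector<std::pair<int64_t, int64_t>> ranges_;
};
```

In this model, adding [1, 4] and then [5, 8] with coalescing enabled leaves a single range [1, 8], while the same calls with coalescing disabled leave two ranges.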

inline Subarray &set_coalesce_ranges(bool coalesce_ranges)¶: Set the coalesce_ranges flag for the subarray.

inline Subarray &replace_subarray_data(tiledb_subarray_t *capi_subarray)¶

Replaces the underlying C API handle of this Subarray so that it references the passed subarray.

Parameters:: capi_subarray – The C API subarray to be referenced by this C++ API Subarray.

template<class T> inline Subarray &add_range(uint32_t dim_idx, T start, T end, T stride = 0)¶

Adds a 1D range along a subarray dimension index, in the form (start, end, stride). The datatype of the range must be the same as the dimension datatype.

Example:

// Set a 1D range on dimension 0, assuming the domain type is int64.
int64_t start = 10;
int64_t end = 20;
// Stride is optional
subarray.add_range(0, start, end);

Template Parameters:

T – The dimension datatype.

Parameters:

dim_idx – The index of the dimension to add the range to.
start – The range start to add.
end – The range end to add.
stride – The range stride to add.

Returns:

Reference to this Subarray.

template<class T> inline Subarray &add_range(const std::string &dim_name, T start, T end, T stride = 0)¶

Adds a 1D range along a subarray dimension, specified by its name, in the form (start, end, stride). The datatype of the range must be the same as the dimension datatype.

Example:

// Set a 1D range on dimension "rows", assuming the domain type is int64.
int64_t start = 10;
int64_t end = 20;
const std::string dim_name = "rows";
// Stride is optional
subarray.add_range(dim_name, start, end);

Template Parameters:

T – The dimension datatype.

Parameters:

dim_name – The name of the dimension to add the range to.
start – The range start to add.
end – The range end to add.
stride – The range stride to add.

Returns:

Reference to this Subarray.

inline Subarray &add_range(uint32_t dim_idx, const std::string &start, const std::string &end)¶

Adds a 1D string range along a subarray dimension index, in the form (start, end). Applicable only to variable-sized dimensions.

Example:

// Set a 1D range on dimension 0, assuming it is a variable-sized string dimension.
std::string start = "a";
std::string end = "c";
subarray.add_range(0, start, end);

Parameters:

dim_idx – The index of the dimension to add the range to.
start – The range start to add.
end – The range end to add.

Returns:

Reference to this Subarray.

inline Subarray &add_range(const std::string &dim_name, const std::string &start, const std::string &end)¶

Adds a 1D string range along a subarray dimension name, in the form (start, end). Applicable only to variable-sized dimensions.

Example:

// Set a 1D range on dimension "rows", assuming it is a variable-sized string dimension.
std::string start = "a";
std::string end = "c";
const std::string dim_name = "rows";
subarray.add_range(dim_name, start, end);

Parameters:

dim_name – The name of the dimension to add the range to.
start – The range start to add.
end – The range end to add.

Returns:

Reference to this Subarray.

template<typename T = uint64_t> inline Subarray &set_subarray(const T *pairs, uint64_t size)¶

Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive. For the case of writes, this is meaningful only for dense arrays, and specifically dense writes.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_READ);
int subarray_vals[] = {0, 3, 0, 3};
Subarray subarray(ctx, array);
subarray.set_subarray(subarray_vals, 4);

Note

set_subarray(std::vector<T>) is preferred as it is safer.

Note

The number of [start, stop] pairs must equal the number of dimensions of the array associated with the subarray; that is, size must equal twice the number of dimensions.

Template Parameters:

T – Type of array domain.

Parameters:

pairs – Subarray pointer defined as an array of [start, stop] values per dimension.
size – The number of subarray elements.

inline Subarray &set_config(const Config &config)¶

Set the subarray config.

Setting configuration with this function overrides the following Subarray-level parameters only:

sm.read_range_oob

template<typename Vec> inline Subarray &set_subarray(const Vec &pairs)¶

Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive. For the case of writes, this is meaningful only for dense arrays, and specifically dense writes.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_READ);
std::vector<int> subarray_vals = {0, 3, 0, 3};
Subarray subarray(ctx, array);
subarray.set_subarray(subarray_vals);

Template Parameters:: Vec – Vector datatype. Should always be a vector of the domain type.
Parameters:: pairs – The subarray defined as a vector of [start, stop] coordinates per dimension.

template<typename T = uint64_t> inline Subarray &set_subarray(const std::initializer_list<T> &l)¶

Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive. For the case of writes, this is meaningful only for dense arrays, and specifically dense writes.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_READ);
Subarray subarray(ctx, array);
subarray.set_subarray({0, 3, 0, 3});

Template Parameters:: T – Type of array domain.
Parameters:: l – List of [start, stop] coordinates per dimension.

template<typename T = uint64_t> inline Subarray &set_subarray(const std::vector<std::array<T, 2>> &pairs)¶

Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive.

Note

set_subarray(std::vector) is preferred and avoids an extra copy.

Template Parameters:: T – Type of array domain.
Parameters:: pairs – The subarray defined as pairs of [start, stop] per dimension.
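
The overloads above describe the same subarray in two layouts: a flat sequence {start0, stop0, start1, stop1, ...} or one [start, stop] pair per dimension. The sketch below shows how the pair form maps onto the flat form and checks the pair count against the number of dimensions. `flatten_pairs` is a hypothetical helper and is not how the library necessarily performs the conversion.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of TileDB): converts per-dimension
// [start, stop] pairs into the flat {start0, stop0, start1, stop1, ...}
// layout, checking the pair count against the number of dimensions.
template <typename T>
std::vector<T> flatten_pairs(
    const std::vector<std::array<T, 2>>& pairs, std::size_t ndim) {
  if (pairs.size() != ndim)
    throw std::invalid_argument(
        "expected one [start, stop] pair per dimension");
  std::vector<T> flat;
  flat.reserve(2 * pairs.size());
  for (const auto& p : pairs) {
    flat.push_back(p[0]);  // inclusive start
    flat.push_back(p[1]);  // inclusive stop
  }
  return flat;
}
```

For a 2D array, the pairs [0, 3] and [1, 2] flatten to {0, 3, 1, 2}. The copy made here is the kind of extra work the note above refers to when it recommends the flat vector overload.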

inline uint64_t range_num(unsigned dim_idx) const¶

Retrieves the number of ranges for a given dimension index.

Example:

unsigned dim_idx = 0;
uint64_t range_num = subarray.range_num(dim_idx);

Parameters:: dim_idx – The dimension index.
Returns:: The number of ranges.

inline uint64_t range_num(const std::string &dim_name) const¶

Retrieves the number of ranges for a given dimension name.

Example:

std::string dim_name = "rows";
uint64_t range_num = subarray.range_num(dim_name);

Parameters:: dim_name – The dimension name.
Returns:: The number of ranges.

template<class T> inline std::array<T, 3> range(unsigned dim_idx, uint64_t range_idx)¶

Retrieves a range for a given dimension index and range id. The template datatype must be the same as that of the underlying array.

Example:

unsigned dim_idx = 0;
unsigned range_idx = 0;
auto range = subarray.range<int32_t>(dim_idx, range_idx);

Template Parameters:

T – The dimension datatype.

Parameters:

dim_idx – The dimension index.
range_idx – The range index.

Returns:

A triplet of the form (start, end, stride).

template<class T> inline std::array<T, 3> range(const std::string &dim_name, uint64_t range_idx)¶

Retrieves a range for a given dimension name and range id. The template datatype must be the same as that of the underlying array.

Example:

std::string dim_name = "rows";
unsigned range_idx = 0;
auto range = subarray.range<int32_t>(dim_name, range_idx);

Template Parameters:

T – The dimension datatype.

Parameters:

dim_name – The dimension name.
range_idx – The range index.

Returns:

A triplet of the form (start, end, stride).

inline std::array<std::string, 2> range(unsigned dim_idx, uint64_t range_idx)¶

Retrieves a range for a given variable length string dimension index and range id.

Example:

unsigned dim_idx = 0;
unsigned range_idx = 0;
std::array<std::string, 2> range = subarray.range(dim_idx, range_idx);

Parameters:

dim_idx – The dimension index.
range_idx – The range index.

Returns:

A pair of the form (start, end).

inline std::array<std::string, 2> range(const std::string &dim_name, uint64_t range_idx)¶

Retrieves a range for a given variable length string dimension name and range id.

Example:

std::string dim_name = "rows";
unsigned range_idx = 0;
std::array<std::string, 2> range = subarray.range(dim_name, range_idx);

Parameters:

dim_name – The dimension name.
range_idx – The range index.

Returns:

A pair of the form (start, end).

inline std::shared_ptr<tiledb_subarray_t> ptr() const¶: Returns the C TileDB subarray object.

inline const Array &array() const¶: Returns the array the subarray is associated with.

Filter¶

class Filter¶

Represents a filter. A filter is used to transform attribute data e.g. with compression, delta encoding, etc.

Example:

tiledb::Context ctx;
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
int level = 5;
f.set_option(TILEDB_COMPRESSION_LEVEL, &level);

Public Functions

inline Filter(const Context &ctx, tiledb_filter_type_t filter_type)¶

Creates a Filter of the given type.

Example:

tiledb::Context ctx;
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);

Parameters:

ctx – TileDB context
filter_type – Enumerated type of filter

inline Filter(const Context &ctx, tiledb_filter_t *filter)¶

Creates a Filter with the input C object.

Parameters:

ctx – TileDB context
filter – C API filter object

inline std::shared_ptr<tiledb_filter_t> ptr() const¶: Returns a shared pointer to the C TileDB filter object.

template<typename T, typename std::enable_if_t<!std::is_pointer_v<T>, int> = 0> inline Filter &set_option(tiledb_filter_option_t option, T value)¶

Sets an option on the filter. Options are filter dependent; this function throws an error if the given option is not valid for the given filter.

Example:

tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
f.set_option(TILEDB_COMPRESSION_LEVEL, 5);

Template Parameters:

T – Type of value of option to set.

Parameters:

option – Enumerated option to set.
value – Value of option to set.

Throws:

TileDBError – if the option cannot be set on the filter.
std::invalid_argument – if the option value is the wrong type.

Returns:

Reference to this Filter

inline Filter &set_option(tiledb_filter_option_t option, const void *value)¶

Sets an option on the filter. Options are filter dependent; this function throws an error if the given option is not valid for the given filter.

This version of set_option performs no type checks.

Example:

tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
int level = 5;
f.set_option(TILEDB_COMPRESSION_LEVEL, &level);

Note

set_option<T>(option, T value) is preferred as it is safer.

Parameters:

option – Enumerated option to set.
value – Value of option to set.

Throws:

TileDBError – if the option cannot be set on the filter.

Returns:

Reference to this Filter

template<typename T> inline T get_option(tiledb_filter_option_t option)¶

Gets an option value from the filter.

Example:

tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
int32_t level = f.get_option<int32_t>(TILEDB_COMPRESSION_LEVEL);
// level == -1 (the default compression level)

Template Parameters:

T – Type of option value to get.

Parameters:

option – Enumerated option to get.

Throws:

TileDBError – if the option cannot be retrieved from the filter.
std::invalid_argument – if the option value is the wrong type.

Returns:

The option value.

template<typename T, typename std::enable_if<std::is_arithmetic_v<T>>::type* = nullptr> inline void get_option(tiledb_filter_option_t option, T *value)¶

Gets an option value from the filter.

Example:

tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
int32_t level;
f.get_option(TILEDB_COMPRESSION_LEVEL, &level);
// level == -1 (the default compression level)

Template Parameters:

T – Type of option value to get.

Parameters:

option – Enumerated option to get.
value – Buffer that option value will be written to.

Throws:

TileDBError – if the option cannot be retrieved from the filter.
std::invalid_argument – if the option value is the wrong type.

inline void get_option(tiledb_filter_option_t option, void *value)¶

Gets an option value from the filter.

This version of get_option performs no type checks.

Example:

tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
int32_t level;
f.get_option(TILEDB_COMPRESSION_LEVEL, &level);
// level == -1 (the default compression level)

Note

The buffer pointed to by value must be large enough to hold the option value.

Note

T value = get_option<T>(option) is preferred as it is safer.

Parameters:

option – Enumerated option to get.
value – Buffer that option value will be written to.

Throws:

TileDBError – if the option cannot be retrieved from the filter.

inline tiledb_filter_type_t filter_type() const¶: Gets the filter type of this filter.

Public Static Functions

static inline std::string to_str(tiledb_filter_type_t type)¶: Returns the input type in string format.

Group¶

inline void tiledb::create_group(const Context &ctx, const std::string &group)¶

Creates a new group. A Group is a logical grouping of Objects on the storage system (a directory).

Parameters:

ctx – The TileDB context.
group – The group URI.

Returns:

void

Object Management¶

class Object¶

Represents a TileDB object: array, group, or none (invalid).

Public Types

enum class Type¶

The object type.

Values:

enumerator Array¶: TileDB array object.

enumerator Group¶: TileDB group object.

enumerator Invalid¶: Invalid or unknown object type.

Public Functions

inline std::string to_str() const¶: Returns a string representation of the object, including its type and URI.

inline Type type() const¶: Returns the object type.

inline std::string uri() const¶: Returns the object URI.

inline std::optional<std::string> name() const¶: Returns the object's optional name.

inline bool operator==(const Object &rhs) const¶: Compares objects for equality.

inline bool operator!=(const Object &rhs) const¶: Compares objects for inequality.

Public Static Functions

static inline Object object(const Context &ctx, const std::string &uri)¶

Gets an Object object that encapsulates the object type of the given path.

Parameters:

ctx – The TileDB context
uri – The path to the object.

Returns:

An object that contains the type along with the URI.

static inline void remove(const Context &ctx, const std::string &uri)¶

Deletes a TileDB object at the given URI from disk/persistent storage.

Parameters:

ctx – The TileDB context
uri – The path to the object to be removed.

static inline void move(const Context &ctx, const std::string &old_uri, const std::string &new_uri)¶

Moves/renames a TileDB object.

Parameters:

ctx – The TileDB context
old_uri – The path to the old object.
new_uri – The path to the new object.

class ObjectIter¶

Enables listing TileDB objects in a directory or recursively walking an entire directory tree.

Example:

// List the TileDB objects in an S3 bucket.
tiledb::Context ctx;
tiledb::ObjectIter obj_it(ctx, "s3://bucket-name");
for (auto it = obj_it.begin(), ite = obj_it.end(); it != ite; ++it) {
  const tiledb::Object &obj = *it;
  std::cout << obj << std::endl;
}

Public Functions

inline explicit ObjectIter(Context &ctx, const std::string &root = ".")¶

Creates an object iterator. Unless set_recursive is invoked, this iterator will iterate only over the children of root. It will also retrieve only TileDB-related objects.

Example:

// List the TileDB objects in an S3 bucket.
tiledb::Context ctx;
tiledb::ObjectIter obj_it(ctx, "s3://bucket-name");
for (auto it = obj_it.begin(), ite = obj_it.end(); it != ite; ++it) {
  const tiledb::Object &obj = *it;
  std::cout << obj << std::endl;
}

Parameters:

ctx – The TileDB context.
root – The root directory where the iteration will begin.

inline void set_iter_policy(bool group, bool array)¶

Determines whether group and array objects will be iterated on during the walk. The default (if the function is not invoked) is true for both.

Parameters:

group – If true, groups will be considered.
array – If true, arrays will be considered.

inline void set_recursive(tiledb_walk_order_t walk_order = TILEDB_PREORDER)¶

Specifies that the iteration will be over all the directories in the tree rooted at root_.

Parameters:: walk_order – The walk order.
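
The difference between walk orders can be sketched on a toy directory tree. `Dir`, `walk_preorder` and `walk_postorder` are hypothetical names used only for illustration; the sketch shows the ordering idea and does not reproduce how TileDB enumerates objects.

```cpp
#include <string>
#include <vector>

// Hypothetical directory tree (not a TileDB type).
struct Dir {
  std::string name;
  std::vector<Dir> children;
};

// Preorder: a directory is visited before its children.
inline void walk_preorder(const Dir& d, std::vector<std::string>& out) {
  out.push_back(d.name);
  for (const auto& c : d.children) walk_preorder(c, out);
}

// Postorder: a directory is visited after its children.
inline void walk_postorder(const Dir& d, std::vector<std::string>& out) {
  for (const auto& c : d.children) walk_postorder(c, out);
  out.push_back(d.name);
}
```

For a root with children a (containing a1) and b, preorder visits root, a, a1, b, while postorder visits a1, a, b, root. The default argument TILEDB_PREORDER corresponds to the first ordering; the C API enum tiledb_walk_order_t also defines TILEDB_POSTORDER.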

inline void set_non_recursive()¶: Disables recursive traversal.

inline iterator begin()¶: Returns an object iterator at the beginning of its iteration.

inline iterator end() const¶: Returns an object iterator at the end of its iteration.

Public Static Functions

static inline int obj_getter(const char *path, tiledb_object_t type, void *data)¶

Callback function to be used when invoking the C TileDB functions for walking through the TileDB objects in the root_ directory. The function retrieves the visited object and stores it in the object vector obj_vec.

Parameters:

path – The path of a visited TileDB object
type – The type of the visited TileDB object.
data – To be cast to the vector where the visited object will be stored.

Returns:

If 1 then the walk should continue to the next object.

class iterator¶: The actual iterator implementation in this class.

struct ObjGetterData¶: Carries data to be passed to obj_getter.

VFS¶

class VFS¶

Implements a virtual filesystem that enables performing directory/file operations with a unified API on different filesystems, such as local posix/windows, S3, etc.

Public Types

using filebuf = impl::VFSFilebuf¶

Stream buffer for TileDB VFS.

This is unbuffered; each read/write is directly dispatched to TileDB. As such, it is recommended to issue fewer, larger operations.

Example (write to file):

// Create the file buffer.
tiledb::Context ctx;
tiledb::VFS vfs(ctx);
tiledb::VFS::filebuf buff(vfs);

// Create new file, truncating it if it exists.
buff.open("file.txt", std::ios::out);
std::ostream os(&buff);
if (!os.good()) throw std::runtime_error("Error opening file");

std::string str = "This will be written to the file.";

os.write(str.data(), str.size());
// Alternatively:
// os << str;
os.flush();
buff.close();

Example (read from file):

// Create the file buffer.
tiledb::Context ctx;
tiledb::VFS vfs(ctx);
tiledb::VFS::filebuf buff(vfs);
std::string file_uri = "s3://bucket-name/file.txt";

buff.open(file_uri, std::ios::in);
std::istream is(&buff);
if (!is.good()) throw std::runtime_error("Error opening file");

// Read all contents from the file
std::string contents;
auto nbytes = vfs.file_size(file_uri);
contents.resize(nbytes);
is.read(&contents[0], nbytes);

buff.close();

Public Functions

inline explicit VFS(const Context &ctx)¶

Constructor.

Parameters:: ctx – A TileDB context.

inline VFS(const Context &ctx, const Config &config)¶

Constructor.

Parameters:

ctx – TileDB context.
config – TileDB config.

inline VFS(const Context &ctx, tiledb_vfs_t *vfs, bool own = true)¶

Constructor. Creates a TileDB VFS from the given pointer.

Parameters:: own=true – If false, disables underlying cleanup upon destruction.
Throws:: TileDBError – if construction fails

inline void create_bucket(const std::string &uri) const¶: Creates an object store bucket with the input URI.

inline void remove_bucket(const std::string &uri) const¶: Deletes an object store bucket with the input URI.

inline bool is_bucket(const std::string &uri) const¶: Checks if an object store bucket with the input URI exists.

inline void empty_bucket(const std::string &bucket) const¶: Empties an object store bucket.

inline bool is_empty_bucket(const std::string &bucket) const¶: Checks if an object store bucket is empty.

inline void create_dir(const std::string &uri) const¶: Creates a directory with the input URI.

inline bool is_dir(const std::string &uri) const¶: Checks if a directory with the input URI exists.

inline void remove_dir(const std::string &uri) const¶: Removes a directory (recursively) with the input URI.

inline bool is_file(const std::string &uri) const¶: Checks if a file with the input URI exists.

inline void remove_file(const std::string &uri) const¶: Deletes a file with the input URI.

inline uint64_t dir_size(const std::string &uri) const¶: Retrieves the size of a directory with the input URI.

inline std::vector<std::string> ls(const std::string &uri) const¶: Retrieves the children in directory uri. This function is non-recursive, i.e., it returns only the entries one level below uri.

inline uint64_t file_size(const std::string &uri) const¶: Retrieves the size of a file with the input URI.

inline void move_file(const std::string &old_uri, const std::string &new_uri) const¶: Renames a TileDB file from an old URI to a new URI.

inline void move_dir(const std::string &old_uri, const std::string &new_uri) const¶: Renames a TileDB directory from an old URI to a new URI.

inline void copy_file(const std::string &old_uri, const std::string &new_uri) const¶: Copies a TileDB file from an old URI to a new URI.

inline void copy_dir(const std::string &old_uri, const std::string &new_uri) const¶: Copies a TileDB directory from an old URI to a new URI.

inline void touch(const std::string &uri) const¶: Touches a file with the input URI, i.e., creates a new empty file if it does not already exist.

inline const Context &context() const¶: Get the underlying context

inline std::shared_ptr<tiledb_vfs_t> ptr() const¶: Get the underlying tiledb object

inline Config config() const¶: Get the config

Public Static Functions

static inline int ls_getter(const char *path, void *data)¶

Callback function to be used when invoking the C TileDB function for getting the children of a URI. It simply adds path to vec (which is cast from data).

Parameters:

path – The path of a visited TileDB object
data – This will be cast to the vector that will store path.

Returns:

If 1 then the walk should continue to the next object.
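
The callback pattern described here, a C-style function that receives its destination container through a void pointer, can be sketched without TileDB. `walk_paths` is a hypothetical stand-in for the C API call that drives the callback, and `collect_path` merely mirrors the shape of ls_getter; neither is a library function.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the C API walker: invokes `callback` once per
// path and stops as soon as the callback returns anything other than 1.
inline void walk_paths(const std::vector<std::string>& paths,
                       int (*callback)(const char*, void*),
                       void* data) {
  for (const auto& p : paths) {
    if (callback(p.c_str(), data) != 1) break;
  }
}

// Same shape as ls_getter: casts `data` back to the destination vector
// and appends the visited path.
inline int collect_path(const char* path, void* data) {
  auto* vec = static_cast<std::vector<std::string>*>(data);
  vec->emplace_back(path);
  return 1;  // continue the walk
}
```

A caller would pass the address of a std::vector<std::string> as data. Because the C interface erases the type, the callback has to cast the pointer back to exactly the type that was passed in.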

Utils¶

LICENSE¶

The MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

DESCRIPTION¶

Utils for C++ API.

namespace tiledb¶

Functions

template<typename T, typename E = typename std::vector<T>> std::vector<E> group_by_cell(const std::vector<uint64_t> &offsets, const std::vector<T> &data, uint64_t num_offsets, uint64_t num_data)¶

Convert an (offset, data) vector pair into a single vector of vectors. Useful for “unpacking” variable-length attribute data from a read query result in offsets + data form to a vector of per-cell data.

The offsets must be given in units of bytes.

Example:

std::vector<uint64_t> offsets;
std::vector<char> data;
...
query.set_data_buffer("attr_name", data);
query.set_offsets_buffer("attr_name", offsets);
query.submit();
...
auto attr_results = query.result_buffer_elements()["attr_name"];

// cell_vals length will be equal to the number of cells read by the query.
// Each element is a std::vector<char> with each cell's data for "attr_name"
auto cell_vals =
  group_by_cell(offsets, data, attr_results.first, attr_results.second);

// Reconstruct a std::string value for the first cell:
std::string cell_val(cell_vals[0].data(), cell_vals[0].size());

Note

This function, and the other utility functions, copy all of the input data when constructing their return values. Thus, these may be expensive for large amounts of data.

Template Parameters:

T – Underlying attribute datatype
E – Cell type. Usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}

Parameters:

offsets – Offsets vector. This specifies the start offset in bytes of each cell in the data vector.
data – Data vector. Flat data buffer with cell contents.
num_offsets – Number of offset elements populated by query. If the entire buffer is to be grouped, pass offsets.size().
num_data – Number of data elements populated by query. If the entire buffer is to be grouped, pass data.size().

Returns:

std::vector<E>
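The byte-offset convention matters most when the element type is wider than one byte. The following is a minimal standalone sketch of the grouping logic, not TileDB's implementation. The helper name `group_by_byte_offsets` is hypothetical, and the sketch assumes the offsets are sorted and that each one is a multiple of sizeof(T).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of TileDB): split `data` into per-cell
// vectors using cell start offsets expressed in bytes.
template <typename T>
std::vector<std::vector<T>> group_by_byte_offsets(
    const std::vector<uint64_t>& offsets, const std::vector<T>& data) {
  std::vector<std::vector<T>> cells;
  cells.reserve(offsets.size());
  for (std::size_t i = 0; i < offsets.size(); ++i) {
    // Byte offsets must land on element boundaries.
    if (offsets[i] % sizeof(T) != 0)
      throw std::invalid_argument("offset is not a multiple of sizeof(T)");
    const uint64_t begin = offsets[i] / sizeof(T);
    // A cell ends where the next one starts; the last cell runs to the end.
    const uint64_t end =
        (i + 1 < offsets.size()) ? offsets[i + 1] / sizeof(T) : data.size();
    if (begin > end || end > data.size())
      throw std::out_of_range("offsets do not describe a valid partition");
    cells.emplace_back(data.begin() + static_cast<std::ptrdiff_t>(begin),
                       data.begin() + static_cast<std::ptrdiff_t>(end));
  }
  return cells;
}
```

For int32_t data, an offsets vector of {0, 8} therefore splits the buffer after the second element, because 8 bytes correspond to two 4-byte elements.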

template<typename T, typename E = typename std::vector<T>> std::vector<E> group_by_cell(const std::pair<std::vector<uint64_t>, std::vector<T>> &buff, uint64_t num_offsets, uint64_t num_data)¶

Convert an (offset, data) vector pair into a single vector of vectors. Useful for “unpacking” variable-length attribute data from a read query result in offsets + data form to a vector of per-cell data.

The offsets must be given in units of bytes.

Example:

std::vector<uint64_t> offsets;
std::vector<char> data;
...
query.set_data_buffer("attr_name", data);
query.set_offsets_buffer("attr_name", offsets);
query.submit();
...
auto attr_results = query.result_buffer_elements()["attr_name"];

// cell_vals length will be equal to the number of cells read by the query.
// Each element is a std::vector<char> with each cell's data for "attr_name"
auto cell_vals =
  group_by_cell(std::make_pair(offsets, data),
                attr_results.first, attr_results.second);

// Reconstruct a std::string value for the first cell:
std::string cell_val(cell_vals[0].data(), cell_vals[0].size());

Template Parameters:

T – Underlying attribute datatype
E – Cell type, usually std::vector<T> or std::string. Must be constructible from an iterator pair {std::vector<T>::iterator, std::vector<T>::iterator}.

Parameters:

buff – Pair of (offset_vec, data_vec) to be grouped.
num_offsets – Number of offset elements populated by query.
num_data – Number of data elements populated by query.

Returns:

std::vector<E>

template<typename T, typename E = typename std::vector<T>> std::vector<E> group_by_cell(const std::vector<uint64_t> &offsets, const std::vector<T> &data)¶

Convert a generic (offset, data) vector pair into a single vector of vectors. The offsets must be given in units of bytes.

Example:

std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
std::vector<uint64_t> offsets = {0, 5};
auto grouped = group_by_cell<char, std::string>(offsets, buf);
// grouped.size() == 2
// grouped[0] == "abcde"
// grouped[1] == "fghi"

Template Parameters:

T – Underlying attribute datatype
E – Cell type, usually std::vector<T> or std::string. Must be constructible from an iterator pair {std::vector<T>::iterator, std::vector<T>::iterator}.

Parameters:

offsets – Offsets vector
data – Data vector

Returns:

std::vector<E>

template<typename T, typename E = typename std::vector<T>> std::vector<E> group_by_cell(const std::vector<T> &buff, uint64_t el_per_cell, uint64_t num_buff)¶

Convert a vector of elements into a vector of fixed-length vectors.

Example:

std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell(buf, 3, buf.size());
std::string grp1(grouped[0].begin(), grouped[0].end());  // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end());  // "def"
std::string grp3(grouped[2].begin(), grouped[2].end());  // "ghi"

// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell(buf, 2, buf.size());

Template Parameters:

T – Underlying attribute datatype
E – Cell type, usually std::vector<T> or std::string. Must be constructible from an iterator pair {std::vector<T>::iterator, std::vector<T>::iterator}.

Parameters:

buff – Data buffer to group
el_per_cell – Number of elements per cell to group together
num_buff – Number of elements populated by query. To group whole buffer, pass buff.size().

Returns:

std::vector<E>
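The num_buff argument exists because a query may populate only part of a pre-allocated buffer. The sketch below illustrates that idea with a hypothetical standalone helper, `group_fixed`; it is not TileDB's implementation, and the exact error type TileDB throws is not assumed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of TileDB): group the first `num_populated`
// elements of `buff` into cells of `el_per_cell` elements each.
template <typename T>
std::vector<std::vector<T>> group_fixed(const std::vector<T>& buff,
                                        uint64_t el_per_cell,
                                        uint64_t num_populated) {
  if (el_per_cell == 0 || num_populated > buff.size() ||
      num_populated % el_per_cell != 0)
    throw std::invalid_argument("invalid el_per_cell or populated length");
  std::vector<std::vector<T>> cells;
  cells.reserve(num_populated / el_per_cell);
  for (uint64_t i = 0; i < num_populated; i += el_per_cell) {
    const auto first = buff.begin() + static_cast<std::ptrdiff_t>(i);
    cells.emplace_back(first,
                       first + static_cast<std::ptrdiff_t>(el_per_cell));
  }
  return cells;
}
```

Elements past num_populated are ignored, so unused trailing slots in an over-allocated buffer never end up in a cell.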

template<typename T, typename E = typename std::vector<T>> std::vector<E> group_by_cell(const std::vector<T> &buff, uint64_t el_per_cell)¶

Convert a vector of elements into a vector of fixed-length vectors.

Example:

std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell(buf, 3);
std::string grp1(grouped[0].begin(), grouped[0].end());  // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end());  // "def"
std::string grp3(grouped[2].begin(), grouped[2].end());  // "ghi"

// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell(buf, 2);

Template Parameters:

T – Element type
E – Cell type, usually std::vector<T> or std::string. Must be constructible from an iterator pair {std::vector<T>::iterator, std::vector<T>::iterator}.

Parameters:

buff – Data buffer to group
el_per_cell – Number of elements per cell to group together

Returns:

std::vector<E>

template<uint64_t N, typename T> std::vector<std::array<T, N>> group_by_cell(const std::vector<T> &buff, uint64_t num_buff)¶

Convert a vector of elements into a vector of fixed-length arrays.

Example:

std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell<3>(buf, buf.size());
std::string grp1(grouped[0].begin(), grouped[0].end());  // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end());  // "def"
std::string grp3(grouped[2].begin(), grouped[2].end());  // "ghi"

// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell<2>(buf, buf.size());

Template Parameters:

N – Elements per cell
T – Array element type

Parameters:

buff – Data buffer to group
num_buff – Number of elements in buff that were populated by the query.

Returns:

std::vector<std::array<T,N>>

template<uint64_t N, typename T> std::vector<std::array<T, N>> group_by_cell(const std::vector<T> &buff)¶

Convert a vector of elements into a vector of fixed-length arrays.

Example:

std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell<3>(buf);
std::string grp1(grouped[0].begin(), grouped[0].end());  // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end());  // "def"
std::string grp3(grouped[2].begin(), grouped[2].end());  // "ghi"

// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell<2>(buf);

Template Parameters:

N – Elements per cell
T – Array element type

Parameters:

buff – Data buffer to group

Returns:

std::vector<std::array<T,N>>

template<typename T, typename R = typename T::value_type> std::pair<std::vector<uint64_t>, std::vector<R>> ungroup_var_buffer(const std::vector<T> &data)¶

Unpack a vector of variable-sized cells into an offsets buffer and a data buffer. The resulting offsets are in units of bytes.

Example:

std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
// For the sake of example, group buf into groups of 3 elements:
auto grouped = group_by_cell(buf, 3);
// Ungroup into offsets, data pair.
auto p = ungroup_var_buffer(grouped);
auto offsets = p.first;  // {0, 3, 6}
auto data = p.second;   // {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'}

Template Parameters:

T – Vector type. T::value_type is considered the underlying data element type. Should be vector or string.
R – T::value_type, deduced

Parameters:

data – Data buffer to unpack

Returns:

pair where .first is the offset buffer, and .second is data buffer
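Because the resulting offsets are in bytes, they differ from element indices whenever the element type is wider than one byte. The following standalone sketch shows how such offsets can be computed; `pack_cells` is a hypothetical name and this is not TileDB's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not part of TileDB): flatten variable-length cells
// into (byte offsets, data).
template <typename T>
std::pair<std::vector<uint64_t>, std::vector<T>> pack_cells(
    const std::vector<std::vector<T>>& cells) {
  std::vector<uint64_t> offsets;
  std::vector<T> data;
  offsets.reserve(cells.size());
  for (const auto& cell : cells) {
    // Each offset is the byte position at which this cell starts.
    offsets.push_back(static_cast<uint64_t>(data.size()) * sizeof(T));
    data.insert(data.end(), cell.begin(), cell.end());
  }
  return {offsets, data};
}
```

With int32_t cells of lengths 2, 1 and 3, the offsets come out as 0, 8 and 12 rather than 0, 2 and 3.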

template<typename V, typename T = typename V::value_type::value_type> std::vector<T> flatten(const V &vec)¶

Flatten a vector-of-vectors into a single vector.

Example:

std::vector<std::string> v = {"a", "bb", "ccc"};
auto flat_v = flatten(v);
std::string s(flat_v.begin(), flat_v.end()); // "abbccc"

std::vector<std::vector<double>> d = {{1.2, 2.1}, {2.3, 3.2}, {3.4, 4.3}};
auto flat_d = flatten(d);  // {1.2, 2.1, 2.3, 3.2, 3.4, 4.3};

Template Parameters:

V – Container type
T – Return element type

Parameters:

vec – Vector to flatten

Returns:

std::vector<T>

namespace impl¶

Functions

inline void check_config_error(tiledb_error_t *err)¶: Check an error, free, and throw if there is one.

Version¶

inline std::tuple<int, int, int> tiledb::version()¶: Get the Major, Minor, and Patch version.
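Since the version is returned as a std::tuple, it can be compared lexicographically with the standard tuple operators. The helper below is a small sketch of gating on a minimum version; `version_at_least` is a hypothetical name, not part of the TileDB API.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical helper (not part of TileDB): true if `v`, given as
// (major, minor, patch), is at least the requested minimum version.
inline bool version_at_least(const std::tuple<int, int, int>& v,
                             int major,
                             int minor,
                             int patch = 0) {
  // std::tuple comparison is lexicographic: major first, then minor, patch.
  return v >= std::make_tuple(major, minor, patch);
}
```

A caller could then write something like `version_at_least(tiledb::version(), 2, 4)` before using functionality that depends on a newer library.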

Stats¶

class Stats¶

Encapsulates functionality related to internal TileDB statistics.

Example:

// Enable stats, submit a query, then dump to stdout.
tiledb::Stats::enable();
query.submit();
tiledb::Stats::dump();

// Dump to a string instead.
std::string str;
tiledb::Stats::dump(&str);

Public Static Functions

static inline void enable()¶: Enables internal TileDB statistics gathering.

static inline void disable()¶: Disables internal TileDB statistics gathering.

static inline bool is_enabled()¶

Returns whether internal statistics gathering is enabled.

Returns:: true if statistics gathering is enabled and false otherwise.

static inline void reset()¶: Reset all internal statistics counters to 0.

static inline void dump(FILE *out = nullptr)¶

Dump all statistics counters to some output (e.g., file or stdout).

Parameters:: out – The output.

static inline void dump(std::string *out)¶

Dump all statistics counters to a string.

Parameters:: out – The output.

static inline void raw_dump(FILE *out = nullptr)¶

Dump all raw statistics counters to some output (e.g., file or stdout) as JSON.

Parameters:: out – The output.

static inline void raw_dump(std::string *out)¶

Dump all raw statistics counters to a string.

Parameters:: out – The output.

FragmentInfo¶

class FragmentInfo¶

Describes fragment info objects.

Public Functions

inline void load() const¶: Loads the fragment info.

inline std::string fragment_uri(uint32_t fid) const¶: Returns the URI of the fragment with the given index.

inline std::string fragment_name(uint32_t fid) const¶: Returns the name of the fragment with the given index.

inline const Context &context() const¶: Returns the context that the fragment info belongs to.

inline void get_non_empty_domain(uint32_t fid, uint32_t did, void *domain) const¶: Retrieves the non-empty domain of the fragment with the given index on the given dimension index.

inline void get_non_empty_domain(uint32_t fid, const std::string &dim_name, void *domain) const¶: Retrieves the non-empty domain of the fragment with the given index on the given dimension name.
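The domain argument of get_non_empty_domain is an untyped output buffer. The sketch below shows one way to read such a buffer back, under the assumption (not stated above) that it holds the lower and upper bound consecutively as two values of the dimension's datatype. `read_bounds` is a hypothetical helper, not part of the TileDB API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical helper (not part of TileDB): interpret an untyped buffer as
// a [lower, upper] pair of type T. Assumes the buffer holds two T values.
template <typename T>
std::pair<T, T> read_bounds(const void* domain) {
  T bounds[2];
  // memcpy sidesteps alignment and aliasing concerns with a raw void*.
  std::memcpy(bounds, domain, sizeof(bounds));
  return {bounds[0], bounds[1]};
}
```

The caller is responsible for choosing T to match the dimension's datatype; a mismatch would silently reinterpret the bytes.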

inline std::pair<std::string, std::string> non_empty_domain_var(uint32_t fid, uint32_t did) const¶: Returns the non-empty domain of the fragment with the given index on the given dimension index. Applicable to string dimensions.

inline std::pair<std::string, std::string> non_empty_domain_var(uint32_t fid, const std::string &dim_name) const¶: Returns the non-empty domain of the fragment with the given index on the given dimension name. Applicable to string dimensions.

inline uint64_t mbr_num(uint32_t fid) const¶: Returns the number of MBRs in the fragment with the given index.

inline void get_mbr(uint32_t fid, uint32_t mid, uint32_t did, void *mbr) const¶: Retrieves the MBR of the fragment with the given index on the given dimension index.

inline void get_mbr(uint32_t fid, uint32_t mid, const std::string &dim_name, void *mbr) const¶: Retrieves the MBR of the fragment with the given index on the given dimension name.

inline std::pair<std::string, std::string> mbr_var(uint32_t fid, uint32_t mid, uint32_t did) const¶: Returns the MBR of the fragment with the given index on the given dimension index. Applicable to string dimensions.

inline std::pair<std::string, std::string> mbr_var(uint32_t fid, uint32_t mid, const std::string &dim_name) const¶: Returns the MBR of the fragment with the given index on the given dimension name. Applicable to string dimensions.

inline uint32_t fragment_num() const¶: Returns the number of fragments.

inline uint64_t fragment_size(uint32_t fid) const¶: Returns the size of the fragment with the given index.

inline bool dense(uint32_t fid) const¶: Returns true if the fragment with the given index is dense.

inline bool sparse(uint32_t fid) const¶: Returns true if the fragment with the given index is sparse.

inline std::pair<uint64_t, uint64_t> timestamp_range(uint32_t fid) const¶: Returns the timestamp range of the fragment with the given index.

inline uint64_t cell_num(uint32_t fid) const¶: Returns the number of cells of the fragment with the given index.

inline uint64_t total_cell_num() const¶: Returns the total number of cells written in the loaded fragments.

inline uint32_t version(uint32_t fid) const¶: Returns the version of the fragment with the given index.

inline ArraySchema array_schema(uint32_t fid) const¶: Returns the array schema of the fragment with the given index.

inline std::string array_schema_name(uint32_t fid) const¶: Returns the array schema name of the fragment with the given index.

inline bool has_consolidated_metadata(uint32_t fid) const¶: Returns true if the fragment with the given index has consolidated metadata.

inline uint32_t unconsolidated_metadata_num() const¶: Returns the number of fragments with unconsolidated metadata.

inline uint32_t to_vacuum_num() const¶: Returns the number of fragments to vacuum.

inline std::string to_vacuum_uri(uint32_t fid) const¶: Returns the URI of the fragment to vacuum with the given index.

TILEDB_DEPRECATED inline void dump(FILE *out = nullptr) const¶

Dumps the fragment info in an ASCII representation to an output.

Parameters:: out – (Optional) File to dump output to. Defaults to nullptr which will lead to selection of stdout.

inline std::shared_ptr<tiledb_fragment_info_t> ptr() const¶: Returns the C TileDB fragment info object.

Experimental¶

class ArraySchemaEvolution¶

Evolve the schema on a tiledb::Array.

See examples for more usage details.

Example:

// Drop attribute "a1" from the schema of the array at "my_test_array"
tiledb::Context ctx;
tiledb::ArraySchemaEvolution evolution(ctx);
evolution.drop_attribute("a1");
evolution.array_evolve("my_test_array");

Public Functions

inline ArraySchemaEvolution(const Context &context, tiledb_array_schema_evolution_t *evolution)¶

Constructs the array schema evolution with the input C array schema evolution object.

Parameters:

context – TileDB context
evolution – C API array schema evolution object

inline ArraySchemaEvolution(const Context &context)¶

Constructs an array schema evolution object.

Parameters:: context – TileDB context

inline ArraySchemaEvolution &add_attribute(const Attribute &attr)¶

Adds an Attribute to the array schema evolution.

Example:

tiledb::Context ctx;
tiledb::ArraySchemaEvolution schema_evolution(ctx);
schema_evolution.add_attribute(
    tiledb::Attribute::create<int32_t>(ctx, "attr_name"));

Parameters:: attr – The Attribute to add
Returns:: Reference to this ArraySchemaEvolution instance.

inline ArraySchemaEvolution &drop_attribute(const std::string &attribute_name)¶

Drops an attribute.

Example:

tiledb::Context ctx;
tiledb::ArraySchemaEvolution schema_evolution(ctx);
schema_evolution.drop_attribute("attr_name");

Parameters:: attribute_name – The name of the attribute to be dropped
Returns:: Reference to this ArraySchemaEvolution instance.

inline ArraySchemaEvolution &add_enumeration(const Enumeration &enmr)¶

Adds an Enumeration to the array schema evolution.

Example:

tiledb::Context ctx;
tiledb::ArraySchemaEvolution schema_evolution(ctx);
std::vector<std::string> values = {"red", "green", "blue"};
schema_evolution.add_enumeration(
    tiledb::Enumeration::create(ctx, "an_enumeration", values));

Parameters:: enmr – The Enumeration to add.
Returns:: Reference to this ArraySchemaEvolution instance.

inline ArraySchemaEvolution &extend_enumeration(const Enumeration &enmr)¶

Extends an Enumeration during array schema evolution.

Example:

tiledb::Context ctx;
tiledb::Enumeration old_enmr = array->get_enumeration("some_enumeration");
std::vector<std::string> new_values = {"cyan", "magenta", "mauve"};
tiledb::Enumeration new_enmr = old_enmr.extend(new_values);
tiledb::ArraySchemaEvolution schema_evolution(ctx);
schema_evolution.extend_enumeration(new_enmr);

Parameters:: enmr – The Enumeration to extend.
Returns:: Reference to this ArraySchemaEvolution instance.

inline ArraySchemaEvolution &drop_enumeration(const std::string &enumeration_name)¶

Drops an enumeration.

Example:

tiledb::Context ctx;
tiledb::ArraySchemaEvolution schema_evolution(ctx);
schema_evolution.drop_enumeration("enumeration_name");

Parameters:: enumeration_name – The enumeration to be dropped
Returns:: Reference to this ArraySchemaEvolution instance.

inline ArraySchemaEvolution &expand_current_domain(const CurrentDomain &expanded_domain)¶

Expands the current domain during array schema evolution. During tiledb_array_evolve, TileDB enforces that the new current domain expands the existing one rather than contracting it.

Parameters:: expanded_domain – The current domain we want to expand the schema to.

inline void set_timestamp_range(const std::pair<uint64_t, uint64_t> &timestamp_range)¶

Sets timestamp range.

Example:

tiledb::Context ctx;
tiledb::ArraySchemaEvolution schema_evolution(ctx);
uint64_t now = tiledb_timestamp_now_ms();
schema_evolution.set_timestamp_range({now, now});

Parameters:: timestamp_range – The timestamp range to be set

inline ArraySchemaEvolution &array_evolve(const std::string &array_uri)¶

Evolves the schema of an array.

Example:

tiledb::Context ctx;
tiledb::ArraySchemaEvolution schema_evolution(ctx);
schema_evolution.drop_attribute("attr_name");
schema_evolution.array_evolve("test_array_uri");

Parameters:: array_uri – The URI of the array to evolve
Returns:: Reference to this ArraySchemaEvolution instance.

inline std::shared_ptr<tiledb_array_schema_evolution_t> ptr() const¶: Returns a shared pointer to the C TileDB array schema evolution object.

class Group¶

Public Functions

inline Group(const Context &ctx, const std::string &group_uri, tiledb_query_type_t query_type)¶

Constructor. Opens the group for the given query type. The destructor calls the close() method.

Example:

// Open the group for reading
tiledb::Context ctx;
tiledb::Group group(ctx, "s3://bucket-name/group-name", TILEDB_READ);

Parameters:

ctx – TileDB context.
group_uri – The group URI.
query_type – Query type to open the group for.

inline Group(const Context &ctx, const std::string &group_uri, tiledb_query_type_t query_type, const Config &config)¶

Constructor. Sets a config to the group and opens it for the given query type. The destructor calls the close() method.

Example:

// Open the group for reading
tiledb::Context ctx;
tiledb::Config cfg;
cfg["rest.username"] = "user";
cfg["rest.password"] = "pass";
tiledb::Group group(ctx, "s3://bucket-name/group-name", TILEDB_READ, cfg);

Parameters:

ctx – TileDB context.
group_uri – The group URI.
query_type – Query type to open the group for.
config – Configuration parameters.

inline ~Group()¶: Destructor; calls close().

inline void open(tiledb_query_type_t query_type)¶

Opens the group using a query type as input.

The query type indicates that queries created for this Group object will inherit it. In other words, a Group object is opened to receive only one type of query. It can always be closed and reopened with another query type. Several Group objects may also be created and opened with different query types. For instance, one may create and open a group object group_read for reads and another one group_write for writes, and interleave the creation and submission of queries for both.

Example:

// Open the group for writing
tiledb::Group group(ctx, "s3://bucket-name/group-name", TILEDB_WRITE);
// Close and open again for reading.
group.close();
group.open(TILEDB_READ);

Parameters:: query_type – The type of queries the group object will be receiving.
Throws:: TileDBError – if the group is already open or another error occurs.

inline void set_config(const Config &config) const¶

Sets the group config.

Pre:: The group must be closed.

inline Config config() const¶: Retrieves the group config.

inline void close(bool should_throw = true)¶

Closes the group. This must be called directly if you wish to check that any changes to the group were committed. This is automatically called by the destructor but any errors encountered are logged instead of throwing an exception from a destructor.

Example:

tiledb::Group group(ctx, "s3://bucket-name/group-name", TILEDB_READ);
group.close();
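The split between an explicit close() that can throw and a destructor that only logs is a common RAII pattern. The sketch below illustrates it with a hypothetical `Resource` class; it is not the TileDB Group implementation, and details such as treating a second close() as a no-op are assumptions of this sketch.

```cpp
#include <cassert>
#include <iostream>
#include <stdexcept>

// Hypothetical resource, not the TileDB Group class.
class Resource {
 public:
  explicit Resource(bool fail_on_close)
      : fail_on_close_(fail_on_close) {
  }

  // Destructors should not throw, so close errors are only logged here.
  ~Resource() {
    close(false);
  }

  void close(bool should_throw = true) {
    if (!open_)
      return;  // Assumption of this sketch: closing twice is a no-op.
    open_ = false;
    if (!fail_on_close_)
      return;
    if (should_throw)
      throw std::runtime_error("close failed");
    std::cerr << "close failed (logged, not thrown)\n";
  }

  bool is_open() const {
    return open_;
  }

 private:
  bool open_ = true;
  bool fail_on_close_;
};
```

Calling close() explicitly is what gives the caller a chance to observe a failure; relying on the destructor alone means the error can only appear in a log.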

inline bool is_open() const¶: Checks if the group is open.

inline std::string uri() const¶: Returns the group URI.

inline tiledb_query_type_t query_type() const¶: Returns the query type the group was opened with.

inline void put_metadata(const std::string &key, tiledb_datatype_t value_type, uint32_t value_num, const void *value)¶

Puts a metadata key-value item to an open group. The group must be opened in WRITE mode, otherwise the function will error out.

Note

The writes will take effect only upon closing the group.

Parameters:

key – The key of the metadata item to be added. UTF-8 encodings are acceptable.
value_type – The datatype of the value.
value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata.
value – The metadata value in binary form.

inline void delete_group(const std::string &uri, bool recursive = false)¶

Deletes all written data from an open group. The group must be opened in MODIFY_EXCLUSIVE mode, otherwise the function will error out.

Note

If recursive == false, data added to the group will be left as-is.

Parameters:

uri – The address of the group item to be deleted.
recursive – True if all data inside the group is to be deleted.

Post:

This is destructive; the group may not be reopened after delete.

inline void delete_metadata(const std::string &key)¶

Deletes a metadata key-value item from an open group. The group must be opened in WRITE mode, otherwise the function will error out.

Note

The writes will take effect only upon closing the group.

Note

If the key does not exist, this has no effect (i.e., the function will not error out).

Parameters:: key – The key of the metadata item to be deleted.

inline void get_metadata(const std::string &key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)¶

Gets a metadata key-value item from an open group. The group must be opened in READ mode, otherwise the function will error out.

Note

If the key does not exist, then value will be NULL.

Parameters:

key – The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.
value_type – The datatype of the value.
value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.
value – The metadata value in binary form.
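Because the value is returned as an untyped pointer plus an item count, the caller has to copy it into a typed container. The following sketch shows one way to do that; `decode_metadata` is a hypothetical helper, not part of the TileDB API, and it assumes the caller has already checked that value_type corresponds to T.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of TileDB): copy `value_num` items of type T
// out of an untyped metadata value. Returns an empty vector when the value
// pointer is null (missing key or empty value).
template <typename T>
std::vector<T> decode_metadata(const void* value, uint32_t value_num) {
  std::vector<T> out;
  if (value == nullptr || value_num == 0)
    return out;
  out.resize(value_num);
  std::memcpy(out.data(), value, static_cast<std::size_t>(value_num) * sizeof(T));
  return out;
}
```

Copying the value into an owned container avoids any dependence on how long the library keeps the returned pointer valid; the lifetime of that pointer is not specified above.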

inline bool has_metadata(const std::string &key, tiledb_datatype_t *value_type)¶

Checks if key exists in metadata from an open group. The group must be opened in READ mode, otherwise the function will error out.

Note

If the key does not exist, then value_type will not be modified.

Parameters:

key – The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.
value_type – The datatype of the value associated with the key (if any).

Returns:

true if the key exists, else false.

inline uint64_t metadata_num() const¶: Returns the number of metadata items in an open group. The group must be opened in READ mode, otherwise the function will error out.

inline void get_metadata_from_index(uint64_t index, std::string *key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)¶

Gets a metadata item from an open group using an index. The group must be opened in READ mode, otherwise the function will error out.

Parameters:

index – The index used to get the metadata.
key – The metadata key.
value_type – The datatype of the value.
value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.
value – The metadata value in binary form.

inline void add_member(const std::string &uri, const bool &relative, std::optional<std::string> name = std::nullopt, std::optional<tiledb_object_t> type = std::nullopt)¶

Adds a member to a group.

Parameters:

uri – URI of the member to add.
relative – Whether the URI is relative to the group location.
name – Optional name by which the member can be looked up.
type – The type of the member being added, if known in advance.

inline void remove_member(const std::string &name_or_uri)¶

Removes a member from a group.

Parameters:: name_or_uri – Name or URI of member to remove. If the URI is registered multiple times in the group, the name needs to be specified so that the correct one can be removed. Note that if a URI is registered as both a named and unnamed member, the unnamed member will be removed successfully using the URI.

inline bool is_relative(std::string name) const¶

Retrieves the relative flag for a named member.

Parameters:: name – Name of the member whose relative flag is retrieved.

Public Static Functions

static inline void create(const tiledb::Context &ctx, const std::string &uri)¶

Creates a TileDB group.

Example:

tiledb::Group::create(ctx, "s3://bucket-name/group-name");

Parameters:

ctx – TileDB context
uri – URI where group will be created.

static inline void consolidate_metadata(const Context &ctx, const std::string &uri, Config *const config = nullptr)¶

Consolidates the group metadata into a single group metadata file.

Example:

tiledb::Group::consolidate_metadata(ctx, "s3://bucket-name/group-name");

Parameters:

ctx – TileDB context
uri – The URI of the TileDB group to be consolidated.
config – Configuration parameters for the consolidation.

static inline void vacuum_metadata(const Context &ctx, const std::string &uri, Config *const config = nullptr)¶

Cleans up the group metadata.

Example:

tiledb::Group::vacuum_metadata(ctx, "s3://bucket-name/group-name");

Parameters:

ctx – TileDB context
uri – The URI of the TileDB group to vacuum.
config – Configuration parameters for the vacuuming.
