TileDB C++ API Reference

Context

class Context

A TileDB context wraps a TileDB storage manager “instance.” Most objects and functions will require a Context.

Internal error handling is also defined by the Context; the default error handler throws a TileDBError with a specific message.

Example:

tiledb::Context ctx;
// Use ctx when creating other objects:
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);

// Set a custom error handler:
ctx.set_error_handler([](const std::string &msg) {
    std::cerr << msg << std::endl;
});

Public Functions

inline Context()

Constructor. Creates a TileDB Context with default configuration.

Throws:

TileDBError – if construction fails

inline explicit Context(const Config &config)

Constructor. Creates a TileDB context with the given configuration.

Throws:

TileDBError – if construction fails

inline Context(tiledb_ctx_t *ctx, bool own = true)

Constructor. Creates a TileDB context from the given pointer.

Parameters:

own – Defaults to true. If false, disables underlying cleanup upon destruction.

Throws:

TileDBError – if construction fails

inline void handle_error(int rc) const

Error handler for the TileDB C API calls. Throws an exception in case of error.

Parameters:

rc – If != TILEDB_OK, calls error handler

inline std::string get_last_error_message() const noexcept

Get the message of the last error that occurred.

Returns:

The last error message

inline std::shared_ptr<tiledb_ctx_t> ptr() const

Returns the C TileDB context object.

inline Context &set_error_handler(const std::function<void(const std::string&)> &fn)

Sets the error handler callback. If none is set, the default_error_handler is used. The callback accepts an error message.

Parameters:

fn – Error handler callback function

Returns:

Reference to this Context
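The relationship between handle_error, set_error_handler, and default_error_handler can be pictured with a small stand-in class. This is only a sketch of the callback pattern and does not use TileDB: MiniContext, kOk, and kErr are hypothetical names, and the message text is made up (the real Context obtains its message from the C API).

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for the C API return codes; not the real constants.
constexpr int kOk = 0;
constexpr int kErr = -1;

// Simplified model of a context that routes failures to a callback.
class MiniContext {
 public:
  // Until a custom handler is set, errors are turned into exceptions.
  MiniContext() : handler_(default_error_handler) {}

  // Replaces the callback; returns *this so calls can be chained.
  MiniContext& set_error_handler(std::function<void(const std::string&)> fn) {
    handler_ = std::move(fn);
    return *this;
  }

  // Invokes the callback only when the return code signals failure.
  void handle_error(int rc) const {
    if (rc != kOk)
      handler_("operation failed with code " + std::to_string(rc));
  }

  // Default behaviour: throw with the error message.
  static void default_error_handler(const std::string& msg) {
    throw std::runtime_error(msg);
  }

 private:
  std::function<void(const std::string&)> handler_;
};
```

With this model, a handler that only logs lets execution continue after a failed call, whereas the default handler unwinds the stack. Which of the two is appropriate depends on whether the caller can recover locally.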

inline Config config() const

Returns a copy of the configuration of the context.

inline bool is_supported_fs(tiledb_filesystem_t fs) const

Return true if the given filesystem backend is supported.

Example:

tiledb::Context ctx;
bool s3_supported = ctx.is_supported_fs(TILEDB_S3);

Parameters:

fs – Filesystem to check

inline void cancel_tasks() const

Cancels all background or async tasks associated with this context.

inline void set_tag(const std::string &key, const std::string &value)

Sets a string key/value tag on the context.

inline std::string stats()

Returns a JSON-formatted string of the stats.

Public Static Functions

static inline void default_error_handler(const std::string &msg)

The default error handler callback.

Throws:

TileDBError – with the error message

Config

class Config

Carries configuration parameters for a context.

Example:

Config conf;
conf["vfs.s3.region"] = "us-east-1";
conf["vfs.s3.use_virtual_addressing"] = "true";
Context ctx(conf);
// array operations with ctx

Public Functions

inline explicit Config(const std::string &filename)

Constructor that takes as input a filename (URI) that stores the config parameters. The file must have the following (text) format:

{parameter} {value}

Anything following a # character is considered a comment and, thus, is ignored.

See Config::set for the various TileDB config parameters and allowed values.

Parameters:

filename – The name of the file where the parameters will be read from.
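The file format described above can be illustrated with a short parser. This is a sketch of the format only, not TileDB's own reader: parse_config_text is a hypothetical helper, and it assumes that parameter names and values contain no whitespace.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: parses "{parameter} {value}" lines and ignores
// everything after a '#'. TileDB's own parser may differ in details such
// as whitespace handling and error reporting.
inline std::map<std::string, std::string> parse_config_text(
    const std::string& text) {
  std::map<std::string, std::string> params;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    // Drop the comment part of the line, if any.
    const auto hash = line.find('#');
    if (hash != std::string::npos)
      line.erase(hash);
    // The first token is the parameter name, the second is its value.
    std::istringstream fields(line);
    std::string name, value;
    if (fields >> name >> value)
      params[name] = value;
  }
  return params;
}
```

Lines that are empty, or that hold only a comment, produce no entry because the two-token extraction fails for them.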

inline explicit Config(tiledb_config_t **config)

Constructor from a C config object.

inline explicit Config(const std::map<std::string, std::string> &config)

Constructor that takes as input an STL map that stores the config parameters.

Parameters:

config – Map of parameter names to values.

inline explicit Config(const std::unordered_map<std::string, std::string> &config)

Constructor that takes as input an STL unordered_map that stores the config parameters.

Parameters:

config – Map of parameter names to values.

inline void save_to_file(const std::string filename)

Saves the config parameters to a (local) text file.

inline bool operator==(const Config &rhs) const

Compares configs for equality.

inline bool operator!=(const Config &rhs) const

Compares configs for inequality.

inline std::shared_ptr<tiledb_config_t> ptr() const

Returns the pointer to the TileDB C config object.

inline Config &set(const std::string &param, const std::string &value)

Sets a config parameter.

  • sm.allow_separate_attribute_writes Experimental Allow separate attribute write queries. Default: false

  • sm.allow_updates_experimental Experimental Allow update queries. Experimental for testing purposes, do not use. Default: false

  • sm.dedup_coords If true, cells with duplicate coordinates will be removed during sparse fragment writes. Note that ties during deduplication are broken arbitrarily. Also note that this check means that it will take longer to perform the write operation. Default: false

  • sm.check_coord_dups This is applicable only if sm.dedup_coords is false. If true, an error will be thrown if there are cells with duplicate coordinates during sparse fragment writes. If false and there are duplicates, the duplicates will be written without errors. Note that this check is much lighter weight than the coordinate deduplication check enabled by sm.dedup_coords. Default: true

  • sm.check_coord_oob If true, an error will be thrown if there are cells with coordinates lying outside the domain during sparse fragment writes. Default: true

  • sm.read_range_oob If error, an error is thrown when a read range extends outside the dimension’s domain. If warn, the ranges will be capped at the dimension’s domain and a warning logged. Default: warn

  • sm.check_global_order Checks if the coordinates obey the global array order. Applicable only to sparse writes in global order. Default: true

  • sm.merge_overlapping_ranges_experimental If true, merge overlapping Subarray ranges. Else, overlapping ranges will not be merged and multiplicities will be returned. Experimental for testing purposes, do not use. Default: true

  • sm.enable_signal_handlers Determines whether or not TileDB will install signal handlers. Default: true

  • sm.compute_concurrency_level Upper-bound on number of threads to allocate for compute-bound tasks. Default: # cores

  • sm.io_concurrency_level Upper-bound on number of threads to allocate for IO-bound tasks. Default: # cores

  • sm.vacuum.mode The vacuuming mode, one of commits (remove only consolidated commit files), fragments (remove only consolidated fragments), fragment_meta (remove only consolidated fragment metadata), array_meta (remove only consolidated array metadata files), or group_meta (remove only consolidated group metadata). Default: fragments

  • sm.consolidation.mode The consolidation mode, one of commits (consolidate all commit files), fragments (consolidate all fragments), fragment_meta (consolidate only fragment metadata footers to a single file), array_meta (consolidate array metadata only), or group_meta (consolidate group metadata only). Default: “fragments”

  • sm.consolidation.amplification The factor by which the size of the dense fragment resulting from consolidating a set of fragments (containing at least one dense fragment) can be amplified. This is important when the union of the non-empty domains of the fragments to be consolidated have a lot of empty cells, which the consolidated fragment will have to fill with the special fill value (since the resulting fragment is dense). Default: 1.0

  • sm.consolidation.buffer_size Deprecated The size (in bytes) of the attribute buffers used during consolidation. Default: 50,000,000

  • sm.consolidation.max_fragment_size Experimental The size (in bytes) of the maximum on-disk fragment size that will be created by consolidation. When it is reached, consolidation will continue the operation in a new fragment. The result will be multiple fragments, but with separate MBRs.

  • sm.consolidation.steps The number of consolidation steps to be performed when executing the consolidation algorithm. Default: UINT32_MAX

  • sm.consolidation.purge_deleted_cells Experimental Purge deleted cells from the consolidated fragment or not. Default: false

  • sm.consolidation.step_min_frags The minimum number of fragments to consolidate in a single step. Default: UINT32_MAX

  • sm.consolidation.step_max_frags The maximum number of fragments to consolidate in a single step. Default: UINT32_MAX

  • sm.consolidation.step_size_ratio The size ratio that two (“adjacent”) fragments must satisfy to be considered for consolidation in a single step. Default: 0.0

  • sm.consolidation.timestamp_start Experimental When set, an array will be consolidated between this value and sm.consolidation.timestamp_end (inclusive). Only for fragments and array_meta consolidation mode. Default: 0

  • sm.consolidation.timestamp_end Experimental When set, an array will be consolidated between sm.consolidation.timestamp_start and this value (inclusive). Only for fragments and array_meta consolidation mode. Default: UINT64_MAX

  • sm.encryption_key The key for encrypted arrays. Default: “”

  • sm.encryption_type The type of encryption used for encrypted arrays. Default: “NO_ENCRYPTION”

  • sm.enumerations_max_size

    Maximum in-memory size for an enumeration. If the enumeration is var sized, the size will include the data and the offsets.

    Default: 10MB

  • sm.enumerations_max_total_size

    Maximum in-memory size for all enumerations. If the enumeration is var sized, the size will include the data and the offsets.

    Default: 50MB

  • sm.max_tile_overlap_size

    Maximum size for the tile overlap structure which holds information about which tiles are covered by ranges. Only used in dense reads and legacy reads.

    Default: 300MB

  • sm.memory_budget The memory budget for tiles of fixed-sized attributes (or offsets for var-sized attributes) to be fetched during reads. Default: 5GB

  • sm.memory_budget_var The memory budget for tiles of var-sized attributes to be fetched during reads. Default: 10GB

  • sm.var_offsets.bitsize The size of offsets in bits to be used for offset buffers of var-sized attributes. Default: 64

  • sm.var_offsets.extra_element Add an extra element to the end of the offsets buffer of var-sized attributes which will point to the end of the values buffer. Default: false

  • sm.var_offsets.mode The offsets format (bytes or elements) to be used for var-sized attributes. Default: bytes

  • sm.query.dense.reader Which reader to use for dense queries. “refactored” or “legacy”. Default: refactored

  • sm.query.dense.qc_coords_mode

    Dense configuration that allows returning only the coordinates of the cells that match a query condition, without any attribute data.

    Default: “false”

  • sm.query.sparse_global_order.reader Which reader to use for sparse global order queries. “refactored” or “legacy”. Default: refactored

  • sm.query.sparse_unordered_with_dups.reader Which reader to use for sparse unordered with dups queries. “refactored” or “legacy”. Default: refactored

  • sm.skip_checksum_validation Skip checksum validation on reads for the md5 and sha256 filters. Default: “false”

  • sm.mem.malloc_trim Should malloc_trim be called on context and query destruction? This might reduce residual memory usage. Default: true

  • sm.mem.tile_upper_memory_limit Experimental This is the upper memory limit that is used when loading tiles. For now it is only used in the dense reader but will be eventually used by all readers. The readers using this value will use it as a way to limit the amount of tile data that is brought into memory at once so that we don’t incur performance penalties during memory movement operations. It is a soft limit: if a single tile does not fit within this limit, the tile is still loaded as long as it fits within sm.mem.total_budget. Default: 1GB

  • sm.mem.total_budget Memory budget for readers and writers. Default: 10GB

  • sm.mem.consolidation.buffers_weight Weight used to split sm.mem.total_budget and assign to the consolidation buffers. The budget is split across 3 values, sm.mem.consolidation.buffers_weight, sm.mem.consolidation.reader_weight and sm.mem.consolidation.writer_weight. Default: 1

  • sm.mem.consolidation.reader_weight Weight used to split sm.mem.total_budget and assign to the reader query. The budget is split across 3 values, sm.mem.consolidation.buffers_weight, sm.mem.consolidation.reader_weight and sm.mem.consolidation.writer_weight. Default: 3

  • sm.mem.consolidation.writer_weight Weight used to split sm.mem.total_budget and assign to the writer query. The budget is split across 3 values, sm.mem.consolidation.buffers_weight, sm.mem.consolidation.reader_weight and sm.mem.consolidation.writer_weight. Default: 2

  • sm.mem.reader.sparse_global_order.ratio_coords Ratio of the budget allocated for coordinates in the sparse global order reader. Default: 0.5

  • sm.mem.reader.sparse_global_order.ratio_tile_ranges Ratio of the budget allocated for tile ranges in the sparse global order reader. Default: 0.1

  • sm.mem.reader.sparse_global_order.ratio_array_data Ratio of the budget allocated for array data in the sparse global order reader. Default: 0.1

  • sm.mem.reader.sparse_unordered_with_dups.ratio_coords Ratio of the budget allocated for coordinates in the sparse unordered with duplicates reader. Default: 0.5

  • sm.mem.reader.sparse_unordered_with_dups.ratio_tile_ranges Ratio of the budget allocated for tile ranges in the sparse unordered with duplicates reader. Default: 0.1

  • sm.mem.reader.sparse_unordered_with_dups.ratio_array_data Ratio of the budget allocated for array data in the sparse unordered with duplicates reader. Default: 0.1

  • vfs.read_ahead_size The maximum byte size to read-ahead from the backend. Default: 102400

  • sm.group.timestamp_start The start timestamp used for opening the group. Default: 0

  • sm.group.timestamp_end

    The end timestamp used for opening the group.

    Also used for the write timestamp if set.

    Default: UINT64_MAX

  • sm.partial_tile_offsets_loading Experimental If true tile offsets can be partially loaded and unloaded by the readers. Default: false

  • sm.fragment_info.preload_mbrs If true MBRs will be loaded at the same time as the rest of fragment info, otherwise they will be loaded lazily when some info related to MBRs is requested by the user. Default: false

  • ssl.ca_file

    The path to CA certificate to use when validating server certificates. Applies to all SSL/TLS connections.

    This option might be ignored on platforms that have native certificate stores like Windows.

    Default: “”

  • ssl.ca_path

    The path to a directory with CA certificates to use when validating server certificates. Applies to all SSL/TLS connections.

    This option might be ignored on platforms that have native certificate stores like Windows.

    Default: “”

  • ssl.verify

    Whether to verify the server’s certificate. Applies to all SSL/TLS connections.

    Disabling verification is insecure and should only be used for testing purposes.

    Default: true

  • vfs.read_ahead_cache_size The total maximum size of the read-ahead cache, which is an LRU. Default: 10485760

  • vfs.log_operations Enables logging all VFS operations in trace mode. Default: false

  • vfs.min_parallel_size The minimum number of bytes in a parallel VFS operation (except parallel S3 writes, which are controlled by vfs.s3.multipart_part_size). Default: 10MB

  • vfs.max_batch_size The maximum number of bytes in a VFS read operation. Default: 100MB

  • vfs.min_batch_size The minimum number of bytes in a VFS read operation. Default: 20MB

  • vfs.min_batch_gap The minimum number of bytes between two VFS read batches. Default: 500KB

  • vfs.read_logging_mode Log read operations at varying levels of verbosity. Default: “”. Possible values:

    • An empty string disables read logging.

    • Log each fragment read.

    • Log each individual fragment file read.

    • Log all files read.

    • Log all files with offset and length parameters.

    • Log all files with offset and length parameters on every read, not just the first read. On large arrays the read cache may get large, so this trades off RAM usage against increased log verbosity.

  • vfs.file.posix_file_permissions Permissions to use for posix file system with file creation. Default: 644

  • vfs.file.posix_directory_permissions Permissions to use for posix file system with directory creation. Default: 755

  • vfs.azure.storage_account_name Set the name of the Azure Storage account to use. Default: “”

  • vfs.azure.storage_account_key Set the Shared Key to authenticate to Azure Storage. Default: “”

  • vfs.azure.storage_sas_token Set the Azure Storage SAS (shared access signature) token to use. If this option is set along with vfs.azure.blob_endpoint, the latter must not include a SAS token. Default: “”

  • vfs.azure.blob_endpoint

    Set the default Azure Storage Blob endpoint.

    If not specified, it will take a value of <account-name>.blob.core.windows.net, where <account-name> is the value of the vfs.azure.storage_account_name option. This means that at least one of these two options must be set (or both if shared key authentication is used). Default: “”

  • vfs.azure.block_list_block_size The block size (in bytes) used in Azure blob block list writes. Any uint64_t value is acceptable. Note: vfs.azure.block_list_block_size * vfs.azure.max_parallel_ops bytes will be buffered before issuing block uploads in parallel. Default: “5242880”

  • vfs.azure.max_parallel_ops The maximum number of Azure backend parallel operations. Default: sm.io_concurrency_level

  • vfs.azure.use_block_list_upload Determines if the Azure backend can use chunked block uploads. Default: “true”

  • vfs.azure.max_retries The maximum number of times to retry an Azure network request. Default: 5

  • vfs.azure.retry_delay_ms The minimum permissible delay between Azure network request retry attempts, in milliseconds. Default: 800

  • vfs.azure.max_retry_delay_ms The maximum permissible delay between Azure network request retry attempts, in milliseconds. Default: 60000

  • vfs.gcs.project_id Set the GCS project ID in which new buckets are created. Not required unless you are going to use the VFS to create buckets. Default: “”

  • vfs.gcs.service_account_key Experimental Set the JSON string with GCS service account key. Takes precedence over vfs.gcs.workload_identity_configuration if both are specified. If neither is specified, Application Default Credentials will be used. Default: “”

  • vfs.gcs.workload_identity_configuration Experimental Set the JSON string with Workload Identity Federation configuration. vfs.gcs.service_account_key takes precedence over this if both are specified. If neither is specified, Application Default Credentials will be used. Default: “”

  • vfs.gcs.impersonate_service_account Experimental Set the GCS service account to impersonate. A chain of impersonated accounts can be formed by specifying many service accounts, separated by a comma. Default: “”

  • vfs.gcs.multi_part_size The part size (in bytes) used in GCS multi part writes. Any uint64_t value is acceptable. Note: vfs.gcs.multi_part_size * vfs.gcs.max_parallel_ops bytes will be buffered before issuing part uploads in parallel. Default: “5242880”

  • vfs.gcs.max_parallel_ops The maximum number of GCS backend parallel operations. Default: sm.io_concurrency_level

  • vfs.gcs.use_multi_part_upload Determines if the GCS backend can use chunked part uploads. Default: “true”

  • vfs.gcs.request_timeout_ms The maximum amount of time to retry network requests to GCS. Default: “3000”

  • vfs.gcs.max_direct_upload_size The maximum size in bytes of a direct upload to GCS. Ignored if vfs.gcs.use_multi_part_upload is set to true. Default: “10737418240”

  • vfs.s3.region

    The S3 region, if S3 is enabled.

    If empty, the region will be determined by the AWS SDK using sources such as environment variables, profile configuration, or instance metadata.

    Default: “”

  • vfs.s3.aws_access_key_id Set the AWS_ACCESS_KEY_ID. Default: “”

  • vfs.s3.aws_secret_access_key Set the AWS_SECRET_ACCESS_KEY. Default: “”

  • vfs.s3.aws_session_token Set the AWS_SESSION_TOKEN. Default: “”

  • vfs.s3.aws_role_arn Determines the role that we want to assume. Set the AWS_ROLE_ARN. Default: “”

  • vfs.s3.aws_external_id Third party access ID to your resources when assuming a role. Set the AWS_EXTERNAL_ID. Default: “”

  • vfs.s3.aws_load_frequency Session time limit when assuming a role. Set the AWS_LOAD_FREQUENCY. Default: “”

  • vfs.s3.aws_session_name (Optional) session name when assuming a role. Can be used for tracing and bookkeeping. Set the AWS_SESSION_NAME. Default: “”

  • vfs.s3.scheme The S3 scheme (http or https), if S3 is enabled. Default: https

  • vfs.s3.endpoint_override The S3 endpoint, if S3 is enabled. Default: “”

  • vfs.s3.use_virtual_addressing The S3 use of virtual addressing (true or false), if S3 is enabled. Default: true

  • vfs.s3.skip_init Skip Aws::InitAPI for the S3 layer (true or false) Default: false

  • vfs.s3.use_multipart_upload The S3 use of multi-part upload requests (true or false), if S3 is enabled. Default: true

  • vfs.s3.max_parallel_ops The maximum number of S3 backend parallel operations. Default: sm.io_concurrency_level

  • vfs.s3.multipart_part_size The part size (in bytes) used in S3 multipart writes. Any uint64_t value is acceptable. Note: vfs.s3.multipart_part_size * vfs.s3.max_parallel_ops bytes will be buffered before issuing multipart uploads in parallel. Default: 5MB

  • vfs.s3.ca_file Path to SSL/TLS certificate file to be used by cURL for S3 HTTPS encryption. Follows cURL conventions: https://curl.haxx.se/docs/manpage.html Default: “”

  • vfs.s3.ca_path Path to SSL/TLS certificate directory to be used by cURL for S3 HTTPS encryption. Follows cURL conventions: https://curl.haxx.se/docs/manpage.html Default: “”

  • vfs.s3.connect_timeout_ms The connection timeout in ms. Any long value is acceptable. Default: 10800

  • vfs.s3.connect_max_tries The maximum tries for a connection. Any long value is acceptable. Default: 5

  • vfs.s3.connect_scale_factor The scale factor for exponential backoff when connecting to S3. Any long value is acceptable. Default: 25

  • vfs.s3.custom_headers.* (Optional) Prefix for custom headers on S3 requests. For each custom header, use “vfs.s3.custom_headers.header_key” = “header_value”. No default.

  • vfs.s3.logging_level The AWS SDK logging level. This is a process-global setting. The configuration of the most recently constructed context will set process state. Log files are written to the process working directory. Default: “Off”

  • vfs.s3.request_timeout_ms The request timeout in ms. Any long value is acceptable. Default: 3000

  • vfs.s3.requester_pays The requester pays for the S3 access charges. Default: false

  • vfs.s3.proxy_host The S3 proxy host. Default: “”

  • vfs.s3.proxy_port The S3 proxy port. Default: 0

  • vfs.s3.proxy_scheme The S3 proxy scheme. Default: “http”

  • vfs.s3.proxy_username The S3 proxy username. Note: this parameter is not serialized by tiledb_config_save_to_file. Default: “”

  • vfs.s3.proxy_password The S3 proxy password. Note: this parameter is not serialized by tiledb_config_save_to_file. Default: “”

  • vfs.s3.verify_ssl Enable HTTPS certificate verification. Default: true

  • vfs.s3.no_sign_request Make unauthenticated requests to s3. Default: false

  • vfs.s3.sse The server-side encryption algorithm to use. Supported non-empty values are “aes256” and “kms” (AWS key management service). Default: “”

  • vfs.s3.sse_kms_key_id The server-side encryption key to use if vfs.s3.sse == “kms” (AWS key management service). Default: “”

  • vfs.s3.storage_class The storage class to use for the newly uploaded S3 objects. The set of accepted values is found in the Aws::S3::Model::StorageClass enumeration. “NOT_SET” “STANDARD” “REDUCED_REDUNDANCY” “STANDARD_IA” “ONEZONE_IA” “INTELLIGENT_TIERING” “GLACIER” “DEEP_ARCHIVE” “OUTPOSTS” “GLACIER_IR” “SNOW” “EXPRESS_ONEZONE” Default: “NOT_SET”

  • vfs.s3.bucket_canned_acl Names of values found in Aws::S3::Model::BucketCannedACL enumeration. “NOT_SET” “private_” “public_read” “public_read_write” “authenticated_read” Default: “NOT_SET”

  • vfs.s3.object_canned_acl Names of values found in Aws::S3::Model::ObjectCannedACL enumeration. (The first 5 are the same as for “vfs.s3.bucket_canned_acl”.) “NOT_SET” “private_” “public_read” “public_read_write” “authenticated_read” (The following three items are found only in Aws::S3::Model::ObjectCannedACL.) “aws_exec_read” “owner_read” “bucket_owner_full_control” Default: “NOT_SET”

  • vfs.s3.config_source Force S3 SDK to only load config options from a set source. The supported options are auto (TileDB config options are considered first, then SDK-defined precedence: env vars, config files, ec2 metadata), config_files (forces SDK to only consider options found in aws config files), sts_profile_with_web_identity (force SDK to consider assume roles/sts from config files with support for web tokens, commonly used by EKS/ECS). Default: auto

  • vfs.s3.install_sigpipe_handler When set to true, the S3 SDK uses a handler that ignores SIGPIPE signals. Default: “true”

  • vfs.hdfs.name_node_uri Name node for HDFS. Default: “”

  • vfs.hdfs.username HDFS username. Default: “”

  • vfs.hdfs.kerb_ticket_cache_path HDFS kerb ticket cache path. Default: “”

  • config.env_var_prefix Prefix of environmental variables for reading configuration parameters. Default: “TILEDB_”

  • config.logging_level The logging level configured, possible values: “0”: fatal, “1”: error, “2”: warn, “3”: info, “4”: debug, “5”: trace. Default: “1” if the --enable-verbose bootstrap flag is provided, “0” otherwise

  • config.logging_format The logging format configured (DEFAULT or JSON) Default: “DEFAULT”

  • rest.server_address URL for REST server to use for remote arrays. Default: “https://api.tiledb.com”

  • rest.server_serialization_format Serialization format to use for remote array requests (CAPNP or JSON). Default: “CAPNP”

  • rest.username Username for login to REST server. Default: “”

  • rest.password Password for login to REST server. Default: “”

  • rest.token Authentication token for REST server (used instead of username/password). Default: “”

  • rest.resubmit_incomplete If true, incomplete queries received from server are automatically resubmitted before returning to user control. Default: “true”

  • rest.ignore_ssl_validation Have curl ignore ssl peer and host validation for REST server. Default: false

  • rest.creation_access_credentials_name The name of the registered access key to use for creation of the REST server. Default: no default set

  • rest.retry_http_codes CSV list of HTTP status codes for which a REST request is automatically retried. Default: “503”

  • rest.retry_count Number of times to retry failed REST requests. Default: 25

  • rest.retry_initial_delay_ms Initial delay in milliseconds to wait until retrying a REST request. Default: 500

  • rest.retry_delay_factor The delay factor to exponentially wait until further retries of a failed REST request. Default: 1.25

  • rest.curl.retry_errors If true, any curl requests that returned an error will be retried. Default: true

  • rest.curl.verbose

    Set curl to run in verbose mode for REST requests. curl will print to stdout with this option.

    Default: false

  • rest.curl.tcp_keepalive Set curl to use TCP keepalive for REST requests. Default: true

  • rest.load_metadata_on_array_open If true, array metadata will be loaded and sent to server together with the open array. Default: true

  • rest.load_non_empty_domain_on_array_open If true, array non empty domain will be loaded and sent to server together with the open array. Default: true

  • rest.load_enumerations_on_array_open If true, enumerations will be loaded for the latest array schema and sent to server together with the open array. Default: false

  • rest.load_enumerations_on_array_open_all_schemas If true, enumerations will be loaded for all array schemas and sent to server together with the open array. Default: false

  • rest.use_refactored_array_open If true, the new REST routes and APIs for opening an array will be used. Default: true

  • rest.use_refactored_array_open_and_query_submit If true, the new REST routes and APIs for opening an array and submitting a query will be used. Default: true

  • rest.curl.buffer_size Set curl buffer size for REST requests. Default: 524288 (512KB)

  • rest.capnp_traversal_limit CAPNP traversal limit used in the deserialization of messages (bytes). Default: 2147483648 (2GB)

  • rest.custom_headers.* (Optional) Prefix for custom headers on REST requests. For each custom header, use “rest.custom_headers.header_key” = “header_value”. No default.

  • rest.payer_namespace The namespace that should be charged for the request. Default: no default set

  • filestore.buffer_size Specifies the size in bytes of the internal buffers used in the filestore API. The size should be bigger than the minimum tile size filestore currently supports, which is currently 1024 bytes. Default: 100MB
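The three sm.mem.consolidation.*_weight parameters in the list above divide sm.mem.total_budget between the consolidation buffers, the reader query, and the writer query. The sketch below shows one way such a split could be computed. It assumes a simple proportional split (share = total × weight / sum of weights); the list does not state the exact formula, and ConsolidationBudget and split_budget are hypothetical names, not part of the TileDB API.

```cpp
#include <cstdint>

// Hypothetical result type: bytes assigned to each consolidation component.
struct ConsolidationBudget {
  std::uint64_t buffers;
  std::uint64_t reader;
  std::uint64_t writer;
};

// Assumed proportional split of a total budget by three integer weights.
// Integer division means a few bytes of the total may remain unassigned.
inline ConsolidationBudget split_budget(
    std::uint64_t total,
    std::uint64_t buffers_weight,
    std::uint64_t reader_weight,
    std::uint64_t writer_weight) {
  const std::uint64_t sum = buffers_weight + reader_weight + writer_weight;
  if (sum == 0)
    return {0, 0, 0};  // nothing to split by
  const std::uint64_t unit = total / sum;  // bytes per unit of weight
  return {buffers_weight * unit, reader_weight * unit, writer_weight * unit};
}
```

With the default weights listed above (1, 3, and 2), the reader query would receive half of the total budget under this assumption.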

inline std::string get(const std::string &param) const

Get a parameter from the configuration by key.

Parameters:

param – Name of configuration parameter

Throws:

TileDBError – if the parameter does not exist

Returns:

Value of configuration parameter

inline bool contains(const std::string_view &param) const

Check if a configuration parameter exists.

Parameters:

param – Name of configuration parameter

Returns:

true if the parameter exists, false otherwise

inline impl::ConfigProxy operator[](const std::string &param)

Operator that enables setting parameters with [].

Example:

Config conf;
conf["vfs.s3.region"] = "us-east-1";
conf["vfs.s3.use_virtual_addressing"] = "true";
Context ctx(conf);

Parameters:

param – Name of parameter to set

Returns:

“Proxy” object supporting assignment.

inline Config &unset(const std::string &param)

Resets a config parameter to its default value.

Parameters:

param – Name of parameter

Returns:

Reference to this Config instance

inline iterator begin(const std::string &prefix)

Iterate over params starting with a prefix.

Example:

tiledb::Config config;
for (auto it = config.begin("vfs"), ite = config.end(); it != ite; ++it) {
  std::string name = it->first, value = it->second;
}

Parameters:

prefix – Prefix to iterate over

Returns:

iterator

inline iterator begin()

Iterate over all params.

Example:

tiledb::Config config;
for (auto it = config.begin(), ite = config.end(); it != ite; ++it) {
  std::string name = it->first, value = it->second;
}

Returns:

iterator

inline iterator end()

End iterator.

Public Static Functions

static inline void free(tiledb_config_t *config)

Wrapper function for freeing a config C object.

Exceptions

struct TileDBError : public std::runtime_error

Exception indicating a TileDB error.

Subclassed by tiledb::AttributeError, tiledb::SchemaMismatch, tiledb::TypeError

struct TypeError : public tiledb::TileDBError

Exception indicating a mismatch between a static and runtime type.

Subclassed by tiledb::FilterOptionTypeError< Expected, Actual >

struct SchemaMismatch : public tiledb::TileDBError

Exception indicating the requested operation does not match the array schema.

struct AttributeError : public tiledb::TileDBError

Error related to attributes
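Because the exception types above form an inheritance hierarchy, the order of catch clauses decides which handler runs. The sketch below mirrors that hierarchy with local stand-in types in a hypothetical sketch namespace, so it compiles without TileDB; classify is likewise a hypothetical helper, not part of the API.

```cpp
#include <stdexcept>
#include <string>

namespace sketch {

// Local stand-ins that mirror the inheritance described above.
struct TileDBError : std::runtime_error {
  explicit TileDBError(const std::string& msg) : std::runtime_error(msg) {}
};
struct TypeError : TileDBError {
  explicit TypeError(const std::string& msg) : TileDBError(msg) {}
};
struct SchemaMismatch : TileDBError {
  explicit SchemaMismatch(const std::string& msg) : TileDBError(msg) {}
};

// Runs fn and reports which catch clause handled its exception.
template <typename Fn>
std::string classify(Fn&& fn) {
  try {
    fn();
  } catch (const TypeError&) {  // most derived type first
    return "type";
  } catch (const TileDBError&) {  // any other error from the hierarchy
    return "tiledb";
  } catch (const std::exception&) {  // anything else from the standard library
    return "other";
  }
  return "none";
}

}  // namespace sketch
```

If the TileDBError clause were listed first, it would also catch TypeError, and the more specific handler would never run.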

Dimension

class Dimension

Describes one dimension of an Array. The dimension consists of a type, lower and upper bound, and tile-extent describing the memory ordering. Dimensions are added to a Domain.

Example:

tiledb::Context ctx;
tiledb::Domain domain(ctx);
// Create a dimension with inclusive domain [0,1000] and tile extent 100.
domain.add_dimension(tiledb::Dimension::create<int32_t>(ctx, "d", {{0, 1000}}, 100));

Note: as laid out in the Storage Format, the following Datatypes are not valid for Dimension: TILEDB_CHAR, TILEDB_BLOB, TILEDB_GEOM_WKB, TILEDB_GEOM_WKT, TILEDB_BOOL, TILEDB_STRING_UTF8, TILEDB_STRING_UTF16, TILEDB_STRING_UTF32, TILEDB_STRING_UCS2, TILEDB_STRING_UCS4, TILEDB_ANY
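To make the relationship between an inclusive domain and a tile extent concrete, the following sketch computes how many tiles cover a dimension and which tile holds a given coordinate. It assumes an integer dimension whose tiles are aligned to the lower bound of the domain; num_tiles and tile_index are hypothetical helpers, not part of the TileDB API.

```cpp
#include <cstdint>

// Number of tiles needed to cover the inclusive domain [lo, hi].
inline std::int64_t num_tiles(
    std::int64_t lo, std::int64_t hi, std::int64_t extent) {
  const std::int64_t cells = hi - lo + 1;  // both bounds are included
  return (cells + extent - 1) / extent;    // ceiling division
}

// Zero-based index of the tile that contains coordinate c.
inline std::int64_t tile_index(
    std::int64_t lo, std::int64_t extent, std::int64_t c) {
  return (c - lo) / extent;
}
```

For the dimension in the example above, num_tiles(0, 1000, 100) evaluates to 11: the domain holds 1001 cells, so the last tile contains only coordinate 1000.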

Public Functions

inline unsigned cell_val_num() const

Returns number of values of one cell on this dimension. For variable-sized dimensions returns TILEDB_VAR_NUM.

inline Dimension &set_cell_val_num(unsigned num)

Sets the number of values per coordinate.

inline FilterList filter_list() const

Returns a copy of the FilterList of the dimension. To change the filter list, use set_filter_list().

inline Dimension &set_filter_list(const FilterList &filter_list)

Sets the dimension filter list, which is an ordered list of filters that will be used to process and/or transform the coordinate data (such as compression).

inline const std::string name() const

Returns the name of the dimension.

inline tiledb_datatype_t type() const

Returns the dimension datatype.

template<typename T>
inline std::pair<T, T> domain() const

Returns the domain of the dimension.

Template Parameters:

T – Domain datatype

Returns:

Pair of [lower, upper] inclusive bounds.
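Since both bounds are inclusive, a domain of [0, 1000] holds 1001 coordinates, not 1000. The following sketch works through that arithmetic and the resulting tile count for an integer dimension. It is plain C++ with hypothetical helper names, and it assumes tiles start at the lower bound and that the last tile may be only partially covered by the domain.

```cpp
#include <cassert>
#include <cstdint>

// Number of coordinates in the inclusive range [lower, upper].
// Assumes the difference fits in a signed 64-bit integer.
std::uint64_t cells_in_range(std::int64_t lower, std::int64_t upper) {
  assert(lower <= upper);
  return static_cast<std::uint64_t>(upper - lower) + 1;
}

// Number of tiles needed to cover [lower, upper] with the given extent,
// assuming tiles start at `lower` and the last tile may be partial.
std::uint64_t tiles_in_range(std::int64_t lower, std::int64_t upper,
                             std::uint64_t extent) {
  assert(extent > 0);
  const std::uint64_t cells = cells_in_range(lower, upper);
  return (cells + extent - 1) / extent;  // Ceiling division.
}
```

For the domain [0, 1000] with tile extent 100 used in the examples, this gives 1001 coordinates spread over 11 tiles.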

inline std::string domain_to_str() const

Returns a string representation of the domain.

Throws:

TileDBError – if the domain cannot be stringified (TILEDB_ANY)

template<typename T>
inline T tile_extent() const

Returns the tile extent of the dimension.

inline std::string tile_extent_to_str() const

Returns a string representation of the extent.

Throws:

TileDBError – if the extent cannot be stringified (TILEDB_ANY)

inline std::shared_ptr<tiledb_dimension_t> ptr() const

Returns a shared pointer to the C TileDB dimension object.

TILEDB_DEPRECATED inline void dump(FILE *out = stdout) const

Dumps information about the dimension in an ASCII representation to an output.

Parameters:

out – (Optional) File to dump output to. Defaults to stdout.

Public Static Functions

template<typename T>
static inline Dimension create(const Context &ctx, const std::string &name, const std::array<T, 2> &domain, T extent)

Factory function for creating a new dimension with datatype T.

Example:

tiledb::Context ctx;
// Create a dimension with inclusive domain [0,1000] and tile extent 100.
auto dim = Dimension::create<int32_t>(ctx, "d", {{0, 1000}}, 100);

Template Parameters:

T – int, char, etc…

Parameters:
  • ctx – The TileDB context.

  • name – The dimension name.

  • domain – The dimension domain. A pair [lower,upper] of inclusive bounds.

  • extent – The tile extent on the dimension.

Returns:

A new Dimension object.

template<typename T>
static inline Dimension create(const Context &ctx, const std::string &name, const std::array<T, 2> &domain)

Factory function for creating a new dimension with datatype T and without specifying a tile extent.

Example:

tiledb::Context ctx;
// Create a dimension with inclusive domain [0,1000] and no tile extent.
auto dim = Dimension::create<int32_t>(ctx, "d", {{0, 1000}});

Template Parameters:

T – int, char, etc…

Parameters:
  • ctx – The TileDB context.

  • name – The dimension name.

  • domain – The dimension domain. A pair [lower,upper] of inclusive bounds.

Returns:

A new Dimension object.

static inline Dimension create(const Context &ctx, const std::string &name, tiledb_datatype_t datatype, const void *domain, const void *extent)

Factory function for creating a new dimension (non typechecked).

Parameters:
  • ctx – The TileDB context.

  • name – The dimension name.

  • datatype – The dimension datatype.

  • domain – The dimension domain. A pair [lower,upper] of inclusive bounds.

  • extent – The tile extent on the dimension.

Returns:

A new Dimension object.

Domain

class Domain

Represents the domain of an array.

A Domain defines the set of Dimension objects for a given array. The properties of a Domain derive from the underlying dimensions. A Domain is a component of an ArraySchema.

Example:

tiledb::Context ctx;
tiledb::Domain domain(ctx);

// Note the dimension bounds are inclusive.
auto d1 = tiledb::Dimension::create<int>(ctx, "d1", {-10, 10});
auto d2 = tiledb::Dimension::create<uint64_t>(ctx, "d2", {1, 10});
auto d3 = tiledb::Dimension::create<int>(ctx, "d3", {-100, 100});

domain.add_dimension(d1);
domain.add_dimension(d2); // Throws error, all dims must be same type
domain.add_dimension(d3);

domain.cell_num(); // (10 - -10 + 1) * (100 - -100 + 1) = 4221 max cells
domain.type(); // TILEDB_INT32, determined from the dimensions
domain.ndim(); // 2, d1 and d3 (d2 was rejected above)

tiledb::ArraySchema schema(ctx, TILEDB_DENSE);
schema.set_domain(domain); // Set the array's domain

Note

Dimensions must use signed or unsigned integral types; floating-point types are additionally allowed for sparse array domains.

Public Functions

inline const Context &context() const

Returns the context that the domain belongs to.

inline uint64_t cell_num() const

Returns the total number of cells in the domain. Throws an exception if the domain type is float32 or float64.

Throws:

TileDBError – if cell_num cannot be computed.
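For integer dimensions, the total cell count is the product of the per-dimension coordinate counts, which is also why it is undefined for real-valued domains. A minimal sketch of that computation follows. Bounds and cell_count are hypothetical names, and the sketch does no overflow checking.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Inclusive {lower, upper} bounds of one integer dimension.
using Bounds = std::pair<std::int64_t, std::int64_t>;

// Product of the per-dimension coordinate counts. No overflow handling.
std::uint64_t cell_count(const std::vector<Bounds>& dims) {
  std::uint64_t total = 1;
  for (const auto& d : dims)
    total *= static_cast<std::uint64_t>(d.second - d.first) + 1;
  return total;
}
```

With dimensions [-10, 10] and [1, 10] this yields 21 * 10 = 210 cells.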

TILEDB_DEPRECATED inline void dump(FILE *out = stdout) const

Dumps the domain in an ASCII representation to an output.

Parameters:

out – (Optional) File to dump output to. Defaults to stdout.

inline tiledb_datatype_t type() const

Returns the domain type.

inline unsigned ndim() const

Returns the number of dimensions.

inline std::vector<Dimension> dimensions() const

Returns the current set of dimensions in the domain.

inline Dimension dimension(unsigned idx) const

Returns the dimension with the given index.

inline Dimension dimension(const std::string &name) const

Returns the dimension with the given name.

inline Domain &add_dimension(const Dimension &d)

Adds a new dimension to the domain.

Example:

tiledb::Context ctx;
tiledb::Domain domain(ctx);
auto d1 = tiledb::Dimension::create<int>(ctx, "d1", {-10, 10});
domain.add_dimension(d1);

Parameters:

d – Dimension to add

Returns:

Reference to this Domain

template<typename ...Args>
inline Domain &add_dimensions(Args... dims)

Adds multiple dimensions to the domain.

Example:

tiledb::Context ctx;
tiledb::Domain domain(ctx);
auto d1 = tiledb::Dimension::create<int>(ctx, "d1", {-10, 10});
auto d2 = tiledb::Dimension::create<int>(ctx, "d2", {1, 10});
auto d3 = tiledb::Dimension::create<int>(ctx, "d3", {-100, 100});
domain.add_dimensions(d1, d2, d3);

Template Parameters:

Args – Variadic dimension datatype

Parameters:

dims – Dimensions to add

Returns:

Reference to this Domain.

inline bool has_dimension(const std::string &name) const

Checks if the domain has a dimension of the given name.

Parameters:

name – Name of dimension to check for

Returns:

True if the domain has a dimension of the given name.

inline std::shared_ptr<tiledb_domain_t> ptr() const

Returns a shared pointer to the C TileDB domain object.

Attribute

class Attribute

Describes an attribute of an Array cell.

An attribute specifies a name and datatype for a particular value in each array cell. There are 3 supported attribute types:

  • Fundamental types, such as char, int, double, uint64_t, etc.

  • Fixed sized arrays: T[N] or std::array<T, N>, where T is a fundamental type

  • Variable length data: std::string, std::vector<T> where T is a fundamental type

Fixed-size array types using POD types like std::array<T, N> are internally converted to byte-array attributes. E.g. an attribute of type std::array<float, 3> will be created as an attribute of type TILEDB_CHAR with cell_val_num sizeof(std::array<float, 3>).

Therefore, for fixed-length attributes it is recommended to use C-style arrays instead, e.g. float[3] instead of std::array<float, 3>.

Example:

tiledb::Context ctx;
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
auto a3 = tiledb::Attribute::create<float[3]>(ctx, "a3");

// Change compression scheme
tiledb::FilterList filters(ctx);
filters.add_filter({ctx, TILEDB_FILTER_BZIP2});
a1.set_filter_list(filters);

// Add attributes to a schema
tiledb::ArraySchema schema(ctx, TILEDB_DENSE);
schema.add_attributes(a1, a2, a3);
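The mapping from the template type to the number of values per cell described above can be pictured as a small type trait. This is only a sketch under stated assumptions: CellShape and kVarNum are hypothetical names (kVarNum stands in for TILEDB_VAR_NUM), and the trait reproduces the documented behavior rather than TileDB's actual implementation.

```cpp
#include <array>
#include <cstddef>
#include <limits>
#include <string>
#include <type_traits>
#include <vector>

// Placeholder for TILEDB_VAR_NUM; the real constant comes from the TileDB headers.
constexpr unsigned kVarNum = std::numeric_limits<unsigned>::max();

// Fallback: any other trivially copyable T (e.g. std::array<float, 3>)
// is treated as a byte array of sizeof(T) values.
template <typename T, typename Enable = void>
struct CellShape {
  static_assert(std::is_trivially_copyable<T>::value,
                "T must be trivially copyable");
  static constexpr unsigned value = sizeof(T);
};

// Fundamental arithmetic types: one value per cell.
template <typename T>
struct CellShape<T, typename std::enable_if<std::is_arithmetic<T>::value>::type> {
  static constexpr unsigned value = 1;
};

// C-style arrays T[N]: N values per cell.
template <typename T, std::size_t N>
struct CellShape<T[N]> {
  static constexpr unsigned value = N;
};

// Variable-length types: the number of values differs per cell.
template <>
struct CellShape<std::string> {
  static constexpr unsigned value = kVarNum;
};
template <typename T>
struct CellShape<std::vector<T>> {
  static constexpr unsigned value = kVarNum;
};
```

Under this model float[3] keeps its element type and reports 3 values per cell, while std::array<float, 3> degrades to sizeof(std::array<float, 3>) bytes, which is the reason the text recommends C-style arrays for fixed-length attributes.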

Public Functions

inline Attribute(const Context &ctx, const std::string &name, tiledb_datatype_t type)

Construct an attribute with a name and enumerated type. cell_val_num will be set to 1.

Parameters:
  • ctx – TileDB context

  • name – Name of attribute

  • type – Enumerated type of attribute

inline Attribute(const Context &ctx, const std::string &name, tiledb_datatype_t type, const FilterList &filter_list)

Construct an attribute with an enumerated type and given filter list.

inline std::string name() const

Returns the name of the attribute.

inline const Context &context() const

Returns the context that the attribute belongs to.

inline tiledb_datatype_t type() const

Returns the attribute datatype.

inline uint64_t cell_size() const

Returns the size (in bytes) of one cell on this attribute. For variable-sized attributes returns TILEDB_VAR_NUM.

Example:

tiledb::Context ctx;
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
auto a3 = tiledb::Attribute::create<float[3]>(ctx, "a3");
auto a4 = tiledb::Attribute::create<std::array<float, 3>>(ctx, "a4");
a1.cell_size();    // Returns sizeof(int)
a2.cell_size();    // Variable sized attribute, returns TILEDB_VAR_NUM
a3.cell_size();    // Returns 3 * sizeof(float)
a4.cell_size();    // Stored as byte array, returns sizeof(std::array<float, 3>).

inline unsigned cell_val_num() const

Returns number of values of one cell on this attribute. For variable-sized attributes returns TILEDB_VAR_NUM.

Example:

tiledb::Context ctx;
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
auto a3 = tiledb::Attribute::create<float[3]>(ctx, "a3");
auto a4 = tiledb::Attribute::create<std::array<float, 3>>(ctx, "a4");
a1.cell_val_num();   // Returns 1
a2.cell_val_num();   // Variable sized attribute, returns TILEDB_VAR_NUM
a3.cell_val_num();   // Returns 3
a4.cell_val_num();   // Stored as byte array, returns
                     // sizeof(std::array<float, 3>).

inline Attribute &set_cell_val_num(unsigned num)

Sets the number of attribute values per cell. This is inferred from the type parameter of the Attribute::create<T>() function, but can also be set manually.

Example:

// a1 and a2 are equivalent:
auto a1 = Attribute::create<std::vector<int>>(...);
auto a2 = Attribute::create<int>(...);
a2.set_cell_val_num(TILEDB_VAR_NUM);

Parameters:

num – Cell val number to set.

Returns:

Reference to this Attribute

inline Attribute &set_fill_value(const void *value, uint64_t size)

Sets the default fill value for the input attribute. This value will be used for the input attribute whenever querying (1) an empty cell in a dense array, or (2) a non-empty cell (in either dense or sparse array) when values on the input attribute are missing (e.g., if the user writes a subset of the attributes in a write operation).

Applicable to both fixed-sized and var-sized attributes.

Example:

tiledb::Context ctx;

// Fixed-sized attribute
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
int32_t fixed_value = 0;
uint64_t fixed_size = sizeof(fixed_value);
a1.set_fill_value(&fixed_value, fixed_size);

// Var-sized attribute
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
std::string var_value("null");
a2.set_fill_value(var_value.c_str(), var_value.size());

Note

A call to set_cell_val_num resets the fill value of the attribute to its default. Therefore, make sure you invoke set_fill_value after deciding on the number of values this attribute will hold in each cell.

Note

For fixed-sized attributes, the input size should be equal to the cell size.

Parameters:
  • value – The fill value to set.

  • size – The fill value size in bytes.
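The effect of a fill value on a read can be sketched without TileDB: every cell that was never written comes back as the fill value. In the sketch below, apply_fill and the `written` mask are hypothetical and model only the behavior described above for a fixed-sized int32_t attribute.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Returns the values a reader would see: cells marked as written keep
// their stored value, all other cells are replaced by `fill`.
std::vector<std::int32_t> apply_fill(const std::vector<std::int32_t>& stored,
                                     const std::vector<bool>& written,
                                     std::int32_t fill) {
  if (stored.size() != written.size())
    throw std::invalid_argument("stored and written must have the same size");
  std::vector<std::int32_t> out(stored.size());
  for (std::size_t i = 0; i < stored.size(); ++i)
    out[i] = written[i] ? stored[i] : fill;
  return out;
}
```

For example, with a fill value of -1, stored values {7, 0, 9} and a mask of {true, false, true} read back as {7, -1, 9}.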

inline void get_fill_value(const void **value, uint64_t *size)

Gets the default fill value for the input attribute. This value will be used for the input attribute whenever querying (1) an empty cell in a dense array, or (2) a non-empty cell (in either dense or sparse array) when values on the input attribute are missing (e.g., if the user writes a subset of the attributes in a write operation).

Applicable to both fixed-sized and var-sized attributes.

Example:

tiledb::Context ctx;

// Fixed-sized attribute
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
const void* fixed_value;
uint64_t fixed_size;
a1.get_fill_value(&fixed_value, &fixed_size);

// Var-sized attribute
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
const void* var_value;
uint64_t var_size;
a2.get_fill_value(&var_value, &var_size);

Parameters:
  • value – A pointer to the fill value to get.

  • size – The size of the fill value to get.

inline Attribute &set_fill_value(const void *value, uint64_t size, uint8_t valid)

Sets the default fill value for the input, nullable attribute. This value will be used for the input attribute whenever querying (1) an empty cell in a dense array, or (2) a non-empty cell (in either dense or sparse array) when values on the input attribute are missing (e.g., if the user writes a subset of the attributes in a write operation).

Applicable to both fixed-sized and var-sized attributes.

Example:

tiledb::Context ctx;

// Fixed-sized attribute
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
a1.set_nullable(true);
int32_t fixed_value = 0;
uint64_t fixed_size = sizeof(fixed_value);
uint8_t fixed_valid = 0;
a1.set_fill_value(&fixed_value, fixed_size, fixed_valid);

// Var-sized attribute
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
a2.set_nullable(true);
std::string var_value("null");
uint8_t var_valid = 0;
a2.set_fill_value(var_value.c_str(), var_value.size(), var_valid);

Note

A call to set_cell_val_num resets the fill value of the attribute to its default. Therefore, make sure you invoke set_fill_value after deciding on the number of values this attribute will hold in each cell.

Note

For fixed-sized attributes, the input size should be equal to the cell size.

Parameters:
  • value – The fill value to set.

  • size – The fill value size in bytes.

  • valid – The validity fill value, zero for a null value and non-zero for a valid attribute.

inline void get_fill_value(const void **value, uint64_t *size, uint8_t *valid)

Gets the default fill value for the input, nullable attribute. This value will be used for the input attribute whenever querying (1) an empty cell in a dense array, or (2) a non-empty cell (in either dense or sparse array) when values on the input attribute are missing (e.g., if the user writes a subset of the attributes in a write operation).

Applicable to both fixed-sized and var-sized attributes.

Example:

tiledb::Context ctx;

// Fixed-sized attribute
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
a1.set_nullable(true);
const void* fixed_value;
uint64_t fixed_size;
uint8_t fixed_valid;
a1.get_fill_value(&fixed_value, &fixed_size, &fixed_valid);

// Var-sized attribute
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
a2.set_nullable(true);
const void* var_value;
uint64_t var_size;
uint8_t var_valid;
a2.get_fill_value(&var_value, &var_size, &var_valid);

Parameters:
  • value – A pointer to the fill value to get.

  • size – The size of the fill value to get.

  • valid – The fill value validity to get.

inline bool variable_sized() const

Check if attribute is variable sized.

inline FilterList filter_list() const

Returns a copy of the FilterList of the attribute. To change the filter list, use set_filter_list().

Returns:

Copy of the attribute FilterList.

inline Attribute &set_filter_list(const FilterList &filter_list)

Sets the attribute filter list, which is an ordered list of filters that will be used to process and/or transform the attribute data (such as compression).

Parameters:

filter_list – Filter list to set

Returns:

Reference to this Attribute

inline Attribute &set_nullable(bool nullable)

Sets the nullability of an attribute.

Example:

auto a1 = Attribute::create<int>(...);
a1.set_nullable(true);

Parameters:

nullable – Whether the attribute is nullable.

Returns:

Reference to this Attribute

inline bool nullable() const

Gets the nullability of an attribute.

Example:

auto a1 = Attribute::create<int>(...);
auto nullable = a1.nullable();

Returns:

Whether the attribute is nullable.

inline std::shared_ptr<tiledb_attribute_t> ptr() const

Returns the C TileDB attribute object pointer.

TILEDB_DEPRECATED inline void dump(FILE *out = stdout) const

Dumps information about the attribute in an ASCII representation to an output.

Parameters:

out – (Optional) File to dump output to. Defaults to stdout.

Public Static Functions

template<typename T>
static inline Attribute create(const Context &ctx, const std::string &name)

Factory function for creating a new attribute with datatype T.

Example:

tiledb::Context ctx;
auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
auto a3 = tiledb::Attribute::create<std::array<float, 3>>(ctx, "a3");
auto a4 = tiledb::Attribute::create<std::vector<double>>(ctx, "a4");
auto a5 = tiledb::Attribute::create<char[8]>(ctx, "a5");

Template Parameters:

T – Datatype of the attribute. Can either be arithmetic type, C-style array, std::string, std::vector, or any trivially copyable classes (defined by std::is_trivially_copyable).

Parameters:
  • ctx – The TileDB context.

  • name – The attribute name.

Returns:

A new Attribute object.

static inline Attribute create(const Context &ctx, const std::string &name, tiledb_datatype_t type)

Factory function taking the type as a tiledb_datatype_t variable.

template<typename T>
static inline Attribute create(const Context &ctx, const std::string &name, const FilterList &filter_list)

Factory function for creating a new attribute with datatype T and a FilterList.

Example:

tiledb::Context ctx;
tiledb::FilterList filter_list(ctx);
filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE})
    .add_filter({ctx, TILEDB_FILTER_BZIP2});
auto a1 = tiledb::Attribute::create<int>(ctx, "a1", filter_list);

Template Parameters:

T – Datatype of the attribute. Can either be arithmetic type, C-style array, std::string, std::vector, or any trivially copyable classes (defined by std::is_trivially_copyable).

Parameters:
  • ctx – The TileDB context.

  • name – The attribute name.

  • filter_list – FilterList to use for attribute

Returns:

A new Attribute object.

Array Schema

class ArraySchema : public tiledb::Schema

Schema describing an array.

The schema is an independent description of an array. A schema can be used to create multiple arrays, and stores information about its domain, cell types, and compression details. An array schema is composed of:

  • A Domain

  • A set of Attributes

  • Memory layout definitions: tile and cell

  • Compression details for Array level factors like offsets and coordinates

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE); // Or TILEDB_DENSE

// Create a Domain
tiledb::Domain domain(...);

// Create Attributes
auto a1 = tiledb::Attribute::create(...);

schema.set_domain(domain);
schema.add_attribute(a1);

// Specify tile memory layout
schema.set_tile_order(TILEDB_ROW_MAJOR);
// Specify cell memory layout within each tile
schema.set_cell_order(TILEDB_ROW_MAJOR);
schema.set_capacity(10); // For sparse, set capacity of each tile

// Create the array on persistent storage with the schema.
tiledb::Array::create("my_array", schema);

Public Functions

inline explicit ArraySchema(const Context &ctx, tiledb_array_type_t type)

Creates a new array schema.

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);

Parameters:
  • ctx – TileDB context

  • type – Array type, sparse or dense.

inline ArraySchema(const Context &ctx, const std::string &uri)

Loads the schema of an existing array.

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, "s3://bucket-name/array-name");

Parameters:
  • ctx – TileDB context

  • uri – URI of array

inline ArraySchema(const Context &ctx, tiledb_array_schema_t *schema)

Loads the schema of an existing array with the input C array schema object.

Parameters:
  • ctx – TileDB context

  • schema – C API array schema object

TILEDB_DEPRECATED inline virtual void dump(FILE *out = stdout) const override

Dumps the array schema in an ASCII representation to an output.

Parameters:

out – (Optional) File to dump output to. Defaults to stdout.

inline tiledb_array_type_t array_type() const

Returns the array type.

inline uint64_t capacity() const

Returns the tile capacity.

inline ArraySchema &set_capacity(uint64_t capacity)

Sets the tile capacity.

Parameters:

capacity – The capacity of a sparse data tile. Note that sparse data tiles exist in sparse fragments, which can be created in sparse arrays only. For more details, see tutorials/tiling-sparse.html.

Returns:

Reference to this ArraySchema instance.

inline bool allows_dups() const

Returns true if the array allows coordinate duplicates.

inline ArraySchema &set_allows_dups(bool allows_dups)

Sets whether the array allows coordinate duplicates. Throws an exception if set to true on a dense array.

inline uint32_t version() const

Returns the version of the array schema object.

inline tiledb_layout_t tile_order() const

Returns the tile order.

inline ArraySchema &set_tile_order(tiledb_layout_t layout)

Sets the tile order.

Parameters:

layout – Tile order to set.

Returns:

Reference to this ArraySchema instance.

inline ArraySchema &set_order(const std::array<tiledb_layout_t, 2> &p)

Sets both the tile and cell orders.

Parameters:

p – Pair of {tile order, cell order}

Returns:

Reference to this ArraySchema instance.

inline tiledb_layout_t cell_order() const

Returns the cell order.

inline ArraySchema &set_cell_order(tiledb_layout_t layout)

Sets the cell order.

Parameters:

layout – Cell order to set.

Returns:

Reference to this ArraySchema instance.
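The difference between row-major and column-major cell order comes down to which coordinate varies fastest when cells are laid out linearly. The sketch below computes the position of a cell inside a 2D tile for both orders. Layout and cell_position are hypothetical names (the enum stands in for TILEDB_ROW_MAJOR and TILEDB_COL_MAJOR), and the code illustrates the layouts only, not TileDB's internals.

```cpp
#include <cstdint>

// Stand-ins for TILEDB_ROW_MAJOR / TILEDB_COL_MAJOR.
enum class Layout { RowMajor, ColMajor };

// Linear position of cell (row, col) inside a tile of rows x cols cells.
// Row-major: the column index varies fastest.
// Column-major: the row index varies fastest.
std::uint64_t cell_position(std::uint64_t row, std::uint64_t col,
                            std::uint64_t rows, std::uint64_t cols,
                            Layout order) {
  return order == Layout::RowMajor ? row * cols + col : col * rows + row;
}
```

In a 2 x 3 tile, cell (0, 1) sits at position 1 in row-major order and at position 2 in column-major order. The same formula applies one level up when ordering tiles within the domain.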

inline FilterList coords_filter_list() const

Returns a copy of the FilterList of the coordinates. To change the coordinate compressor, use set_coords_filter_list().

Returns:

Copy of the coordinates FilterList.

inline ArraySchema &set_coords_filter_list(const FilterList &filter_list)

Sets the FilterList for the coordinates, which is an ordered list of filters that will be used to process and/or transform the coordinate data (such as compression).

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
tiledb::FilterList filter_list(ctx);
filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE})
    .add_filter({ctx, TILEDB_FILTER_BZIP2});
schema.set_coords_filter_list(filter_list);

Parameters:

filter_list – FilterList to use

Returns:

Reference to this ArraySchema instance.
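An ordered filter list behaves like a pipeline: each filter transforms the output of the previous one on write, and the inverse transforms run in the opposite order on read. The sketch below models that idea with two toy filters. It does not use TileDB: Filter, Buffer and the helper functions are hypothetical, and the toy delta and offset transforms are illustrations rather than the library's filters.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using Buffer = std::vector<std::int64_t>;

// A filter is a pair of inverse transforms.
struct Filter {
  std::function<Buffer(const Buffer&)> forward;   // Applied when writing.
  std::function<Buffer(const Buffer&)> backward;  // Applied when reading.
};

// Toy delta filter: stores each value as the difference from its predecessor.
Filter make_delta_filter() {
  return Filter{
      [](const Buffer& in) {
        Buffer out(in.size());
        for (std::size_t i = 0; i < in.size(); ++i)
          out[i] = in[i] - (i == 0 ? 0 : in[i - 1]);
        return out;
      },
      [](const Buffer& in) {
        Buffer out(in.size());
        std::int64_t running = 0;
        for (std::size_t i = 0; i < in.size(); ++i)
          out[i] = (running += in[i]);  // Cumulative sum undoes the deltas.
        return out;
      }};
}

// Toy offset filter: adds a constant on write and subtracts it on read.
Filter make_offset_filter(std::int64_t offset) {
  return Filter{
      [offset](const Buffer& in) {
        Buffer out(in);
        for (auto& v : out) v += offset;
        return out;
      },
      [offset](const Buffer& in) {
        Buffer out(in);
        for (auto& v : out) v -= offset;
        return out;
      }};
}

// Write path: apply the filters in list order.
Buffer write_pipeline(const std::vector<Filter>& filters, Buffer data) {
  for (const auto& f : filters) data = f.forward(data);
  return data;
}

// Read path: undo the filters in reverse list order.
Buffer read_pipeline(const std::vector<Filter>& filters, Buffer data) {
  for (auto it = filters.rbegin(); it != filters.rend(); ++it)
    data = it->backward(data);
  return data;
}
```

The order matters: undoing the two toy filters in the wrong order would not reproduce the original values, which is why the list is described as ordered.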

inline FilterList offsets_filter_list() const

Returns a copy of the FilterList of the offsets. To change the offsets compressor, use set_offsets_filter_list().

Returns:

Copy of the offsets FilterList.

inline FilterList validity_filter_list() const

Returns a copy of the FilterList of the validity arrays. To change the validity compressor, use set_validity_filter_list().

Returns:

Copy of the validity FilterList.

inline ArraySchema &set_offsets_filter_list(const FilterList &filter_list)

Sets the FilterList for the offsets, which is an ordered list of filters that will be used to process and/or transform the offsets data (such as compression).

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
tiledb::FilterList filter_list(ctx);
filter_list.add_filter({ctx, TILEDB_FILTER_POSITIVE_DELTA})
    .add_filter({ctx, TILEDB_FILTER_LZ4});
schema.set_offsets_filter_list(filter_list);

Parameters:

filter_list – FilterList to use

Returns:

Reference to this ArraySchema instance.

inline ArraySchema &set_validity_filter_list(const FilterList &filter_list)

Sets the FilterList for the validity arrays, which is an ordered list of filters that will be used to process and/or transform the validity data (such as compression).

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
tiledb::FilterList filter_list(ctx);
filter_list.add_filter({ctx, TILEDB_FILTER_POSITIVE_DELTA})
    .add_filter({ctx, TILEDB_FILTER_LZ4});
schema.set_validity_filter_list(filter_list);

Parameters:

filter_list – FilterList to use

Returns:

Reference to this ArraySchema instance.

inline Domain domain() const

Returns a copy of the schema’s array Domain. To change the domain, use set_domain().

Returns:

Copy of the array Domain

inline ArraySchema &set_domain(const Domain &domain)

Sets the array domain.

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
// Create a Domain
tiledb::Domain domain(...);
schema.set_domain(domain);

Parameters:

domain – Domain to use

Returns:

Reference to this ArraySchema instance.

inline std::pair<uint64_t, uint64_t> timestamp_range()

Get timestamp range of schema.

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
std::pair<uint64_t, uint64_t> timestamp_range = schema.timestamp_range();

Returns:

Timestamp range of this ArraySchema instance.

inline virtual ArraySchema &add_attribute(const Attribute &attr) override

Adds an Attribute to the array.

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
schema.add_attribute(tiledb::Attribute::create<int32_t>(ctx, "attr_name"));

Parameters:

attr – The Attribute to add

Returns:

Reference to this ArraySchema instance.

inline std::shared_ptr<tiledb_array_schema_t> ptr() const

Returns a shared pointer to the C TileDB array schema object.

inline virtual void check() const override

Validates the schema.

Example:

tiledb::Context ctx;
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
// Add domain, attributes, etc...

try {
  schema.check();
} catch (const tiledb::TileDBError& e) {
  std::cout << e.what() << "\n";
  exit(1);
}

Throws:

TileDBError – if the schema is incorrect or invalid.

inline virtual std::unordered_map<std::string, Attribute> attributes() const override

Gets all attributes in the array.

Returns:

Map of attribute name to copy of Attribute instance.

inline virtual Attribute attribute(const std::string &name) const override

Get a copy of an Attribute in the schema by name.

Parameters:

name – Name of attribute

Returns:

Attribute

inline virtual unsigned attribute_num() const override

Returns the number of attributes in the schema.

inline virtual Attribute attribute(unsigned int i) const override

Get a copy of an Attribute in the schema by index. Attributes are ordered the same way they were defined when constructing the array schema.

Parameters:

i – Index of attribute

Returns:

Attribute

inline bool has_attribute(const std::string &name) const

Checks if the schema has an attribute of the given name.

Parameters:

name – Name of attribute to check for

Returns:

True if the schema has an attribute of the given name.

Public Static Functions

static inline std::string to_str(tiledb_array_type_t type)

Returns the input array type in string format.

static inline std::string to_str(tiledb_layout_t layout)

Returns the input layout in string format.

Array

class Array

Class representing a TileDB array object.

An Array object represents array data in TileDB at some persisted location, e.g. on disk, in an S3 bucket, etc. Once an array has been opened for reading or writing, interact with the data through Query objects.

Example:

tiledb::Context ctx;

// Create an ArraySchema, add attributes, domain, etc.
tiledb::ArraySchema schema(...);

// Create empty array named "my_array" on persistent storage.
tiledb::Array::create("my_array", schema);

Public Functions

inline Array(const Context &ctx, const std::string &array_uri, tiledb_query_type_t query_type, const TemporalPolicy temporal_policy = {}, const EncryptionAlgorithm encryption_algorithm = {})

Constructor. This opens the array for the given query type. The destructor calls the close() method.

Example:

// Open the array for reading
tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);

Parameters:
  • ctx – TileDB context.

  • array_uri – The array URI.

  • query_type – Query type to open the array for.

  • temporal_policy – The TemporalPolicy with which to open the array.

  • encryption_algorithm – The EncryptionAlgorithm to set on the array.

inline Array(const Context &ctx, tiledb_array_t *carray, tiledb_config_t *config)

Constructor. This sets the array config.

Example:

tiledb::Context ctx;
tiledb_config_t* config;

Parameters:
  • ctx – TileDB context.

  • carray – The array.

  • config – The array’s config.

inline Array(const Context &ctx, tiledb_array_t *carray, bool own = true)

Constructor. Creates a TileDB Array instance wrapping the given pointer.

Parameters:
  • ctx – tiledb::Context

  • own=true – If false, disables underlying cleanup upon destruction.

Throws:

TileDBError – if construction fails

inline ~Array()

Destructor; calls close().

inline bool is_open() const

Checks if the array is open.

inline std::string uri() const

Returns the array URI.

inline ArraySchema schema() const

Get the ArraySchema for the array.

inline std::shared_ptr<tiledb_array_t> ptr() const

Returns a shared pointer to the C TileDB array object.

inline void open(tiledb_query_type_t query_type)

Opens the array. The array is opened using a query type as input.

This indicates that queries created for this Array object inherit the query type. In other words, an Array object is opened to receive only one type of query. It can always be closed and re-opened with another query type. Multiple Array objects may also be created and opened with different query types. For instance, one may create and open an array object array_read for reads and another one array_write for writes, and interleave creation and submission of queries for both these array objects.

Example:

// Open the array for writing
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_WRITE);
// Close and open again for reading.
array.close();
array.open(TILEDB_READ);

Parameters:

query_type – The type of queries the array object will be receiving.

Throws:

TileDBError – if the array is already open or other error occurred.

inline void open(tiledb_query_type_t query_type, uint64_t timestamp)

Opens the array. The array is opened using a query type as input.

See Array::open

inline void open(tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const std::string &encryption_key)

Opens the array. The array is opened using a query type as input.

See Array::open

inline void open(tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const std::string &encryption_key, uint64_t timestamp)

Opens the array. The array is opened using a query type as input.

See Array::open

inline void reopen()

Reopens the array (the array must already be open). This is useful when the array was updated after the Array object was created and opened. To sync up with the updates, the user must either close the array and open it again with open(), or just use reopen() without closing. reopen() is generally faster than the former alternative.

Note: reopening encrypted arrays does not require the encryption key.

Example:

// Open the array for reading
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
array.reopen();

Throws:

TileDBError – if the array was not already open or other error occurred.
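The open, close and reopen rules above form a small state machine: open() fails on an already-open array, reopen() fails on a closed one, and an array can be re-opened with a different query type after close(). The sketch below models just those rules. ArrayHandle and QueryType are hypothetical stand-ins, std::runtime_error takes the place of TileDBError, and the `refreshes` counter exists only to make reopen() observable.

```cpp
#include <stdexcept>

// Stand-ins for TILEDB_READ / TILEDB_WRITE.
enum class QueryType { Read, Write };

class ArrayHandle {
 public:
  // Opens the handle for one query type; fails if it is already open.
  void open(QueryType type) {
    if (open_) throw std::runtime_error("array is already open");
    type_ = type;
    open_ = true;
  }

  // Refreshes an open handle without closing it; fails if it is closed.
  void reopen() {
    if (!open_) throw std::runtime_error("array is not open");
    ++refreshes_;
  }

  void close() { open_ = false; }

  bool is_open() const { return open_; }
  QueryType query_type() const { return type_; }
  int refreshes() const { return refreshes_; }

 private:
  bool open_ = false;
  QueryType type_ = QueryType::Read;
  int refreshes_ = 0;
};
```

A typical sequence is open(Write), close(), open(Read), then reopen() whenever newer writes should become visible to the reader.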

inline void set_open_timestamp_start(uint64_t timestamp_start) const

Sets the inclusive starting timestamp when opening this array.

inline void set_open_timestamp_end(uint64_t timestamp_end) const

Sets the inclusive ending timestamp when opening this array.

inline uint64_t open_timestamp_start() const

Retrieves the inclusive starting timestamp.

inline uint64_t open_timestamp_end() const

Retrieves the inclusive ending timestamp.

inline void set_config(const Config &config) const

Sets the array config.

Pre:

The array must be closed.

inline Config config() const

Retrieves the array config.

inline void close()

Closes the array. The destructor calls this automatically if the underlying pointer is owned.

Example:

tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
array.close();

template<typename T>
inline std::vector<std::pair<std::string, std::pair<T, T>>> non_empty_domain()

Retrieves the non-empty domain from the array. This is the union of the non-empty domains of the array fragments.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
// Specify the domain type (example uint32_t)
auto non_empty = array.non_empty_domain<uint32_t>();
std::cout << "Dimension named " << non_empty[0].first << " has cells in ["
          << non_empty[0].second.first << ", " << non_empty[0].second.second
          << "]" << std::endl;

Template Parameters:

T – Domain datatype

Returns:

Vector of dim names with a {lower, upper} pair. Inclusive. Empty vector if the array has no data.
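Because both bounds are inclusive, a dimension whose non-empty domain is [lower, upper] spans upper - lower + 1 coordinates. The following is a minimal sketch of that arithmetic, assuming integer dimensions. The alias NonEmptyDomain and the helpers inclusive_extent and bounding_box_cells are hypothetical names used for illustration only; they are not part of the TileDB API. The sketch uses only the standard library and works on a vector of the same shape that non_empty_domain<T>() returns.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Same shape as the value returned by Array::non_empty_domain<T>().
// (Hypothetical alias, not a TileDB type.)
template <typename T>
using NonEmptyDomain = std::vector<std::pair<std::string, std::pair<T, T>>>;

// Number of coordinates in one inclusive integer range [lower, upper].
template <typename T>
std::uint64_t inclusive_extent(const std::pair<T, T>& bounds) {
  assert(bounds.first <= bounds.second);
  return static_cast<std::uint64_t>(bounds.second - bounds.first) + 1;
}

// Number of cells in the bounding box described by the non-empty domain.
// Returns 0 for an empty vector (array without data). Assumes the product
// fits in 64 bits.
template <typename T>
std::uint64_t bounding_box_cells(const NonEmptyDomain<T>& domain) {
  if (domain.empty()) return 0;
  std::uint64_t cells = 1;
  for (const auto& dim : domain) cells *= inclusive_extent(dim.second);
  return cells;
}
```

For a sparse array the result is only an upper bound on the number of stored cells, because the bounding box may contain empty cells.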

template<typename T>
inline std::pair<T, T> non_empty_domain(unsigned idx)

Retrieves the non-empty domain from the array on the given dimension. This is the union of the non-empty domains of the array fragments.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
// Specify the dimension type (example uint32_t)
auto non_empty = array.non_empty_domain<uint32_t>(0);

Template Parameters:

T – Dimension datatype

Parameters:

idx – The dimension index.

Returns:

The {lower, upper} pair of the non-empty domain (inclusive) on the input dimension.

template<typename T>
inline std::pair<T, T> non_empty_domain(const std::string &name)

Retrieves the non-empty domain from the array on the given dimension. This is the union of the non-empty domains of the array fragments.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
// Specify the dimension type (example uint32_t)
auto non_empty = array.non_empty_domain<uint32_t>("d1");

Template Parameters:

T – Dimension datatype

Parameters:

name – The dimension name.

Returns:

The {lower, upper} pair of the non-empty domain (inclusive) on the input dimension.

inline std::pair<std::string, std::string> non_empty_domain_var(unsigned idx)

Retrieves the non-empty domain from the array on the given dimension. This is the union of the non-empty domains of the array fragments. Applicable only to var-sized dimensions.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
// No template parameter is needed: var-sized bounds are returned as strings
auto non_empty = array.non_empty_domain_var(0);

Parameters:

idx – The dimension index.

Returns:

The {lower, upper} pair of the non-empty domain (inclusive) on the input dimension.

inline std::pair<std::string, std::string> non_empty_domain_var(const std::string &name)

Retrieves the non-empty domain from the array on the given dimension. This is the union of the non-empty domains of the array fragments. Applicable only to var-sized dimensions.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
// No template parameter is needed: var-sized bounds are returned as strings
auto non_empty = array.non_empty_domain_var("d1");

Parameters:

name – The dimension name.

Returns:

The {lower, upper} pair of the non-empty domain (inclusive) on the input dimension.

inline tiledb_query_type_t query_type() const

Returns the query type the array was opened with.

inline void put_metadata(const std::string &key, tiledb_datatype_t value_type, uint32_t value_num, const void *value)

Puts a metadata key-value item into an open array. The array must be opened in WRITE mode, otherwise the function will error out.

Note

The writes will take effect only upon closing the array.

Parameters:
  • key – The key of the metadata item to be added. UTF-8 encodings are acceptable.

  • value_type – The datatype of the value.

  • value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata.

  • value – The metadata value in binary form.

inline void delete_metadata(const std::string &key)

Deletes a metadata key-value item from an open array. The array must be opened in WRITE mode, otherwise the function will error out.

Note

The writes will take effect only upon closing the array.

Note

If the key does not exist, this has no effect (i.e., the function will not error out).

Parameters:

key – The key of the metadata item to be deleted.

inline void get_metadata(const std::string &key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)

Gets a metadata key-value item from an open array. The array must be opened in READ mode, otherwise the function will error out.

Note

If the key does not exist, then value will be NULL.

Parameters:
  • key – The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.

  • value_type – The datatype of the value.

  • value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.

  • value – The metadata value in binary form.
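The value comes back as an untyped pointer plus an item count, so the caller has to turn it into typed data. The following is a minimal sketch of that step using only the standard library. The helpers decode_metadata and decode_scalar are hypothetical names, not TileDB functions, and the sketch assumes the caller has already checked that the returned value_type matches the C++ type T. Copying the items into a vector also avoids relying on how long the returned pointer stays valid, which this reference does not specify.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Copies `value_num` items of type T out of the raw metadata pointer.
// Returns an empty vector when `value` is null, which covers both a missing
// key and a key with an empty value.
// (Hypothetical helper; the caller must verify value_type matches T.)
template <typename T>
std::vector<T> decode_metadata(const void* value, std::uint32_t value_num) {
  static_assert(std::is_trivially_copyable<T>::value,
                "metadata items are copied byte by byte");
  if (value == nullptr || value_num == 0) return {};
  std::vector<T> out(value_num);
  std::memcpy(out.data(), value, sizeof(T) * value_num);
  return out;
}

// Convenience wrapper for metadata that is expected to hold exactly one item.
template <typename T>
T decode_scalar(const void* value, std::uint32_t value_num) {
  const std::vector<T> items = decode_metadata<T>(value, value_num);
  assert(items.size() == 1);  // caller expects a single item
  return items.front();
}
```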

inline bool has_metadata(const std::string &key, tiledb_datatype_t *value_type)

Checks if key exists in metadata from an open array. The array must be opened in READ mode, otherwise the function will error out.

Note

If the key does not exist, then value_type will not be modified.

Parameters:
  • key – The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.

  • value_type – The datatype of the value associated with the key (if any).

Returns:

true if the key exists, else false.

inline uint64_t metadata_num() const

Returns the number of metadata items in an open array. The array must be opened in READ mode, otherwise the function will error out.

inline void get_metadata_from_index(uint64_t index, std::string *key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)

Gets a metadata item from an open array using an index. The array must be opened in READ mode, otherwise the function will error out.

Parameters:
  • index – The index used to get the metadata.

  • key – The metadata key.

  • value_type – The datatype of the value.

  • value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.

  • value – The metadata value in binary form.

Public Static Functions

static inline void delete_array(const Context &ctx, const std::string &uri)

Deletes all data written to the array with the input uri.

Parameters:
  • ctx – TileDB context

  • uri – The Array’s URI

Post:

This is destructive; the array cannot be reopened after deletion.

static inline void delete_fragments(const Context &ctx, const std::string &uri, uint64_t timestamp_start, uint64_t timestamp_end)

Deletes the fragments written between the input timestamps of an array with the input uri.

Parameters:
  • ctx – TileDB context

  • uri – The URI of the fragments’ parent Array.

  • timestamp_start – The epoch start timestamp in milliseconds.

  • timestamp_end – The epoch end timestamp in milliseconds. Use UINT64_MAX for the current timestamp.
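Both timestamps are milliseconds since the epoch. The following is a minimal sketch of producing such values with std::chrono. It assumes that std::chrono::system_clock counts from the Unix epoch, which is guaranteed since C++20 and is the common behavior before that. The helper names to_epoch_ms and now_ms are hypothetical and not part of TileDB.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Converts a system_clock time point to milliseconds since the epoch.
// (Hypothetical helper; assumes system_clock uses the Unix epoch.)
std::uint64_t to_epoch_ms(std::chrono::system_clock::time_point tp) {
  using namespace std::chrono;
  const auto ms = duration_cast<milliseconds>(tp.time_since_epoch()).count();
  assert(ms >= 0);  // times before the epoch cannot be stored in a uint64_t
  return static_cast<std::uint64_t>(ms);
}

// Current wall-clock time in milliseconds since the epoch.
std::uint64_t now_ms() {
  return to_epoch_ms(std::chrono::system_clock::now());
}
```

A window such as "the last hour" can then be expressed as timestamp_start = now_ms() - 3600 * 1000 and timestamp_end = now_ms().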

static inline void delete_fragments_list(const Context &ctx, const std::string &uri, const char *fragment_uris[], const size_t num_fragments)

Deletes the fragments with the input uris on an array with the input uri.

static inline void consolidate(const Context &ctx, const std::string &uri, Config *const config = nullptr)

Consolidates the fragments of an array into a single fragment.

You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).

Example:

tiledb::Array::consolidate(ctx, "s3://bucket-name/array-name");

Parameters:
  • ctx – TileDB context

  • uri – The URI of the TileDB array to be consolidated.

  • config – Configuration parameters for the consolidation.

static inline void consolidate(const Context &ctx, const std::string &array_uri, const char *fragment_uris[], const size_t num_fragments, Config *const config = nullptr)

Consolidates the fragments with the input uris into a single fragment.

You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).

Example:

const char* fragment_uris[2] = {
    "__1712657401931_1712657401931_285cf8a0eff4df875a04cfbea96d5c00_21",
    "__1712657401948_1712657401948_285cf8a0efdsafas6a5a04cfbesajads_21"};

// The config is passed by pointer (Config *const); it defaults to nullptr.
tiledb::Config config;
tiledb::Array::consolidate(
    ctx,
    "s3://bucket-name/array-name",
    fragment_uris,
    2,
    &config);

Parameters:
  • ctx – TileDB context

  • array_uri – The URI of the TileDB array to be consolidated.

  • fragment_uris – Fragment names of the fragments to consolidate. The names can be recovered using tiledb_fragment_info_get_fragment_name_v2.

  • num_fragments – The number of fragments to consolidate.

  • config – Configuration parameters for the consolidation.
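Fragment names are often held in a std::vector<std::string>, while this overload takes an array of C strings. The following is a minimal sketch of bridging the two with the standard library only. The helper as_c_strings is a hypothetical name, not a TileDB function.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds an array of C-string pointers that borrow from `names`.
// `names` must outlive the returned vector and must not be modified while
// the pointers are in use. (Hypothetical helper, not part of TileDB.)
std::vector<const char*> as_c_strings(const std::vector<std::string>& names) {
  std::vector<const char*> out;
  out.reserve(names.size());
  for (const auto& name : names) {
    // An embedded NUL would silently truncate the name on the C side.
    assert(name.find('\0') == std::string::npos);
    out.push_back(name.c_str());
  }
  return out;
}
```

With auto ptrs = as_c_strings(names), ptrs.data() can be passed as fragment_uris and ptrs.size() as num_fragments.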

static inline void vacuum(const Context &ctx, const std::string &uri, Config *const config = nullptr)

Cleans up the array by removing data that consolidation has made redundant, such as consolidated fragments and consolidated array metadata. Note that this will coarsen the granularity of time traveling (see docs for more information).

Example:

tiledb::Array::vacuum(ctx, "s3://bucket-name/array-name");

Parameters:
  • ctx – TileDB context

  • uri – The URI of the TileDB array to be vacuumed.

  • config – Configuration parameters for the vacuuming.

static inline void create(const Context &ctx, const std::string &uri, const ArraySchema &schema)

Creates a new TileDB array given an input schema.

Example:

tiledb::Array::create(ctx, "s3://bucket-name/array-name", schema);

Parameters:
  • ctx – The TileDB context.

  • uri – URI where array will be created.

  • schema – The array schema.

static inline void create(const std::string &uri, const ArraySchema &schema)

Creates a new TileDB array given an input schema.

To create the array, this function uses the context that was used to instantiate the schema. It is recommended to pass the context explicitly by using the overload that takes one.

Example:

tiledb::Array::create("s3://bucket-name/array-name", schema);

Parameters:
  • uri – URI where array will be created.

  • schema – The array schema.

static inline ArraySchema load_schema(const Context &ctx, const std::string &uri)

Loads the array schema from an array.

Example:

auto schema = tiledb::Array::load_schema(
    ctx, "s3://bucket-name/array-name");

Parameters:
  • ctx – The TileDB context.

  • uri – The array URI.

Returns:

The loaded ArraySchema object.

static inline ArraySchema load_schema_with_config(const Context &ctx, const Config &config, const std::string &uri)

Loads the array schema from an array. Options to load additional features are read from the optionally-provided config. See tiledb_array_schema_load_with_config.

Example:

tiledb::Config config;
config["rest.load_enumerations_on_array_open"] = "true";
auto schema = tiledb::Array::load_schema_with_config(
    ctx, config, "s3://bucket-name/array-name");

Parameters:
  • ctx – The TileDB context.

  • config – The request for additional features.

  • uri – The array URI.

Returns:

The loaded ArraySchema object.

static inline tiledb_encryption_type_t encryption_type(const Context &ctx, const std::string &array_uri)

Gets the encryption type the given array was created with.

Example:

tiledb_encryption_type_t enc_type =
    tiledb::Array::encryption_type(ctx, "s3://bucket-name/array-name");

Parameters:
  • ctx – TileDB context

  • array_uri – The URI of the TileDB array.

Returns:

The encryption type the array was created with.

static inline void consolidate_metadata(const Context &ctx, const std::string &uri, Config *const config = nullptr)

Consolidates the metadata of an array.

You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).

Example:

tiledb::Array::consolidate_metadata(ctx, "s3://bucket-name/array-name");

Parameters:
  • ctx – TileDB context

  • uri – The URI of the TileDB array whose metadata will be consolidated.

  • config – Configuration parameters for the consolidation.

static inline void upgrade_version(const Context &ctx, const std::string &array_uri, Config *const config = nullptr)

Upgrades an array to the latest format version.

Example:

tiledb::Array::upgrade_version(ctx, "array_name");

Parameters:
  • ctx – TileDB context

  • array_uri – The URI of the TileDB array to be upgraded.

  • config – Configuration parameters for the upgrade.

Query

class Query

Construct and execute read/write queries on a tiledb::Array.

See examples for more usage details.

Example:

// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_dense_array", TILEDB_WRITE);
tiledb::Query query(ctx, array);
query.set_layout(TILEDB_GLOBAL_ORDER);
std::vector<int> a1_data = {1, 2, 3};
query.set_data_buffer("a1", a1_data);
query.submit();
query.finalize();
array.close();

Public Types

enum class Status

The query or query attribute status.

Values:

enumerator FAILED

Query failed.

enumerator COMPLETE

Query completed (all data has been read)

enumerator INPROGRESS

Query is in progress

enumerator INCOMPLETE

Query completed (but not all data has been read)

enumerator UNINITIALIZED

Query not initialized.

enumerator INITIALIZED

Query initialized (strategy created).

Public Functions

inline Query(const Context &ctx, const Array &array, tiledb_query_type_t type)

Creates a TileDB query object.

The query type (read or write) must be the same as the type used to open the array object.

The storage manager also acquires a shared lock on the array. This means multiple read and write queries to the same array can be made concurrently (in TileDB, only consolidation requires an exclusive lock for a short period of time).

Example:

// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_WRITE);
tiledb::Query query(ctx, array, TILEDB_WRITE);

Parameters:
  • ctx – TileDB context

  • array – Open Array object

  • type – The TileDB query type

inline Query(const Context &ctx, const Array &array)

Creates a TileDB query object.

The query type (read or write) is inferred from the array object, which was opened with a specific query type.

The storage manager also acquires a shared lock on the array. This means multiple read and write queries to the same array can be made concurrently (in TileDB, only consolidation requires an exclusive lock for a short period of time).

Example:

// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_WRITE);
tiledb::Query query(ctx, array);
// Equivalent to:
// tiledb::Query query(ctx, array, TILEDB_WRITE);

Parameters:
  • ctx – TileDB context

  • array – Open Array object

inline std::shared_ptr<tiledb_query_t> ptr() const

Returns a shared pointer to the C TileDB query object.

inline tiledb_query_type_t query_type() const

Returns the query type (read or write).

inline Query &set_layout(tiledb_layout_t layout)

Sets the layout of the cells to be written or read.

Parameters:

layout – For a write query, this specifies the order of the cells provided by the user in the buffers. For a read query, this specifies the order of the cells that will be retrieved as results and stored in the user buffers. The layout can be one of the following:

  • TILEDB_COL_MAJOR: This means column-major order with respect to the subarray.

  • TILEDB_ROW_MAJOR: This means row-major order with respect to the subarray.

  • TILEDB_GLOBAL_ORDER: This means that cells are stored or retrieved in the array global cell order.

  • TILEDB_UNORDERED: This is applicable only to writes for sparse arrays, or for sparse writes to dense arrays. It specifies that the cells are unordered and, hence, TileDB must sort the cells in the global cell order prior to writing.

Returns:

Reference to this Query
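To make the difference between the two subarray-relative layouts concrete, the following sketch computes where a cell lands in the user buffer for a two-dimensional subarray. It is plain index arithmetic under the usual definitions of row-major and column-major order, not TileDB code, and the function names row_major_pos and col_major_pos are hypothetical. Row and column indices are counted from the first cell of the subarray.

```cpp
#include <cassert>
#include <cstdint>

// Buffer position of cell (row, col) in a rows x cols subarray when cells
// are laid out row by row (TILEDB_ROW_MAJOR).
std::uint64_t row_major_pos(std::uint64_t row, std::uint64_t col,
                            std::uint64_t rows, std::uint64_t cols) {
  assert(row < rows && col < cols);
  return row * cols + col;
}

// Buffer position of the same cell when cells are laid out column by column
// (TILEDB_COL_MAJOR).
std::uint64_t col_major_pos(std::uint64_t row, std::uint64_t col,
                            std::uint64_t rows, std::uint64_t cols) {
  assert(row < rows && col < cols);
  return col * rows + row;
}
```

In a 2 x 3 subarray, the cell at row 1, column 0 is the fourth value of a row-major buffer (position 3) but the second value of a column-major buffer (position 1).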

inline tiledb_layout_t query_layout() const

Returns the layout of the query.

inline Query &set_condition(const QueryCondition &condition)

Sets the read query condition.

Note that only one query condition may be set on a query at a time. This overwrites any previously set query condition. To apply more than one condition at a time, use the QueryCondition::combine API to construct a single object.

Parameters:

condition – The query condition object.

Returns:

Reference to this Query

inline const Array &array()

Returns the array of the query.

inline Status query_status() const

Returns the query status.

inline bool has_results() const

Returns true if the query has results. Applicable only to read queries (it returns false for write queries).

inline Status submit()

Submits the query. Call will block until query is complete.

Note

finalize() must be invoked after you finish writing in global layout (via repeated invocations of submit()), in order to flush any internal state.

For reads, if the returned status is TILEDB_INCOMPLETE, TileDB could not fit the entire result in the user’s buffers. In this case, consume the read results (if any), optionally reset the buffers with set_data_buffer(), and then resubmit the query until the status becomes TILEDB_COMPLETED.

If all buffer sizes are 0 after this function returns, no useful data was read into the buffers, which implies that larger buffers are needed for the query to proceed. In this case, reallocate the buffers with a larger size, reset them with set_data_buffer(), and resubmit the query.

Returns:

Query status
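The resubmission rules for reads can be sketched as a loop. The example below does not use TileDB: MockReadQuery is a hypothetical stand-in that hands out a fixed list of integers, and the local Status enum mirrors only two of the Query::Status enumerators. The sketch shows the control flow only, under those assumptions: consume each batch, resubmit while the status is incomplete, and grow the buffer when a submit returns no data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Mirrors two of the Query::Status enumerators for this stand-alone sketch.
enum class Status { COMPLETE, INCOMPLETE };

// Hypothetical stand-in for a read query (NOT a TileDB class): each submit()
// copies as many pending results as fit into the registered buffer.
class MockReadQuery {
 public:
  explicit MockReadQuery(std::vector<int> results)
      : results_(std::move(results)) {}

  void set_data_buffer(std::vector<int>& buffer) { buffer_ = &buffer; }

  Status submit() {
    assert(buffer_ != nullptr);
    const std::size_t pending = results_.size() - next_;
    last_count_ = std::min(buffer_->size(), pending);
    std::copy_n(results_.begin() + static_cast<std::ptrdiff_t>(next_),
                last_count_, buffer_->begin());
    next_ += last_count_;
    return next_ == results_.size() ? Status::COMPLETE : Status::INCOMPLETE;
  }

  // Number of elements written by the last submit().
  std::size_t result_elements() const { return last_count_; }

 private:
  std::vector<int> results_;
  std::vector<int>* buffer_ = nullptr;
  std::size_t next_ = 0;
  std::size_t last_count_ = 0;
};

// Reads every result, resubmitting until the query reports COMPLETE.
std::vector<int> read_all(MockReadQuery& query, std::size_t buffer_elements) {
  std::vector<int> buffer(buffer_elements);
  std::vector<int> all;
  query.set_data_buffer(buffer);
  Status status;
  do {
    status = query.submit();
    const std::size_t n = query.result_elements();
    if (status == Status::INCOMPLETE && n == 0) {
      // Nothing fit: enlarge the buffer and register it again before retrying.
      buffer.resize(std::max<std::size_t>(1, buffer.size() * 2));
      query.set_data_buffer(buffer);
      continue;
    }
    // Consume this batch before the next submit overwrites the buffer.
    all.insert(all.end(), buffer.begin(),
               buffer.begin() + static_cast<std::ptrdiff_t>(n));
  } while (status == Status::INCOMPLETE);
  return all;
}
```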

inline void finalize()

Flushes all internal state of a query object and finalizes the query. This is applicable only to global layout writes. It has no effect for any other query type.

inline void submit_and_finalize()

Submits and finalizes the last tile of a global order write. For remote TileDB arrays, this is optimized to use only one request to perform both the submit and finalize.

inline std::unordered_map<std::string, std::pair<uint64_t, uint64_t>> result_buffer_elements() const

Returns the number of elements in the result buffers from a read query. This is a map from the attribute name to a pair of values.

The first is number of elements (offsets) for var size attributes, and the second is number of elements in the data buffer. For fixed sized attributes (and coordinates), the first is always 0.

For variable sized attributes: the first value is the number of cells read, i.e. the number of offsets read for the attribute. The second value is the total number of elements in the data buffer. For example, a read query on a variable-length float attribute that reads three cells would return 3 for the first number in the pair. If the total amount of floats read across the three cells was 10, then the second number in the pair would be 10.

For fixed-length attributes, the first value is always 0. The second value is the total number of elements in the data buffer. For example, a read query on a single float attribute that reads three cells would return 3 for the second value. A read query on a float attribute with cell_val_num 2 that reads three cells would return 3 * 2 = 6 for the second value.

If the query has not been submitted, an empty map is returned.

Example:

// Submit a read query.
query.submit();
auto result_el = query.result_buffer_elements();

// For fixed-sized attributes, `.second` is the number of elements
// that were read for the attribute across all cells. Note: number of
// elements and not number of bytes.
auto num_a1_elements = result_el["a1"].second;

// Coords are also fixed-sized.
auto num_coords = result_el["__coords"].second;

// Variable-sized attributes (e.g. std::string) need two buffers,
// one for offsets and one for cell data ("elements").
auto num_a2_offsets = result_el["a2"].first;
auto num_a2_elements = result_el["a2"].second;
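The rules above can be turned into a cell count. The following is a minimal sketch using only the standard library; fixed_cell_count and var_cell_count are hypothetical helper names, and the pair argument has the same shape as one entry of the map returned by result_buffer_elements().

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// {number of offsets, number of data elements}, as stored per field in the
// map returned by result_buffer_elements().
using ElementCounts = std::pair<std::uint64_t, std::uint64_t>;

// Cells read for a fixed-size field: data elements divided by the number of
// values stored per cell (cell_val_num).
std::uint64_t fixed_cell_count(const ElementCounts& counts,
                               std::uint32_t cell_val_num) {
  assert(cell_val_num > 0);
  assert(counts.second % cell_val_num == 0);
  return counts.second / cell_val_num;
}

// Cells read for a var-sized field: one offset per cell.
std::uint64_t var_cell_count(const ElementCounts& counts) {
  return counts.first;
}
```

With the numbers used in the description, fixed_cell_count({0, 6}, 2) and var_cell_count({3, 10}) both give 3 cells.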

inline std::unordered_map<std::string, std::tuple<uint64_t, uint64_t, uint64_t>> result_buffer_elements_nullable() const

Returns the number of elements in the result buffers from a read query. This is a map from the attribute name to a tuple of values.

The first is number of elements (offsets) for var size attributes, and the second is number of elements in the data buffer. For fixed sized attributes (and coordinates), the first is always 0. The third element is the size of the validity bytemap buffer.

For variable sized attributes: the first value is the number of cells read, i.e. the number of offsets read for the attribute. The second value is the total number of elements in the data buffer. For example, a read query on a variable-length float attribute that reads three cells would return 3 for the first number in the tuple. If the total amount of floats read across the three cells was 10, then the second number in the tuple would be 10.

For fixed-length attributes, the first value is always 0. The second value is the total number of elements in the data buffer. For example, a read query on a single float attribute that reads three cells would return 3 for the second value. A read query on a float attribute with cell_val_num 2 that reads three cells would return 3 * 2 = 6 for the second value.

If the query has not been submitted, an empty map is returned.

Example:

// Submit a read query.
query.submit();
auto result_el = query.result_buffer_elements_nullable();

// For fixed-sized attributes, the second tuple element is the number of
// elements that were read for the attribute across all cells. Note: number
// of elements and not number of bytes.
auto num_a1_elements = std::get<1>(result_el["a1"]);

// Variable-sized attributes (e.g. std::string) need two buffers,
// one for offsets and one for cell data ("elements").
auto num_a2_offsets = std::get<0>(result_el["a2"]);
auto num_a2_elements = std::get<1>(result_el["a2"]);

// For both fixed-size and variable-sized attributes, the third tuple
// element is the number of elements in the validity bytemap.
auto num_a1_validity_values = std::get<2>(result_el["a1"]);
auto num_a2_validity_values = std::get<2>(result_el["a2"]);

inline uint64_t est_result_size(const std::string &attr_name) const

Retrieves the estimated result size for a fixed-size attribute. This is an estimate and may not be sufficient to read all results for the requested range, for sparse arrays or arrays with var-length attributes. The query status must be checked, and the query resubmitted if it is not complete.

Example:

uint64_t est_size = query.est_result_size("attr1");

Parameters:

attr_name – The attribute name.

Returns:

The estimated size in bytes.
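The estimate is in bytes, while set_data_buffer() counts elements, so the value has to be converted before a typed buffer is allocated. The following is a minimal sketch with the standard library only; allocate_for_estimate is a hypothetical helper name. It rounds up so that the buffer is never smaller than the estimate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Allocates a typed buffer large enough to hold `est_bytes` bytes.
// (Hypothetical helper, not part of TileDB.)
template <typename T>
std::vector<T> allocate_for_estimate(std::uint64_t est_bytes) {
  // Ceiling division: a partial trailing element still needs a full slot.
  const std::uint64_t n = (est_bytes + sizeof(T) - 1) / sizeof(T);
  assert(n * sizeof(T) >= est_bytes);  // the buffer covers the estimate
  return std::vector<T>(static_cast<std::size_t>(n));
}
```

Because the value is only an estimate, the query status still has to be checked after submission, as described above.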

inline std::array<uint64_t, 2> est_result_size_var(const std::string &attr_name) const

Retrieves the estimated result size for a variable-size attribute. This is an estimate and may not be sufficient to read all results for the requested ranges, for sparse arrays or any array with var-length attributes. Query status must be checked and resubmitted if not complete.

Example:

std::array<uint64_t, 2> est_size =
    query.est_result_size_var("attr1");

Parameters:

attr_name – The attribute name.

Returns:

An array with first element containing the estimated size of the result offsets in bytes, and second element containing the estimated size of the result values in bytes.

inline std::array<uint64_t, 2> est_result_size_nullable(const std::string &attr_name) const

Retrieves the estimated result size for a fixed-size, nullable attribute. This is an estimate and may not be sufficient to read all results for the requested ranges, for sparse arrays or any array with var-length attributes. Query status must be checked and resubmitted if not complete.

Example:

std::array<uint64_t, 2> est_size =
    query.est_result_size_nullable("attr1");

Parameters:

attr_name – The attribute name.

Returns:

An array with first element containing the estimated size of the result values in bytes, and second element containing the estimated size of the result validity values in bytes.

inline std::array<uint64_t, 3> est_result_size_var_nullable(const std::string &attr_name) const

Retrieves the estimated result size for a variable-size, nullable attribute.

Example:

std::array<uint64_t, 3> est_size =
    query.est_result_size_var_nullable("attr1");

Parameters:

attr_name – The attribute name.

Returns:

An array with first element containing the estimated size of the offset values in bytes, second element containing the estimated size of the result values in bytes, and the third element containing the estimated size of the validity values in bytes.
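The three byte sizes map onto the three buffers a var-sized nullable attribute needs. The following is a minimal sketch of allocating them with the standard library only. The struct VarNullableBuffers and the function allocate_buffers are hypothetical names, and the data buffer is assumed to hold a string attribute, so it is a plain byte buffer. The element types of the other two buffers follow the set_offsets_buffer() and set_validity_buffer() signatures (uint64_t and uint8_t).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One buffer per component of a var-sized, nullable attribute.
// (Hypothetical struct; `data` assumes a string attribute, i.e. raw bytes.)
struct VarNullableBuffers {
  std::vector<std::uint64_t> offsets;
  std::vector<char> data;
  std::vector<std::uint8_t> validity;
};

// `est` has the layout returned by est_result_size_var_nullable():
// {offsets bytes, data bytes, validity bytes}.
VarNullableBuffers allocate_buffers(const std::array<std::uint64_t, 3>& est) {
  VarNullableBuffers b;
  const std::uint64_t offset_size = sizeof(std::uint64_t);
  // Offsets are 8-byte values, so convert bytes to elements (rounding up).
  b.offsets.resize(
      static_cast<std::size_t>((est[0] + offset_size - 1) / offset_size));
  assert(b.offsets.size() * offset_size >= est[0]);
  b.data.resize(static_cast<std::size_t>(est[1]));
  b.validity.resize(static_cast<std::size_t>(est[2]));
  return b;
}
```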

inline uint32_t fragment_num() const

Returns the number of written fragments. Applicable only to WRITE queries.

inline std::string fragment_uri(uint32_t idx) const

Returns the URI of the written fragment with the input index. Applicable only to WRITE queries.

inline std::pair<uint64_t, uint64_t> fragment_timestamp_range(uint32_t idx) const

Returns the timestamp range of the written fragment with the input index. Applicable only to WRITE queries.

inline Query &set_subarray(const Subarray &subarray)

Prepare a query with the contents of a subarray.

Parameters:

subarray – The subarray to be used to prepare the query.

inline Query &set_config(const Config &config)

Set the query config.

Setting the query config will also set the subarray configuration in order to maintain existing behavior. If you wish the subarray to have a different configuration than the query, set it after calling Query::set_config.

Setting configuration with this function overrides the following Query-level parameters only:

  • sm.memory_budget

  • sm.memory_budget_var

  • sm.var_offsets.mode

  • sm.var_offsets.extra_element

  • sm.var_offsets.bitsize

  • sm.check_coord_dups

  • sm.check_coord_oob

  • sm.check_global_order

  • sm.dedup_coords

inline Config config() const

Gets the query config.

Returns:

Config

template<typename T>
inline Query &set_data_buffer(const std::string &name, T *buff, uint64_t nelements)

Sets the data for a fixed/var-sized attribute/dimension.

The caller owns the buffer provided and is responsible for freeing the memory associated with it. For writes, the buffer holds values to be written which can be freed at any time after query completion. For reads, the buffer is allocated by the caller and will contain data read by the query after completion. The freeing of this memory is up to the caller once they are done referencing the read data.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
int data_a1[] = {0, 1, 2, 3};
Query query(ctx, array);
query.set_data_buffer("a1", data_a1, 4);

Note

set_data_buffer(std::string, std::vector) is preferred as it is safer.

Template Parameters:

T – Attribute/Dimension value type

Parameters:
  • name – Attribute/Dimension name

  • buff – Buffer array pointer with elements of the attribute/dimension type.

  • nelements – Number of array elements

template<typename T>
inline Query &set_data_buffer(const std::string &name, std::vector<T> &buf)

Sets the data for a fixed/var-sized attribute/dimension.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
std::vector<int> data_a1 = {0, 1, 2, 3};
Query query(ctx, array);
query.set_data_buffer("a1", data_a1);

Template Parameters:

T – Attribute/Dimension value type

Parameters:
  • name – Attribute/Dimension name

  • buf – Buffer vector with elements of the attribute/dimension type.

inline Query &set_data_buffer(const std::string &name, void *buff, uint64_t nelements)

Sets the data for a fixed/var-sized attribute/dimension.

The caller owns the buffer provided and is responsible for freeing the memory associated with it. For writes, the buffer holds values to be written which can be freed at any time after query completion. For reads, the buffer is allocated by the caller and will contain data read by the query after completion. The freeing of this memory is up to the caller once they are done referencing the read data.

Note

This unsafe version does not perform type checking; the given buffer is assumed to be the correct type, and the size of an element in the given buffer is assumed to be the size of the datatype of the attribute.

Parameters:
  • name – Attribute/Dimension name

  • buff – Buffer array pointer with elements of the attribute type.

  • nelements – Number of array elements in buffer

inline Query &set_data_buffer(const std::string &name, std::string &data)

Sets the data for a fixed/var-sized attribute/dimension.

Parameters:
  • name – Attribute/Dimension name

  • data – Pre-allocated string buffer.

inline Query &set_offsets_buffer(const std::string &attr, uint64_t *offsets, uint64_t offset_nelements)

Sets the offset buffer for a var-sized attribute/dimension.

The caller owns the buffer provided and is responsible for freeing the memory associated with it. For writes, the buffer holds offsets to be written which can be freed at any time after query completion. For reads, the buffer is allocated by the caller and will contain offset data read by the query after completion. The freeing of this memory is up to the caller once they are done referencing the read data.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
uint64_t offsets_a1[] = {0, 8};
Query query(ctx, array);
query.set_offsets_buffer("a1", offsets_a1, 2);

Note

set_offsets_buffer(std::string, std::vector<uint64_t>) is preferred as it is safer.

Parameters:
  • attr – Attribute/Dimension name

  • offsets – Offsets array pointer where a new element begins in the data buffer.

  • offset_nelements – Number of elements in offsets buffer.

inline Query &set_offsets_buffer(const std::string &name, std::vector<uint64_t> &offsets)

Sets the offset buffer for a var-sized attribute/dimension.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
std::vector<uint64_t> offsets_a1 = {0, 8};
Query query(ctx, array);
query.set_offsets_buffer("a1", offsets_a1);

Parameters:
  • name – Attribute/Dimension name

  • offsets – Offsets where a new element begins in the data buffer.
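After a read, the offsets buffer tells where each var-sized cell begins in the data buffer. The following is a minimal sketch of splitting a string attribute into one std::string per cell, using only the standard library. The helper split_cells is a hypothetical name. The sketch assumes that each offset is the byte position of a cell's first character and that the offsets buffer holds no extra trailing offset; both depend on the sm.var_offsets.* settings, so they are assumptions rather than guarantees. The counts num_offsets and num_bytes are the pair reported by result_buffer_elements() for the attribute.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Splits the first `num_bytes` bytes of `data` into cells that start at
// offsets[0 .. num_offsets - 1]. A cell ends where the next cell begins; the
// last cell ends at `num_bytes`.
// (Hypothetical helper; assumes byte offsets without a trailing extra offset.)
std::vector<std::string> split_cells(const std::string& data,
                                     const std::vector<std::uint64_t>& offsets,
                                     std::uint64_t num_offsets,
                                     std::uint64_t num_bytes) {
  assert(num_offsets <= offsets.size() && num_bytes <= data.size());
  std::vector<std::string> cells;
  cells.reserve(num_offsets);
  for (std::uint64_t i = 0; i < num_offsets; ++i) {
    const std::uint64_t begin = offsets[i];
    const std::uint64_t end = (i + 1 < num_offsets) ? offsets[i + 1] : num_bytes;
    assert(begin <= end && end <= num_bytes);
    cells.emplace_back(data, begin, end - begin);  // substring [begin, end)
  }
  return cells;
}
```

The buffers are usually larger than the result, which is why the counts are passed separately instead of using data.size() and offsets.size().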

inline Query &set_validity_buffer(const std::string &attr, uint8_t *validity_bytemap, uint64_t validity_bytemap_nelements)

Sets the validity buffer for nullable attribute/dimension.

The caller owns the buffer provided and is responsible for freeing the memory associated with it. For writes, the buffer holds validity values to be written which can be freed at any time after query completion. For reads, the buffer is allocated by the caller and will contain the validity map read by the query after completion. The freeing of this memory is up to the caller once they are done referencing the read data.

Template Parameters:

T – Attribute value type

Parameters:
  • attr – Attribute name

  • validity_bytemap – The validity bytemap buffer.

  • validity_bytemap_nelements – The number of values within validity_bytemap

inline Query &set_validity_buffer(const std::string &name, std::vector<uint8_t> &validity_bytemap)

Sets the validity buffer for nullable attribute/dimension.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
std::vector<uint8_t> validity_bytemap = {1, 1, 0, 1};
Query query(ctx, array);
query.set_validity_buffer("a1", validity_bytemap);

Parameters:
  • name – Attribute name

  • validity_bytemap – Buffer vector with elements of the attribute validity values.
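The validity bytemap runs parallel to the data buffer, with one byte per cell. The following is a minimal sketch of combining the two after a read, using only the standard library (C++17 for std::optional). The helper apply_validity is a hypothetical name. The sketch assumes a fixed-size attribute with one value per cell, and it assumes that a non-zero byte marks a valid cell and 0 marks a null cell; this reference does not spell out that encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Pairs each value with its validity byte: valid cells keep their value,
// null cells become std::nullopt.
// (Hypothetical helper; assumes non-zero = valid, 0 = null.)
template <typename T>
std::vector<std::optional<T>> apply_validity(
    const std::vector<T>& values, const std::vector<std::uint8_t>& validity) {
  assert(values.size() == validity.size());
  std::vector<std::optional<T>> out;
  out.reserve(values.size());
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (validity[i] != 0) {
      out.emplace_back(values[i]);
    } else {
      out.emplace_back(std::nullopt);
    }
  }
  return out;
}
```

With the bytemap {1, 1, 0, 1} from the example above, only the third cell comes back as std::nullopt.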

inline Query &get_data_buffer(const std::string &name, void **data, uint64_t *data_nelements, uint64_t *element_size)

Retrieves the data buffer of a fixed/var-sized attribute/dimension.

Parameters:
  • name – Attribute/dimension name

  • data – Buffer array pointer with elements of the attribute type.

  • data_nelements – Number of array elements.

  • element_size – Size of array elements (in bytes).

inline Query &get_offsets_buffer(const std::string &name, uint64_t **offsets, uint64_t *offsets_nelements)

Retrieves the offset buffer for a var-sized attribute/dimension.

Parameters:
  • name – Attribute/dimension name

  • offsets – Offsets array pointer with elements of uint64_t type.

  • offsets_nelements – Number of array elements.

inline Query &get_validity_buffer(const std::string &name, uint8_t **validity_bytemap, uint64_t *validity_bytemap_nelements)

Retrieves the validity buffer for a nullable attribute/dimension.

Parameters:
  • name – Attribute name

  • validity_bytemap – Buffer array pointer with elements of the attribute validity values.

  • validity_bytemap_nelements – Number of validity bytemap elements.

inline std::string stats()

Returns a JSON-formatted string of the stats.

inline Query &update_subarray_from_query(Subarray *subarray)

Updates the given subarray with the subarray data currently held by this query.

Parameters:

subarray – The output subarray to receive this query’s subarray data.

Public Static Functions

static inline Status to_status(const tiledb_query_status_t &status)

Converts the TileDB C query status to a C++ query status.

static inline std::string to_str(tiledb_query_type_t type)

Converts the TileDB C query type to a string representation.

QueryCondition

class QueryCondition

Public Functions

inline QueryCondition(const Context &ctx)

Creates a TileDB query condition object.

Parameters:

ctx – TileDB context.

QueryCondition(const QueryCondition&) = default

Copy constructor.

QueryCondition(QueryCondition&&) = default

Move constructor.

~QueryCondition() = default

Destructor.

inline QueryCondition(const Context &ctx, tiledb_query_condition_t *const qc)

Constructs an instance directly from a C-API query condition object.

Parameters:
  • ctx – The TileDB context.

  • qc – The C-API query condition object.

QueryCondition &operator=(const QueryCondition&) = default

Copy-assignment operator.

QueryCondition &operator=(QueryCondition&&) = default

Move-assignment operator.

inline void init(const std::string &attribute_name, const void *condition_value, uint64_t condition_value_size, tiledb_query_condition_op_t op)

Initialize a TileDB query condition object.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_READ);
tiledb::Query query(ctx, array, TILEDB_READ);

int cmp_value = 5;
tiledb::QueryCondition qc(ctx);
qc.init("a1", &cmp_value, sizeof(int), TILEDB_LT);
query.set_condition(qc);

Parameters:
  • attribute_name – The name of the attribute to compare against.

  • condition_value – The fixed value to compare against.

  • condition_value_size – The byte size of condition_value.

  • op – The comparison operation between each cell value and condition_value.

inline void init(const std::string &attribute_name, const std::string &condition_value, tiledb_query_condition_op_t op)

Initializes a TileDB query condition object.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_READ);
tiledb::Query query(ctx, array, TILEDB_READ);

std::string cmp_value = "abc";
tiledb::QueryCondition qc(ctx);
qc.init("a1", cmp_value, TILEDB_LT);
query.set_condition(qc);

Parameters:
  • attribute_name – The name of the attribute to compare against.

  • condition_value – The fixed value to compare against.

  • op – The comparison operation between each cell value and condition_value.

inline std::shared_ptr<tiledb_query_condition_t> ptr() const

Returns a shared pointer to the C TileDB query condition object.

inline QueryCondition combine(const QueryCondition &rhs, tiledb_query_condition_combination_op_t combination_op) const

Combines this instance with another instance to form a multi-clause condition object.

Example:

// Assumes an existing tiledb::Context ctx and tiledb::Query query.
int qc1_cmp_value = 10;
tiledb::QueryCondition qc1(ctx);
qc1.init("a1", &qc1_cmp_value, sizeof(int), TILEDB_LT);
int qc2_cmp_value = 3;
tiledb::QueryCondition qc2(ctx);
qc2.init("a1", &qc2_cmp_value, sizeof(int), TILEDB_GE);

tiledb::QueryCondition qc3 = qc1.combine(qc2, TILEDB_AND);
query.set_condition(qc3);

Parameters:
  • rhs – The right-hand-side query condition object.

  • combination_op – The logical combination operator that combines this instance with rhs.
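To make the meaning of a multi-clause condition concrete, here is a library-free sketch of two predicates joined by a logical operator and applied to a list of cell values. Condition, CombineOp, combine and filter_cells are hypothetical names introduced only for this illustration; they do not mirror how TileDB evaluates conditions internally.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-ins for illustration only.
enum class CombineOp { And, Or };
using Condition = std::function<bool(int)>;

// Joins two predicates with a logical AND or OR.
Condition combine(Condition lhs, Condition rhs, CombineOp op) {
  return [lhs, rhs, op](int v) {
    return op == CombineOp::And ? (lhs(v) && rhs(v)) : (lhs(v) || rhs(v));
  };
}

// Keeps only the cell values that satisfy the condition.
std::vector<int> filter_cells(const std::vector<int>& cells,
                              const Condition& cond) {
  std::vector<int> out;
  for (int v : cells) {
    if (cond(v)) out.push_back(v);
  }
  return out;
}
```

With the clauses v < 10 and v >= 3 combined by And, only values in [3, 10) pass, which is the selection the qc3 example above expresses for attribute a1.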

inline QueryCondition negate() const

Return a query condition representing a negation of this query condition. Currently this is performed by applying De Morgan’s theorem recursively to the query condition’s internal representation.

Example:

// Assumes an existing tiledb::Context ctx and tiledb::Query query.
int qc1_cmp_value = 10;
tiledb::QueryCondition qc1(ctx);
qc1.init("a1", &qc1_cmp_value, sizeof(int), TILEDB_LT);
tiledb::QueryCondition qc2 = qc1.negate();
query.set_condition(qc2);
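The recursive use of De Morgan's theorem mentioned above can be sketched on a toy condition tree. The Cond type and the leaf, node, eval and negate helpers below are hypothetical and deliberately minimal (two comparison kinds, two combinators); TileDB's internal representation is not shown in this reference and is certainly richer.

```cpp
#include <memory>

// Hypothetical condition tree; not TileDB's internal representation.
struct Cond;
using CondPtr = std::shared_ptr<Cond>;

struct Cond {
  enum class Kind { Lt, Ge, And, Or };
  Kind kind;
  int value;         // threshold, used by Lt and Ge leaves
  CondPtr lhs, rhs;  // children, used by And and Or nodes
};

CondPtr leaf(Cond::Kind kind, int value) {
  return std::make_shared<Cond>(Cond{kind, value, nullptr, nullptr});
}

CondPtr node(Cond::Kind kind, CondPtr lhs, CondPtr rhs) {
  return std::make_shared<Cond>(Cond{kind, 0, lhs, rhs});
}

// Evaluates the condition tree against a single cell value.
bool eval(const CondPtr& c, int cell) {
  switch (c->kind) {
    case Cond::Kind::Lt:  return cell < c->value;
    case Cond::Kind::Ge:  return cell >= c->value;
    case Cond::Kind::And: return eval(c->lhs, cell) && eval(c->rhs, cell);
    case Cond::Kind::Or:  return eval(c->lhs, cell) || eval(c->rhs, cell);
  }
  return false;
}

// De Morgan: !(A && B) == !A || !B and !(A || B) == !A && !B.
// Leaves are negated by flipping the comparison.
CondPtr negate(const CondPtr& c) {
  switch (c->kind) {
    case Cond::Kind::Lt:  return leaf(Cond::Kind::Ge, c->value);
    case Cond::Kind::Ge:  return leaf(Cond::Kind::Lt, c->value);
    case Cond::Kind::And:
      return node(Cond::Kind::Or, negate(c->lhs), negate(c->rhs));
    case Cond::Kind::Or:
      return node(Cond::Kind::And, negate(c->lhs), negate(c->rhs));
  }
  return nullptr;
}
```

Negating (a < 10 AND a >= 3) in this sketch yields (a >= 10 OR a < 3); no explicit NOT node is needed because the negation is pushed down to the leaves.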

Public Static Functions

static inline QueryCondition create(const Context &ctx, const std::string &attribute_name, const std::string &value, tiledb_query_condition_op_t op)

Factory function for creating a new query condition with a string datatype.

Example:

tiledb::Context ctx;
auto a1 = tiledb::QueryCondition::create(ctx, "a1", "foo", TILEDB_LE);

Parameters:
  • ctx – The TileDB context.

  • attribute_name – The attribute name.

  • value – The value to compare against.

  • op – The comparison operator.

Returns:

A new QueryCondition object.

template<typename T>
static inline QueryCondition create(const Context &ctx, const std::string &attribute_name, T value, tiledb_query_condition_op_t op)

Factory function for creating a new query condition with datatype T.

Example:

tiledb::Context ctx;
auto a1 = tiledb::QueryCondition::create<int>(ctx, "a1", 5, TILEDB_LE);
auto a2 = tiledb::QueryCondition::create<float>(ctx, "a3", 3.5,
  TILEDB_GT);
auto a3 = tiledb::QueryCondition::create<double>(ctx,
  "a4", 10.0, TILEDB_LT);

Template Parameters:

T – Datatype of the attribute. Can either be arithmetic type or string.

Parameters:
  • ctx – The TileDB context.

  • attribute_name – The attribute name.

  • value – The value to compare against.

  • op – The comparison operator.

Returns:

A new QueryCondition object.

Subarray

class Subarray

Construct and support manipulation of a possibly multiple-range subarray for optional use with Query object operations.

See examples for more usage details.

Example:

// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_dense_array", TILEDB_WRITE);
tiledb::Query query(ctx, array);
std::vector<int> a1_data = {1, 2, 3};
query.set_data_buffer("a1", a1_data);
// The layout is set on the query; Subarray has no set_layout function.
query.set_layout(TILEDB_GLOBAL_ORDER);
tiledb::Subarray subarray(ctx, array);
// Inclusive range [1, 3] covers three cells, matching the three values above.
std::vector<int32_t> subarray_indices = {1, 3};
subarray.add_range(0, subarray_indices[0], subarray_indices[1]);
query.set_subarray(subarray);
query.submit();
query.finalize();
array.close();

Public Functions

inline Subarray(const tiledb::Context &ctx, const tiledb::Array &array, bool coalesce_ranges = true)

Creates a TileDB Subarray object.

Example:

// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_WRITE);
tiledb::Subarray subarray(ctx, array);

Parameters:
  • ctx – TileDB context

  • array – Open Array object

  • coalesce_ranges – When enabled, ranges will attempt to coalesce with existing ranges as they are added.
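One way to picture what coalescing means is the sketch below, which keeps a list of inclusive integer ranges and merges a new range into the previous one when the two are directly adjacent. The adjacency rule (new start equals previous end plus one) is an assumption made for this illustration, and Range and add_range here are hypothetical free-standing names, not the Subarray API.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using Range = std::pair<int64_t, int64_t>;  // inclusive [start, end]

// Hypothetical sketch: appends a range, merging it into the last stored
// range when coalescing is enabled and the two ranges are adjacent.
void add_range(std::vector<Range>& ranges, int64_t start, int64_t end,
               bool coalesce) {
  if (coalesce && !ranges.empty() && ranges.back().second + 1 == start) {
    ranges.back().second = end;  // extend the existing range
    return;
  }
  ranges.emplace_back(start, end);
}
```

Under this rule, adding [1, 3] and then [4, 6] with coalescing enabled leaves a single range [1, 6], while disabling it keeps two separate ranges.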

inline Subarray &set_coalesce_ranges(bool coalesce_ranges)

Set the coalesce_ranges flag for the subarray.

inline Subarray &replace_subarray_data(tiledb_subarray_t *capi_subarray)

Replaces the C API subarray that this Subarray's internal shared_ptr references with the passed subarray.

Parameters:

capi_subarray – The C API subarray to be referenced by this C++ API Subarray.

template<class T>
inline Subarray &add_range(uint32_t dim_idx, T start, T end, T stride = 0)

Adds a 1D range along a subarray dimension index, in the form (start, end, stride). The datatype of the range must be the same as the dimension datatype.

Example:

// Set a 1D range on dimension 0, assuming the domain type is int64.
int64_t start = 10;
int64_t end = 20;
// Stride is optional
subarray.add_range(0, start, end);

Template Parameters:

T – The dimension datatype.

Parameters:
  • dim_idx – The index of the dimension to add the range to.

  • start – The range start to add.

  • end – The range end to add.

  • stride – The range stride to add.

Returns:

Reference to this Subarray.

template<class T>
inline Subarray &add_range(const std::string &dim_name, T start, T end, T stride = 0)

Adds a 1D range along a subarray dimension, specified by its name, in the form (start, end, stride). The datatype of the range must be the same as the dimension datatype.

Example:

// Set a 1D range on dimension "rows", assuming the domain type is int64.
int64_t start = 10;
int64_t end = 20;
const std::string dim_name = "rows";
// Stride is optional
subarray.add_range(dim_name, start, end);

Template Parameters:

T – The dimension datatype.

Parameters:
  • dim_name – The name of the dimension to add the range to.

  • start – The range start to add.

  • end – The range end to add.

  • stride – The range stride to add.

Returns:

Reference to this Subarray.

inline Subarray &add_range(uint32_t dim_idx, const std::string &start, const std::string &end)

Adds a 1D string range along a subarray dimension index, in the form (start, end). Applicable only to variable-sized dimensions.

Example:

// Set a 1D range on dimension 0, assuming a variable-sized string dimension.
std::string start = "a";
std::string end = "c";
subarray.add_range(0, start, end);

Parameters:
  • dim_idx – The index of the dimension to add the range to.

  • start – The range start to add.

  • end – The range end to add.

Returns:

Reference to this Subarray.

inline Subarray &add_range(const std::string &dim_name, const std::string &start, const std::string &end)

Adds a 1D string range along a subarray dimension name, in the form (start, end). Applicable only to variable-sized dimensions.

Example:

// Set a 1D range on dimension "rows", assuming a variable-sized string dimension.
std::string start = "a";
std::string end = "c";
const std::string dim_name = "rows";
subarray.add_range(dim_name, start, end);

Parameters:
  • dim_name – The name of the dimension to add the range to.

  • start – The range start to add.

  • end – The range end to add.

Returns:

Reference to this Subarray.

template<typename T = uint64_t>
inline Subarray &set_subarray(const T *pairs, uint64_t size)

Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive. For the case of writes, this is meaningful only for dense arrays, and specifically dense writes.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_READ);
int subarray_vals[] = {0, 3, 0, 3};
Subarray subarray(ctx, array);
subarray.set_subarray(subarray_vals, 4);

Note

set_subarray(std::vector<T>) is preferred as it is safer.

Note

The number of [start, stop] pairs passed should equal the number of dimensions of the array associated with the subarray; that is, size (the number of elements in subarray_vals) should equal twice the number of dimensions.

Template Parameters:

T – Type of array domain.

Parameters:
  • pairs – Subarray pointer defined as an array of [start, stop] values per dimension.

  • size – The number of subarray elements.
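The flat layout expected by this overload (one inclusive [start, stop] pair per dimension, so size is twice the number of dimensions) can be sketched as follows. unpack_pairs and cell_count are hypothetical helpers written for this illustration only; they regroup a flat buffer into per-dimension pairs and count the cells the resulting box covers.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: regroups a flat {start0, stop0, start1, stop1, ...}
// buffer into one [start, stop] pair per dimension.
template <typename T>
std::vector<std::array<T, 2>> unpack_pairs(const T* pairs, uint64_t size) {
  assert(size % 2 == 0);  // every dimension contributes exactly two values
  std::vector<std::array<T, 2>> out;
  for (uint64_t i = 0; i < size; i += 2) {
    out.push_back(std::array<T, 2>{{pairs[i], pairs[i + 1]}});
  }
  return out;
}

// Hypothetical helper: number of cells in the box, with inclusive bounds.
template <typename T>
uint64_t cell_count(const std::vector<std::array<T, 2>>& dims) {
  uint64_t n = 1;
  for (const auto& d : dims) {
    n *= static_cast<uint64_t>(d[1] - d[0] + 1);
  }
  return n;
}
```

For the buffer {0, 3, 0, 3} from the example above, this gives two dimensions, each spanning [0, 3], and therefore 16 cells, because both bounds are inclusive.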

inline Subarray &set_config(const Config &config)

Set the subarray config.

Setting configuration with this function overrides the following Subarray-level parameters only:

  • sm.read_range_oob

template<typename Vec>
inline Subarray &set_subarray(const Vec &pairs)

Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive. For the case of writes, this is meaningful only for dense arrays, and specifically dense writes.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_READ);
std::vector<int> subarray_vals = {0, 3, 0, 3};
Subarray subarray(ctx, array);
subarray.set_subarray(subarray_vals);

Template Parameters:

Vec – Vector datatype. Should always be a vector of the domain type.

Parameters:

pairs – The subarray defined as a vector of [start, stop] coordinates per dimension.

template<typename T = uint64_t>
inline Subarray &set_subarray(const std::initializer_list<T> &l)

Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive. For the case of writes, this is meaningful only for dense arrays, and specifically dense writes.

Example:

tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_READ);
Subarray subarray(ctx, array);
subarray.set_subarray({0, 3, 0, 3});

Template Parameters:

T – Type of array domain.

Parameters:

l – List of [start, stop] coordinates per dimension.

template<typename T = uint64_t>
inline Subarray &set_subarray(const std::vector<std::array<T, 2>> &pairs)

Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive.

Note

set_subarray(std::vector) is preferred and avoids an extra copy.

Template Parameters:

T – Type of array domain.

Parameters:

pairs – The subarray defined as pairs of [start, stop] per dimension.

inline uint64_t range_num(unsigned dim_idx) const

Retrieves the number of ranges for a given dimension index.

Example:

unsigned dim_idx = 0;
uint64_t range_num = subarray.range_num(dim_idx);

Parameters:

dim_idx – The dimension index.

Returns:

The number of ranges.

inline uint64_t range_num(const std::string &dim_name) const

Retrieves the number of ranges for a given dimension name.

Example:

std::string dim_name = "rows";
uint64_t range_num = subarray.range_num(dim_name);

Parameters:

dim_name – The dimension name.

Returns:

The number of ranges.

template<class T>
inline std::array<T, 3> range(unsigned dim_idx, uint64_t range_idx)

Retrieves a range for a given dimension index and range id. The template datatype must be the same as that of the underlying array.

Example:

unsigned dim_idx = 0;
unsigned range_idx = 0;
auto range = subarray.range<int32_t>(dim_idx, range_idx);

Template Parameters:

T – The dimension datatype.

Parameters:
  • dim_idx – The dimension index.

  • range_idx – The range index.

Returns:

A triplet of the form (start, end, stride).

template<class T>
inline std::array<T, 3> range(const std::string &dim_name, uint64_t range_idx)

Retrieves a range for a given dimension name and range id. The template datatype must be the same as that of the underlying array.

Example:

std::string dim_name = "rows";
unsigned range_idx = 0;
auto range = subarray.range<int32_t>(dim_name, range_idx);

Template Parameters:

T – The dimension datatype.

Parameters:
  • dim_name – The dimension name.

  • range_idx – The range index.

Returns:

A triplet of the form (start, end, stride).

inline std::array<std::string, 2> range(unsigned dim_idx, uint64_t range_idx)

Retrieves a range for a given variable length string dimension index and range id.

Example:

unsigned dim_idx = 0;
unsigned range_idx = 0;
std::array<std::string, 2> range = subarray.range(dim_idx, range_idx);

Parameters:
  • dim_idx – The dimension index.

  • range_idx – The range index.

Returns:

A pair of the form (start, end).

inline std::array<std::string, 2> range(const std::string &dim_name, uint64_t range_idx)

Retrieves a range for a given variable length string dimension name and range id.

Example:

std::string dim_name = "rows";
unsigned range_idx = 0;
std::array<std::string, 2> range = subarray.range(dim_name, range_idx);

Parameters:
  • dim_name – The dimension name.

  • range_idx – The range index.

Returns:

A pair of the form (start, end).

inline std::shared_ptr<tiledb_subarray_t> ptr() const

Returns the C TileDB subarray object.

inline const Array &array() const

Returns the array the subarray is associated with.

Filter

class Filter

Represents a filter. A filter is used to transform attribute data e.g. with compression, delta encoding, etc.

Example:

tiledb::Context ctx;
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
int level = 5;
f.set_option(TILEDB_COMPRESSION_LEVEL, &level);

Public Functions

inline Filter(const Context &ctx, tiledb_filter_type_t filter_type)

Creates a Filter of the given type.

Example:

tiledb::Context ctx;
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);

Parameters:
  • ctx – TileDB context

  • filter_type – Enumerated type of filter

inline Filter(const Context &ctx, tiledb_filter_t *filter)

Creates a Filter with the input C object.

Parameters:
  • ctx – TileDB context

  • filter – C API filter object

inline std::shared_ptr<tiledb_filter_t> ptr() const

Returns a shared pointer to the C TileDB filter object.

template<typename T, typename std::enable_if_t<!std::is_pointer_v<T>, int> = 0>
inline Filter &set_option(tiledb_filter_option_t option, T value)

Sets an option on the filter. Options are filter dependent; this function throws an error if the given option is not valid for the given filter.

Example:

tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
f.set_option(TILEDB_COMPRESSION_LEVEL, 5);

Template Parameters:

T – Type of value of option to set.

Parameters:
  • option – Enumerated option to set.

  • value – Value of option to set.

Throws:
  • TileDBError – if the option cannot be set on the filter.

  • std::invalid_argument – if the option value is the wrong type.

Returns:

Reference to this Filter

inline Filter &set_option(tiledb_filter_option_t option, const void *value)

Sets an option on the filter. Options are filter dependent; this function throws an error if the given option is not valid for the given filter.

This version of set_option performs no type checks.

Example:

tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
int level = 5;
f.set_option(TILEDB_COMPRESSION_LEVEL, &level);

Note

set_option<T>(option, T value) is preferred as it is safer.

Parameters:
  • option – Enumerated option to set.

  • value – Value of option to set.

Throws:

TileDBError – if the option cannot be set on the filter.

Returns:

Reference to this Filter

template<typename T>
inline T get_option(tiledb_filter_option_t option)

Gets an option value from the filter.

Example:

tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
// T cannot be deduced from the arguments, so it is given explicitly.
int32_t level = f.get_option<int32_t>(TILEDB_COMPRESSION_LEVEL);
// level == -1 (the default compression level)

Template Parameters:

T – Type of option value to get.

Parameters:

option – Enumerated option to get.

Throws:
  • TileDBError – if the option cannot be retrieved from the filter.

  • std::invalid_argument – if the option value is the wrong type.

Returns:

The value of the requested option.

template<typename T, typename std::enable_if<std::is_arithmetic_v<T>>::type* = nullptr>
inline void get_option(tiledb_filter_option_t option, T *value)

Gets an option value from the filter.

Example:

tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
int32_t level;
f.get_option(TILEDB_COMPRESSION_LEVEL, &level);
// level == -1 (the default compression level)

Template Parameters:

T – Type of option value to get.

Parameters:
  • option – Enumerated option to get.

  • value – Buffer that option value will be written to.

Throws:
  • TileDBError – if the option cannot be retrieved from the filter.

  • std::invalid_argument – if the option value is the wrong type.

inline void get_option(tiledb_filter_option_t option, void *value)

Gets an option value from the filter.

This version of get_option performs no type checks.

Example:

tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
int32_t level;
f.get_option(TILEDB_COMPRESSION_LEVEL, &level);
// level == -1 (the default compression level)

Note

The buffer pointed to by value must be large enough to hold the option value.

Note

T value = get_option<T>(option) is preferred as it is safer.

Parameters:
  • option – Enumerated option to get.

  • value – Buffer that option value will be written to.

Throws:

TileDBError – if the option cannot be retrieved from the filter.

inline tiledb_filter_type_t filter_type() const

Gets the filter type of this filter.

Public Static Functions

static inline std::string to_str(tiledb_filter_type_t type)

Returns the input type in string format.

Filter List

class FilterList

Represents an ordered list of Filters used to transform attribute data.

Example:

tiledb::Context ctx;
tiledb::FilterList filter_list(ctx);
filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE})
    .add_filter({ctx, TILEDB_FILTER_BZIP2});

Public Functions

inline FilterList(const Context &ctx)

Construct a FilterList.

Example:

tiledb::Context ctx;
tiledb::FilterList filter_list(ctx);

Parameters:

ctx – TileDB context

inline FilterList(const Context &ctx, tiledb_filter_list_t *filter_list)

Creates a FilterList with the input C object.

Parameters:
  • ctx – TileDB context

  • filter_list – C API filter list object

inline std::shared_ptr<tiledb_filter_list_t> ptr() const

Returns a shared pointer to the C TileDB filter list object.

inline FilterList &add_filter(const Filter &filter)

Appends a filter to a filter list. Data is processed through each filter in the order the filters were added.

Example:

tiledb::FilterList filter_list(ctx);
filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE})
    .add_filter({ctx, TILEDB_FILTER_BZIP2});

Parameters:

filter – The filter to add

Returns:

Reference to this FilterList
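Because data passes through the filters in the order they were added, the order itself is part of the configuration. The sketch below shows this with a toy pipeline of byte transforms; Pipeline, add_one and times_two are hypothetical stand-ins with no relation to the real TileDB filters.

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Bytes = std::vector<uint8_t>;
using Stage = std::function<Bytes(const Bytes&)>;

// Hypothetical ordered pipeline: stages run in the order they were added.
struct Pipeline {
  std::vector<Stage> stages;

  Pipeline& add(Stage stage) {
    stages.push_back(std::move(stage));
    return *this;  // returning *this allows chained add() calls
  }

  Bytes run(Bytes data) const {
    for (const auto& stage : stages) data = stage(data);
    return data;
  }
};

// Toy transforms, chosen only so that the two orders give different results.
Bytes add_one(const Bytes& in) {
  Bytes out;
  for (uint8_t b : in) out.push_back(static_cast<uint8_t>(b + 1));
  return out;
}

Bytes times_two(const Bytes& in) {
  Bytes out;
  for (uint8_t b : in) out.push_back(static_cast<uint8_t>(b * 2));
  return out;
}
```

Running {1, 2} through add_one then times_two gives {4, 6}, while the reverse order gives {3, 5}; the chained add() calls mirror the chained add_filter() calls in the example above.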

inline Filter filter(uint32_t filter_index) const

Returns a copy of the Filter in this list at the given index.

Example:

tiledb::FilterList filter_list(ctx);
filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE})
    .add_filter({ctx, TILEDB_FILTER_BZIP2});
auto f = filter_list.filter(1);
// f.filter_type() == TILEDB_FILTER_BZIP2

Parameters:

filter_index – Index of filter to get

Throws:

TileDBError – if the index is out of range

Returns:

Filter

inline uint32_t max_chunk_size() const

Gets the maximum tile chunk size for the filter list.

Returns:

Maximum tile chunk size

inline uint32_t nfilters() const

Returns the number of filters in this filter list.

Example:

tiledb::FilterList filter_list(ctx);
filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE})
    .add_filter({ctx, TILEDB_FILTER_BZIP2});
uint32_t n = filter_list.nfilters();  // n == 2

Returns:

The number of filters in this filter list.

inline FilterList &set_max_chunk_size(uint32_t max_chunk_size)

Sets the maximum tile chunk size for the filter list.

Parameters:

max_chunk_size – Maximum tile chunk size to set

Returns:

Reference to this FilterList

Group

inline void tiledb::create_group(const Context &ctx, const std::string &group)

Creates a new group. A Group is a logical grouping of Objects on the storage system (a directory).

Parameters:
  • ctx – The TileDB context.

  • group – The group URI.

Returns:

void

Object Management

class Object

Represents a TileDB object: array, group, or none (invalid).

Public Types

enum class Type

The object type.

Values:

enumerator Array

TileDB array object.

enumerator Group

TileDB group object.

enumerator Invalid

Invalid or unknown object type.

Public Functions

inline std::string to_str() const

Returns a string representation of the object, including its type and URI.

inline Type type() const

Returns the object type.

inline std::string uri() const

Returns the object URI.

inline std::optional<std::string> name() const

Returns the object's name, if it has one.

inline bool operator==(const Object &rhs) const

Compares objects for equality.

inline bool operator!=(const Object &rhs) const

Compares objects for inequality.

Public Static Functions

static inline Object object(const Context &ctx, const std::string &uri)

Gets an Object object that encapsulates the object type of the given path.

Parameters:
  • ctx – The TileDB context

  • uri – The path to the object.

Returns:

An object that contains the type along with the URI.

static inline void remove(const Context &ctx, const std::string &uri)

Deletes a TileDB object at the given URI from disk/persistent storage.

Parameters:
  • ctx – The TileDB context

  • uri – The path to the object to be removed.

static inline void move(const Context &ctx, const std::string &old_uri, const std::string &new_uri)

Moves/renames a TileDB object.

Parameters:
  • ctx – The TileDB context

  • old_uri – The path to the old object.

  • new_uri – The path to the new object.

class ObjectIter

Enables listing TileDB objects in a directory or recursively walking an entire directory tree.

Example:

// List the TileDB objects in an S3 bucket.
tiledb::Context ctx;
tiledb::ObjectIter obj_it(ctx, "s3://bucket-name");
for (auto it = obj_it.begin(), ite = obj_it.end(); it != ite; ++it) {
  const tiledb::Object &obj = *it;
  std::cout << obj << std::endl;
}

Public Functions

inline explicit ObjectIter(Context &ctx, const std::string &root = ".")

Creates an object iterator. Unless set_recursive is invoked, this iterator will iterate only over the children of root. It will also retrieve only TileDB-related objects.

Example:

// List the TileDB objects in an S3 bucket.
tiledb::Context ctx;
tiledb::ObjectIter obj_it(ctx, "s3://bucket-name");
for (auto it = obj_it.begin(), ite = obj_it.end(); it != ite; ++it) {
  const tiledb::Object &obj = *it;
  std::cout << obj << std::endl;
}

Parameters:
  • ctx – The TileDB context.

  • root – The root directory where the iteration will begin.

inline void set_iter_policy(bool group, bool array)

Determines whether group and array objects will be iterated on during the walk. The default (if the function is not invoked) is true for all objects.

Parameters:
  • group – If true, groups will be considered.

  • array – If true, arrays will be considered.

inline void set_recursive(tiledb_walk_order_t walk_order = TILEDB_PREORDER)

Specifies that the iteration will be over all the directories in the tree rooted at root_.

Parameters:

walk_order – The walk order.
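The effect of a walk order can be sketched on a toy directory tree. Node, WalkOrder and walk below are hypothetical names used only for this illustration; the sketch contrasts pre-order (a directory is reported before its children) with post-order (after its children) and says nothing about how TileDB traverses storage.

```cpp
#include <string>
#include <vector>

// Hypothetical toy directory tree.
struct Node {
  std::string name;
  std::vector<Node> children;
};

enum class WalkOrder { Preorder, Postorder };

// Appends node names to `out` in the requested order.
void walk(const Node& node, WalkOrder order, std::vector<std::string>& out) {
  if (order == WalkOrder::Preorder) out.push_back(node.name);
  for (const auto& child : node.children) walk(child, order, out);
  if (order == WalkOrder::Postorder) out.push_back(node.name);
}
```

For a root that contains a group g1 (holding an array a1) and an array a2, pre-order visits root, g1, a1, a2, whereas post-order visits a1, g1, a2, root.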

inline void set_non_recursive()

Disables recursive traversal.

inline iterator begin()

Returns an object iterator at the beginning of its iteration.

inline iterator end() const

Returns an object iterator at the end of its iteration.

Public Static Functions

static inline int obj_getter(const char *path, tiledb_object_t type, void *data)

Callback function to be used when invoking the C TileDB functions for walking through the TileDB objects in the root_ directory. The function retrieves the visited object and stores it in the object vector obj_vec.

Parameters:
  • path – The path of a visited TileDB object

  • type – The type of the visited TileDB object.

  • data – To be cast to the vector where the visited object will be stored.

Returns:

If 1 then the walk should continue to the next object.

class iterator

The actual iterator implementation in this class.

struct ObjGetterData

Carries data to be passed to obj_getter.

VFS

class VFS

Implements a virtual filesystem that enables performing directory/file operations with a unified API on different filesystems, such as local posix/windows, HDFS, AWS S3, etc.

Public Types

using filebuf = impl::VFSFilebuf

Stream buffer for TileDB VFS.

This is unbuffered; each read/write is directly dispatched to TileDB. As such it is recommended to issue fewer, larger, operations.

Example (write to file):

// Create the file buffer.
tiledb::Context ctx;
tiledb::VFS vfs(ctx);
tiledb::VFS::filebuf buff(vfs);

// Create new file, truncating it if it exists.
buff.open("file.txt", std::ios::out);
std::ostream os(&buff);
if (!os.good()) throw std::runtime_error("Error opening file");

std::string str = "This will be written to the file.";

os.write(str.data(), str.size());
// Alternatively:
// os << str;
os.flush();
buff.close();

Example (read from file):

// Create the file buffer.
tiledb::Context ctx;
tiledb::VFS vfs(ctx);
tiledb::VFS::filebuf buff(vfs);
std::string file_uri = "s3://bucket-name/file.txt";

buff.open(file_uri, std::ios::in);
std::istream is(&buff);
if (!is.good()) throw std::runtime_error("Error opening file");

// Read all contents from the file through the stream
std::string contents;
auto nbytes = vfs.file_size(file_uri);
contents.resize(nbytes);
is.read((char*)contents.data(), nbytes);

buff.close();
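Since the stream buffer is unbuffered and every read or write is dispatched directly, batching output before writing reduces the number of operations. The sketch below demonstrates the idea with a hypothetical CountingBuf that stands in for the VFS stream buffer and simply counts how many write operations reach it; write_batched and write_line_by_line are likewise illustrative names.

```cpp
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical unbuffered sink that counts the write operations it receives.
class CountingBuf : public std::streambuf {
 public:
  std::size_t writes = 0;
  std::string contents;

 protected:
  // Called for multi-character writes such as std::ostream::write.
  std::streamsize xsputn(const char* s, std::streamsize n) override {
    ++writes;
    contents.append(s, static_cast<std::size_t>(n));
    return n;
  }

  // Called for single-character writes, since there is no put area.
  int_type overflow(int_type ch) override {
    if (!traits_type::eq_int_type(ch, traits_type::eof())) {
      ++writes;
      contents.push_back(traits_type::to_char_type(ch));
    }
    return traits_type::not_eof(ch);
  }
};

// Concatenates all lines first, then issues a single large write.
void write_batched(std::ostream& os, const std::vector<std::string>& lines) {
  std::string batch;
  for (const auto& line : lines) {
    batch += line;
    batch += '\n';
  }
  os.write(batch.data(), static_cast<std::streamsize>(batch.size()));
}

// Issues one write per line plus one per newline character.
void write_line_by_line(std::ostream& os,
                        const std::vector<std::string>& lines) {
  for (const auto& line : lines) {
    os.write(line.data(), static_cast<std::streamsize>(line.size()));
    os.put('\n');
  }
}
```

Both functions produce identical output, but the batched version reaches the sink with a single operation, which is the usage pattern the recommendation above points toward.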

Public Functions

inline explicit VFS(const Context &ctx)

Constructor.

Parameters:

ctx – A TileDB context.

inline VFS(const Context &ctx, const Config &config)

Constructor.

Parameters:
  • ctx – TileDB context.

  • config – TileDB config.

inline void create_bucket(const std::string &uri) const

Creates an object store bucket with the input URI.

inline void remove_bucket(const std::string &uri) const

Deletes an object store bucket with the input URI.

inline bool is_bucket(const std::string &uri) const

Checks if an object store bucket with the input URI exists.

inline void empty_bucket(const std::string &bucket) const

Empties an object store bucket.

inline bool is_empty_bucket(const std::string &bucket) const

Checks if an object store bucket is empty.

inline void create_dir(const std::string &uri) const

Creates a directory with the input URI.

inline bool is_dir(const std::string &uri) const

Checks if a directory with the input URI exists.

inline void remove_dir(const std::string &uri) const

Removes a directory (recursively) with the input URI.

inline bool is_file(const std::string &uri) const

Checks if a file with the input URI exists.

inline void remove_file(const std::string &uri) const

Deletes a file with the input URI.

inline uint64_t dir_size(const std::string &uri) const

Retrieves the size of a directory with the input URI.

inline std::vector<std::string> ls(const std::string &uri) const

Retrieves the children in directory uri. This function is non-recursive, i.e., it lists only the entries one level below uri.

inline uint64_t file_size(const std::string &uri) const

Retrieves the size of a file with the input URI.

inline void move_file(const std::string &old_uri, const std::string &new_uri) const

Renames a TileDB file from an old URI to a new URI.

inline void move_dir(const std::string &old_uri, const std::string &new_uri) const

Renames a TileDB directory from an old URI to a new URI.

inline void copy_file(const std::string &old_uri, const std::string &new_uri) const

Copies a TileDB file from an old URI to a new URI.

inline void copy_dir(const std::string &old_uri, const std::string &new_uri) const

Copies a TileDB directory from an old URI to a new URI.

inline void touch(const std::string &uri) const

Touches a file with the input URI, i.e., creates a new empty file.

inline const Context &context() const

Get the underlying context

inline std::shared_ptr<tiledb_vfs_t> ptr() const

Get the underlying tiledb object

inline Config config() const

Get the config

Public Static Functions

static inline int ls_getter(const char *path, void *data)

Callback function to be used when invoking the C TileDB function for getting the children of a URI. It simply adds path to vec (which is cast from data).

Parameters:
  • path – The path of a visited TileDB object

  • data – This will be cast to the vector that will store path.

Returns:

If 1 then the walk should continue to the next object.
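The callback pattern described here (an opaque void* that is cast back to a vector, and a return value of 1 to continue) can be sketched in isolation. collect_path and visit_all are hypothetical names; visit_all is a stand-in for a C-style walking function and is not the TileDB C API.

```cpp
#include <string>
#include <vector>

// C-style callback signature: a path plus an opaque user-data pointer.
using PathCallback = int (*)(const char* path, void* data);

// Casts the opaque pointer back to the vector and appends the path.
int collect_path(const char* path, void* data) {
  auto* vec = static_cast<std::vector<std::string>*>(data);
  vec->emplace_back(path);
  return 1;  // 1 tells the walker to continue with the next entry
}

// Hypothetical stand-in for a C walking function: invokes the callback for
// each path and stops as soon as the callback returns something other than 1.
void visit_all(const std::vector<std::string>& paths, PathCallback cb,
               void* data) {
  for (const auto& path : paths) {
    if (cb(path.c_str(), data) != 1) break;
  }
}
```

The caller owns the vector and passes its address as data, so the callback itself needs no global state.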

Utils

LICENSE

The MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

Copyright

Copyright (c) 2017-2021 TileDB, Inc.

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

DESCRIPTION

Utils for C++ API.

namespace tiledb

Functions

template<typename T, typename E = typename std::vector<T>>
std::vector<E> group_by_cell(const std::vector<uint64_t> &offsets, const std::vector<T> &data, uint64_t num_offsets, uint64_t num_data)

Convert an (offset, data) vector pair into a single vector of vectors. Useful for “unpacking” variable-length attribute data from a read query result in offsets + data form to a vector of per-cell data.

The offsets must be given in units of bytes.

Example:

std::vector<uint64_t> offsets;
std::vector<char> data;
...
query.set_data_buffer("attr_name", data);
query.set_offsets_buffer("attr_name", offsets);
query.submit();
...
auto attr_results = query.result_buffer_elements()["attr_name"];

// cell_vals length will be equal to the number of cells read by the query.
// Each element is a std::vector<char> with each cell's data for "attr_name"
auto cell_vals =
  group_by_cell(offsets, data, attr_results.first, attr_results.second);

// Reconstruct a std::string value for the first cell:
std::string cell_val(cell_vals[0].data(), cell_vals[0].size());

Note

This function, and the other utility functions, copy all of the input data when constructing their return values. Thus, these may be expensive for large amounts of data.

Template Parameters:
  • T – Underlying attribute datatype

  • E – Cell type, usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}

Parameters:
  • offsets – Offsets vector. This specifies the start offset in bytes of each cell in the data vector.

  • data – Data vector. Flat data buffer with cell contents.

  • num_offsets – Number of offset elements populated by query. If the entire buffer is to be grouped, pass offsets.size().

  • num_data – Number of data elements populated by query. If the entire buffer is to be grouped, pass data.size().

Returns:

std::vector<E>
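
The byte-offset convention matters once T is wider than one byte. The sketch below is a standalone illustration of that conversion, not the library's implementation: split_by_byte_offsets is a hypothetical name, and throwing on misaligned or out-of-range offsets is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: split `data` into cells, where offsets[i] is the
// starting position of cell i measured in bytes.
template <typename T>
std::vector<std::vector<T>> split_by_byte_offsets(
    const std::vector<uint64_t>& offsets, const std::vector<T>& data) {
  std::vector<std::vector<T>> cells;
  cells.reserve(offsets.size());
  for (std::size_t i = 0; i < offsets.size(); ++i) {
    // Byte offsets must land on element boundaries.
    if (offsets[i] % sizeof(T) != 0)
      throw std::invalid_argument("offset is not a multiple of sizeof(T)");
    // Convert bytes to element indices; the last cell runs to the end.
    const uint64_t begin = offsets[i] / sizeof(T);
    const uint64_t end = (i + 1 < offsets.size()) ?
                             offsets[i + 1] / sizeof(T) :
                             data.size();
    if (begin > end || end > data.size())
      throw std::out_of_range("offsets do not describe a valid partition");
    cells.emplace_back(
        data.begin() + static_cast<std::ptrdiff_t>(begin),
        data.begin() + static_cast<std::ptrdiff_t>(end));
  }
  return cells;
}
```

With int32_t data, an offsets vector of {0, 8} therefore means "cell 0 starts at element 0, cell 1 starts at element 2", because each element occupies four bytes.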

template<typename T, typename E = typename std::vector<T>>
std::vector<E> group_by_cell(const std::pair<std::vector<uint64_t>, std::vector<T>> &buff, uint64_t num_offsets, uint64_t num_data)

Convert an (offset, data) vector pair into a single vector of vectors. Useful for “unpacking” variable-length attribute data from a read query result in offsets + data form to a vector of per-cell data.

The offsets must be given in units of bytes.

Example:

std::vector<uint64_t> offsets;
std::vector<char> data;
...
query.set_data_buffer("attr_name", data);
query.set_offsets_buffer("attr_name", offsets);
query.submit();
...
auto attr_results = query.result_buffer_elements()["attr_name"];

// cell_vals length will be equal to the number of cells read by the query.
// Each element is a std::vector<char> with each cell's data for "attr_name"
auto cell_vals =
  group_by_cell(std::make_pair(offsets, data),
                attr_results.first, attr_results.second);

// Reconstruct a std::string value for the first cell:
std::string cell_val(cell_vals[0].data(), cell_vals[0].size());

Template Parameters:
  • T – Underlying attribute datatype

  • E – Cell type, usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}

Parameters:
  • buff – Pair of (offset_vec, data_vec) to be grouped.

  • num_offsets – Number of offset elements populated by query.

  • num_data – Number of data elements populated by query.

Returns:

std::vector<E>

template<typename T, typename E = typename std::vector<T>>
std::vector<E> group_by_cell(const std::vector<uint64_t> &offsets, const std::vector<T> &data)

Convert a generic (offset, data) vector pair into a single vector of vectors. The offsets must be given in units of bytes.

Example:

std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
std::vector<uint64_t> offsets = {0, 5};
auto grouped = group_by_cell<char, std::string>(offsets, buf);
// grouped.size() == 2
// grouped[0] == "abcde"
// grouped[1] == "fghi"

Template Parameters:
  • T – Underlying attribute datatype

  • E – Cell type, usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}

Parameters:
  • offsets – Offsets vector

  • data – Data vector

Returns:

std::vector<E>

template<typename T, typename E = typename std::vector<T>>
std::vector<E> group_by_cell(const std::vector<T> &buff, uint64_t el_per_cell, uint64_t num_buff)

Convert a vector of elements into a vector of fixed-length vectors.

Example:

std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell(buf, 3, buf.size());
std::string grp1(grouped[0].begin(), grouped[0].end());  // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end());  // "def"
std::string grp3(grouped[2].begin(), grouped[2].end());  // "ghi"

// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell(buf, 2, buf.size());

Template Parameters:
  • T – Underlying attribute datatype

  • E – Cell type, usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}

Parameters:
  • buff – Data buffer to group

  • el_per_cell – Number of elements per cell to group together

  • num_buff – Number of elements populated by query. To group whole buffer, pass buff.size().

Returns:

std::vector<E>

template<typename T, typename E = typename std::vector<T>>
std::vector<E> group_by_cell(const std::vector<T> &buff, uint64_t el_per_cell)

Convert a vector of elements into a vector of fixed-length vectors.

Example:

std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell(buf, 3);
std::string grp1(grouped[0].begin(), grouped[0].end());  // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end());  // "def"
std::string grp3(grouped[2].begin(), grouped[2].end());  // "ghi"

// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell(buf, 2);

Template Parameters:
  • T – Element type

  • E – Cell type, usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}

Parameters:
  • buff – Data buffer to group

  • el_per_cell – Number of elements per cell to group together

Returns:

std::vector<E>
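
The fixed-length grouping and its divisibility requirement can be sketched as follows. This is a standalone illustration under stated assumptions, not the library's code: chunk_fixed is a hypothetical name, and the use of std::invalid_argument is an assumption, since the text above only says that an exception is thrown.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: cut `buff` into consecutive cells of exactly
// `el_per_cell` elements each.
template <typename T>
std::vector<std::vector<T>> chunk_fixed(const std::vector<T>& buff,
                                        std::size_t el_per_cell) {
  // A partial trailing cell is rejected instead of being padded or dropped.
  if (el_per_cell == 0 || buff.size() % el_per_cell != 0)
    throw std::invalid_argument("buffer size is not divisible by el_per_cell");
  std::vector<std::vector<T>> cells;
  cells.reserve(buff.size() / el_per_cell);
  for (std::size_t i = 0; i < buff.size(); i += el_per_cell) {
    auto first = buff.begin() + static_cast<std::ptrdiff_t>(i);
    // Each cell is an independent copy of its slice of the buffer.
    cells.emplace_back(first,
                       first + static_cast<std::ptrdiff_t>(el_per_cell));
  }
  return cells;
}
```

Because every cell is copied, the cost is linear in the buffer size, which matches the note above about the utility functions being expensive for large inputs.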

template<uint64_t N, typename T>
std::vector<std::array<T, N>> group_by_cell(const std::vector<T> &buff, uint64_t num_buff)

Convert a vector of elements into a vector of fixed-length arrays.

Example:

std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell<3>(buf, buf.size());
std::string grp1(grouped[0].begin(), grouped[0].end());  // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end());  // "def"
std::string grp3(grouped[2].begin(), grouped[2].end());  // "ghi"

// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell<2>(buf, buf.size());

Template Parameters:
  • N – Elements per cell

  • T – Array element type

Parameters:
  • buff – Data buffer to group

  • num_buff – Number of elements in buff that were populated by the query.

Returns:

std::vector<std::array<T,N>>

template<uint64_t N, typename T>
std::vector<std::array<T, N>> group_by_cell(const std::vector<T> &buff)

Convert a vector of elements into a vector of fixed-length arrays.

Example:

std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell<3>(buf);
std::string grp1(grouped[0].begin(), grouped[0].end());  // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end());  // "def"
std::string grp3(grouped[2].begin(), grouped[2].end());  // "ghi"

// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell<2>(buf);

Template Parameters:
  • N – Elements per cell

  • T – Array element type

Parameters:

buff – Data buffer to group

Returns:

std::vector<std::array<T,N>>

template<typename T, typename R = typename T::value_type>
std::pair<std::vector<uint64_t>, std::vector<R>> ungroup_var_buffer(const std::vector<T> &data)

Unpack a vector of variable sized attributes into a data and offset buffer. The offset buffer result is in units of bytes.

Example:

std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
// For the sake of example, group buf into groups of 3 elements:
auto grouped = group_by_cell(buf, 3);
// Ungroup into offsets, data pair.
auto p = ungroup_var_buffer(grouped);
auto offsets = p.first;  // {0, 3, 6}
auto data = p.second;   // {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'}

Template Parameters:
  • T – Vector type. T::value_type is considered the underlying data element type. Should be vector or string.

  • R – T::value_type, deduced

Parameters:

data – Data buffer to unpack

Returns:

pair where .first is the offset buffer, and .second is data buffer
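
The inverse direction, producing byte offsets from per-cell data, can be sketched as below. The helper name to_offsets_and_data is hypothetical and the code is an illustration of the offsets-in-bytes convention, not the library's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: concatenate the cells into one data buffer and
// record where each cell starts, measured in bytes.
template <typename Cell, typename R = typename Cell::value_type>
std::pair<std::vector<uint64_t>, std::vector<R>> to_offsets_and_data(
    const std::vector<Cell>& cells) {
  std::pair<std::vector<uint64_t>, std::vector<R>> out;
  out.first.reserve(cells.size());
  uint64_t byte_offset = 0;
  for (const auto& cell : cells) {
    out.first.push_back(byte_offset);  // start of this cell
    out.second.insert(out.second.end(), cell.begin(), cell.end());
    byte_offset += cell.size() * sizeof(R);  // advance by the cell's bytes
  }
  return out;
}
```

For cells of int32_t with lengths 2, 1 and 3, the offsets come out as {0, 8, 12} rather than {0, 2, 3}, since each element contributes four bytes.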

template<typename V, typename T = typename V::value_type::value_type>
std::vector<T> flatten(const V &vec)

Flattens a vector of vectors into a single vector.

Example:

std::vector<std::string> v = {"a", "bb", "ccc"};
auto flat_v = flatten(v);
std::string s(flat_v.begin(), flat_v.end()); // "abbccc"

std::vector<std::vector<double>> d = {{1.2, 2.1}, {2.3, 3.2}, {3.4, 4.3}};
auto flat_d = flatten(d);  // {1.2, 2.1, 2.3, 3.2, 3.4, 4.3};

Template Parameters:
  • V – Container type

  • T – Return element type

Parameters:

vec – Vector to flatten

Returns:

std::vector<T>

namespace impl

Functions

inline void check_config_error(tiledb_error_t *err)

Check an error, free, and throw if there is one.

Version

inline std::tuple<int, int, int> tiledb::version()

Get the Major, Minor, and Patch version.
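
Since the version comes back as a std::tuple, the standard lexicographic tuple comparison can be used to gate behaviour on a minimum version. The sketch below is standalone: at_least and version_t are hypothetical names, and a hard-coded tuple stands in for the value that tiledb::version() would return.

```cpp
#include <cassert>
#include <tuple>

// (major, minor, patch), the same shape as the tuple described above.
using version_t = std::tuple<int, int, int>;

// Hypothetical helper: true when `have` is at least major.minor.patch.
// std::tuple compares element by element, left to right.
inline bool at_least(const version_t& have, int major, int minor, int patch) {
  return have >= std::make_tuple(major, minor, patch);
}
```

In real code the first argument would be the result of tiledb::version(), and the three components can also be unpacked with std::get or structured bindings.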

Stats

class Stats

Encapsulates functionality related to internal TileDB statistics.

Example:

// Enable stats, submit a query, then dump to stdout.
tiledb::Stats::enable();
query.submit();
tiledb::Stats::dump();

// Dump to a string instead.
std::string str;
tiledb::Stats::dump(&str);

Public Static Functions

static inline void enable()

Enables internal TileDB statistics gathering.

static inline void disable()

Disables internal TileDB statistics gathering.

static inline bool is_enabled()

Returns whether internal statistics gathering is enabled.

Returns:

true if statistics gathering is enabled and false otherwise.

static inline void reset()

Reset all internal statistics counters to 0.

static inline void dump(FILE *out = nullptr)

Dump all statistics counters to some output (e.g., file or stdout).

Parameters:

out – The output.

static inline void dump(std::string *out)

Dump all statistics counters to a string.

Parameters:

out – The output.

static inline void raw_dump(FILE *out = nullptr)

Dump all raw statistics counters to some output (e.g., file or stdout) as JSON.

Parameters:

out – The output.

static inline void raw_dump(std::string *out)

Dump all raw statistics counters to a string.

Parameters:

out – The output.

FragmentInfo

class FragmentInfo

Describes fragment info objects.

Public Functions

inline void load() const

Loads the fragment info.

inline std::string fragment_uri(uint32_t fid) const

Returns the URI of the fragment with the given index.

inline std::string fragment_name(uint32_t fid) const

Returns the name of the fragment with the given index.

inline const Context &context() const

Returns the context that the fragment info belongs to.

inline void get_non_empty_domain(uint32_t fid, uint32_t did, void *domain) const

Retrieves the non-empty domain of the fragment with the given index on the given dimension index.
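
The void pointer out-parameter expects caller-owned storage of the dimension's datatype. The sketch below assumes, as in the TileDB C API, that a non-empty domain for one dimension is written as two values, [low, high]. fake_get_non_empty_domain is a hypothetical stand-in for the real getter, int64_t is an assumed dimension type, and the bounds 10 and 42 are made-up values.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a getter with a `void* domain` out-parameter.
// It writes two int64_t values, {low, high}, into the caller's buffer.
inline void fake_get_non_empty_domain(void* domain) {
  const int64_t bounds[2] = {10, 42};
  std::memcpy(domain, bounds, sizeof(bounds));
}

// Caller side: allocate typed storage for two values, pass its address,
// then read the bounds back with their real type.
inline std::array<int64_t, 2> read_domain() {
  std::array<int64_t, 2> domain{};
  fake_get_non_empty_domain(domain.data());
  return domain;
}
```

The buffer type must match the dimension's datatype; passing storage of a narrower type would let the getter write past the end of it.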

inline void get_non_empty_domain(uint32_t fid, const std::string &dim_name, void *domain) const

Retrieves the non-empty domain of the fragment with the given index on the given dimension name.

inline std::pair<std::string, std::string> non_empty_domain_var(uint32_t fid, uint32_t did) const

Returns the non-empty domain of the fragment with the given index on the given dimension index. Applicable to string dimensions.

inline std::pair<std::string, std::string> non_empty_domain_var(uint32_t fid, const std::string &dim_name) const

Returns the non-empty domain of the fragment with the given index on the given dimension name. Applicable to string dimensions.

inline uint64_t mbr_num(uint32_t fid) const

Returns the number of MBRs in the fragment with the given index.

inline void get_mbr(uint32_t fid, uint32_t mid, uint32_t did, void *mbr) const

Retrieves the MBR of the fragment with the given index on the given dimension index.

inline void get_mbr(uint32_t fid, uint32_t mid, const std::string &dim_name, void *mbr) const

Retrieves the MBR of the fragment with the given index on the given dimension name.

inline std::pair<std::string, std::string> mbr_var(uint32_t fid, uint32_t mid, uint32_t did) const

Returns the MBR of the fragment with the given index on the given dimension index. Applicable to string dimensions.

inline std::pair<std::string, std::string> mbr_var(uint32_t fid, uint32_t mid, const std::string &dim_name) const

Returns the MBR of the fragment with the given index on the given dimension name. Applicable to string dimensions.

inline uint32_t fragment_num() const

Returns the number of fragments.

inline uint64_t fragment_size(uint32_t fid) const

Returns the size of the fragment with the given index.

inline bool dense(uint32_t fid) const

Returns true if the fragment with the given index is dense.

inline bool sparse(uint32_t fid) const

Returns true if the fragment with the given index is sparse.

inline std::pair<uint64_t, uint64_t> timestamp_range(uint32_t fid) const

Returns the timestamp range of the fragment with the given index.

inline uint64_t cell_num(uint32_t fid) const

Returns the number of cells of the fragment with the given index.

inline uint64_t total_cell_num() const

Returns the total number of cells written in the loaded fragments.

inline uint32_t version(uint32_t fid) const

Returns the version of the fragment with the given index.

inline ArraySchema array_schema(uint32_t fid) const

Returns the array schema of the fragment with the given index.

inline std::string array_schema_name(uint32_t fid) const

Returns the array schema name of the fragment with the given index.

inline bool has_consolidated_metadata(uint32_t fid) const

Returns true if the fragment with the given index has consolidated metadata.

inline uint32_t unconsolidated_metadata_num() const

Returns the number of fragments with unconsolidated metadata.

inline uint32_t to_vacuum_num() const

Returns the number of fragments to vacuum.

inline std::string to_vacuum_uri(uint32_t fid) const

Returns the URI of the fragment to vacuum with the given index.

TILEDB_DEPRECATED inline void dump(FILE *out = nullptr) const

Dumps the fragment info in an ASCII representation to an output.

Parameters:

out – (Optional) File to dump output to. Defaults to nullptr which will lead to selection of stdout.

inline std::shared_ptr<tiledb_fragment_info_t> ptr() const

Returns the C TileDB fragment info object.

Experimental

class ArraySchemaEvolution

Evolve the schema on a tiledb::Array.

See examples for more usage details.

Example:

// Drop attribute "a1" and apply the evolution to the array
tiledb::Context ctx;
tiledb::ArraySchemaEvolution evolution(ctx);
evolution.drop_attribute("a1");
evolution.array_evolve("my_test_array");

Public Functions

inline ArraySchemaEvolution(const Context &context, tiledb_array_schema_evolution_t *evolution)

Constructs the array schema evolution with the input C array schema evolution object.

Parameters:
  • context – TileDB context

  • evolution – C API array schema evolution object

inline ArraySchemaEvolution(const Context &context)

Constructs an array schema evolution object.

Parameters:

context – TileDB context

inline ArraySchemaEvolution &add_attribute(const Attribute &attr)

Adds an Attribute to the array schema evolution.

Example:

tiledb::Context ctx;
tiledb::ArraySchemaEvolution schema_evolution(ctx);
schema_evolution.add_attribute(
    tiledb::Attribute::create<int32_t>(ctx, "attr_name"));

Parameters:

attr – The Attribute to add

Returns:

Reference to this ArraySchemaEvolution instance.

inline ArraySchemaEvolution &drop_attribute(const std::string &attribute_name)

Drops an attribute.

Example:

tiledb::Context ctx;
tiledb::ArraySchemaEvolution schema_evolution(ctx);
schema_evolution.drop_attribute("attr_name");

Parameters:

attribute_name – The name of the attribute to be dropped

Returns:

Reference to this ArraySchemaEvolution instance.

inline ArraySchemaEvolution &add_enumeration(const Enumeration &enmr)

Adds an Enumeration to the array schema evolution.

Example:

tiledb::Context ctx;
tiledb::ArraySchemaEvolution schema_evolution(ctx);
std::vector<std::string> values = {"red", "green", "blue"};
schema_evolution.add_enumeration(
    tiledb::Enumeration::create(ctx, "an_enumeration", values));

Parameters:

enmr – The Enumeration to add.

Returns:

Reference to this ArraySchemaEvolution instance.

inline ArraySchemaEvolution &extend_enumeration(const Enumeration &enmr)

Extends an Enumeration during array schema evolution.

Example:

tiledb::Context ctx;
tiledb::Enumeration old_enmr = array->get_enumeration("some_enumeration");
std::vector<std::string> new_values = {"cyan", "magenta", "mauve"};
tiledb::Enumeration new_enmr = old_enmr.extend(new_values);
tiledb::ArraySchemaEvolution schema_evolution(ctx);
schema_evolution.extend_enumeration(new_enmr);

Parameters:

enmr – The Enumeration to extend.

Returns:

Reference to this ArraySchemaEvolution instance.

inline ArraySchemaEvolution &drop_enumeration(const std::string &enumeration_name)

Drops an enumeration.

Example:

tiledb::Context ctx;
tiledb::ArraySchemaEvolution schema_evolution(ctx);
schema_evolution.drop_enumeration("enumeration_name");

Parameters:

enumeration_name – The enumeration to be dropped

Returns:

Reference to this ArraySchemaEvolution instance.

inline ArraySchemaEvolution &expand_current_domain(const CurrentDomain &expanded_domain)

Expands the current domain during array schema evolution. During tiledb_array_evolve, TileDB enforces that the new current domain expands, and does not contract, the existing one.

Parameters:

expanded_domain – The current domain we want to expand the schema to.
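
One way to read "expanding and not contracting" is per-dimension containment: the new range must cover the old one on every dimension. The sketch below illustrates that reading only. The exact rule TileDB applies is not spelled out here, and Range and is_expansion are hypothetical names unrelated to the CurrentDomain class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical closed interval [lo, hi] for one dimension.
struct Range {
  int64_t lo;
  int64_t hi;
};

// True when `candidate` covers `current` in every dimension, i.e. the
// candidate only grows the domain. Both inputs hold one Range per dimension.
inline bool is_expansion(const std::vector<Range>& current,
                         const std::vector<Range>& candidate) {
  if (current.size() != candidate.size())
    return false;
  for (std::size_t d = 0; d < current.size(); ++d) {
    // A higher lower bound or a lower upper bound would shrink the domain.
    if (candidate[d].lo > current[d].lo || candidate[d].hi < current[d].hi)
      return false;
  }
  return true;
}
```

Under this reading, growing [0, 99] to [0, 199] on one dimension is accepted, while moving any bound inward on any dimension is rejected.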

inline void set_timestamp_range(const std::pair<uint64_t, uint64_t> &timestamp_range)

Sets timestamp range.

Example:

tiledb::Context ctx;
tiledb::ArraySchemaEvolution schema_evolution(ctx);
uint64_t now = tiledb_timestamp_now_ms();
schema_evolution.set_timestamp_range({now, now});

Parameters:

timestamp_range – The timestamp range to be set

inline ArraySchemaEvolution &array_evolve(const std::string &array_uri)

Evolves the schema of an array.

Example:

tiledb::Context ctx;
tiledb::ArraySchemaEvolution schema_evolution(ctx);
schema_evolution.drop_attribute("attr_name");
schema_evolution.array_evolve("test_array_uri");

Parameters:

array_uri – The URI of the array to evolve

Returns:

Reference to this ArraySchemaEvolution instance.

inline std::shared_ptr<tiledb_array_schema_evolution_t> ptr() const

Returns a shared pointer to the C TileDB array schema evolution object.

class Group

Public Functions

inline Group(const Context &ctx, const std::string &group_uri, tiledb_query_type_t query_type)

Constructor. Opens the group for the given query type. The destructor calls the close() method.

Example:

// Open the group for reading
tiledb::Context ctx;
tiledb::Group group(ctx, "s3://bucket-name/group-name", TILEDB_READ);

Parameters:
  • ctx – TileDB context.

  • group_uri – The group URI.

  • query_type – Query type to open the group for.

inline Group(const Context &ctx, const std::string &group_uri, tiledb_query_type_t query_type, const Config &config)

Constructor. Sets a config to the group and opens it for the given query type. The destructor calls the close() method.

Example:

// Open the group for reading
tiledb::Context ctx;
tiledb::Config cfg;
cfg["rest.username"] = "user";
cfg["rest.password"] = "pass";
tiledb::Group group(ctx, "s3://bucket-name/group-name", TILEDB_READ, cfg);

Parameters:
  • ctx – TileDB context.

  • group_uri – The group URI.

  • query_type – Query type to open the group for.

  • config – Configuration parameters.

inline ~Group()

Destructor; calls close().

inline void open(tiledb_query_type_t query_type)

Opens the group using a query type as input.

Opening with a query type means that queries created for this Group object inherit that type; in other words, a Group object accepts only one type of query while it is open. It can always be closed and re-opened with another query type. Many Group objects may also be created and opened with different query types. For instance, one may create and open a group object group_read for reads and another, group_write, for writes, and interleave the creation and submission of queries for both.

Example:

// Open the group for writing
tiledb::Group group(ctx, "s3://bucket-name/group-name", TILEDB_WRITE);
// Close and open again for reading.
group.close();
group.open(TILEDB_READ);

Parameters:

query_type – The type of queries the group object will be receiving.

Throws:

TileDBError – if the group is already open or another error occurred.

inline void set_config(const Config &config) const

Sets the group config.

Pre:

The group must be closed.

inline Config config() const

Retrieves the group config.

inline void close(bool should_throw = true)

Closes the group. This must be called directly if you wish to check that any changes to the group were committed. This is automatically called by the destructor but any errors encountered are logged instead of throwing an exception from a destructor.

Example:

tiledb::Group group(ctx, "s3://bucket-name/group-name", TILEDB_READ);
group.close();
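
The split between an explicit close() that can throw and a destructor that only logs is a common RAII pattern. The sketch below shows the idea with a hypothetical Handle class. It is not the Group implementation, and the behaviour of should_throw == false (log to stderr) is an assumption.

```cpp
#include <cassert>
#include <iostream>
#include <stdexcept>

// Hypothetical resource whose close step can be made to fail on demand.
class Handle {
 public:
  explicit Handle(bool fail_on_close)
      : fail_on_close_(fail_on_close) {
  }

  // Destructors should not throw, so errors are only logged here.
  ~Handle() {
    close(false);
  }

  // Explicit close: reports failure by exception unless told otherwise.
  void close(bool should_throw = true) {
    if (!open_)
      return;  // already closed, nothing to do
    open_ = false;
    if (!fail_on_close_)
      return;
    if (should_throw)
      throw std::runtime_error("close failed");
    std::cerr << "close failed (logged, not thrown)\n";
  }

  bool is_open() const {
    return open_;
  }

 private:
  bool open_ = true;
  bool fail_on_close_;
};
```

Calling close() directly is therefore the only way for the caller to learn that committing changes failed; relying on the destructor leaves just a log line.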

inline bool is_open() const

Checks if the group is open.

inline std::string uri() const

Returns the group URI.

inline tiledb_query_type_t query_type() const

Returns the query type the group was opened with.

inline void put_metadata(const std::string &key, tiledb_datatype_t value_type, uint32_t value_num, const void *value)

Puts a metadata key-value item to an open group. The group must be opened in WRITE mode, otherwise the function will error out.

Note

The writes will take effect only upon closing the group.

Parameters:
  • key – The key of the metadata item to be added. UTF-8 encodings are acceptable.

  • value_type – The datatype of the value.

  • value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata.

  • value – The metadata value in binary form.

inline void delete_group(const std::string &uri, bool recursive = false)

Deletes all written data from an open group. The group must be opened in MODIFY_EXCLUSIVE mode, otherwise the function will error out.

Note

If recursive == false, data added to the group will be left as-is.

Parameters:
  • uri – The address of the group item to be deleted.

  • recursive – True if all data inside the group is to be deleted.

Post:

This is destructive; the group may not be reopened after delete.

inline void delete_metadata(const std::string &key)

Deletes a metadata key-value item from an open group. The group must be opened in WRITE mode, otherwise the function will error out.

Note

The writes will take effect only upon closing the group.

Note

If the key does not exist, this has no effect (i.e., the function will not error out).

Parameters:

key – The key of the metadata item to be deleted.

inline void get_metadata(const std::string &key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)

Gets a metadata key-value item from an open group. The group must be opened in READ mode, otherwise the function will error out.

Note

If the key does not exist, then value will be NULL.

Parameters:
  • key – The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.

  • value_type – The datatype of the value.

  • value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.

  • value – The metadata value in binary form.
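
Turning the (value_num, value) pair into typed data is left to the caller. The sketch below shows one way to do it for 32-bit integers. decode_int32 is a hypothetical helper, and the check that value_type equals TILEDB_INT32 is omitted only because the sketch does not include the TileDB headers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy `value_num` int32_t items out of the untyped
// pointer. A null pointer (missing key or empty value) gives an empty vector.
inline std::vector<int32_t> decode_int32(uint32_t value_num,
                                         const void* value) {
  std::vector<int32_t> out;
  if (value == nullptr || value_num == 0)
    return out;
  out.resize(value_num);
  // Copy instead of casting the pointer, so alignment is never an issue.
  std::memcpy(out.data(), value, value_num * sizeof(int32_t));
  return out;
}
```

Copying the bytes into a vector also keeps the data valid independently of the buffer that the returned pointer refers to.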

inline bool has_metadata(const std::string &key, tiledb_datatype_t *value_type)

Checks if key exists in metadata from an open group. The group must be opened in READ mode, otherwise the function will error out.

Note

If the key does not exist, then value_type will not be modified.

Parameters:
  • key – The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.

  • value_type – The datatype of the value associated with the key (if any).

Returns:

true if the key exists, else false.

inline uint64_t metadata_num() const

Returns the number of metadata items in an open group. The group must be opened in READ mode, otherwise the function will error out.

inline void get_metadata_from_index(uint64_t index, std::string *key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)

Gets a metadata item from an open group using an index. The group must be opened in READ mode, otherwise the function will error out.

Parameters:
  • index – The index used to get the metadata.

  • key – The metadata key.

  • value_type – The datatype of the value.

  • value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.

  • value – The metadata value in binary form.

inline void add_member(const std::string &uri, const bool &relative, std::optional<std::string> name = std::nullopt, std::optional<tiledb_object_t> type = std::nullopt)

Adds a member to a group.

Parameters:
  • uri – URI of the member to add.

  • relative – Whether the URI is relative to the group location.

  • name – Optional name by which the group member can be looked up.

  • type – The type of the member being added, if known in advance.

inline void remove_member(const std::string &name_or_uri)

Remove a member from a group

Parameters:

name_or_uri – Name or URI of member to remove. If the URI is registered multiple times in the group, the name needs to be specified so that the correct one can be removed. Note that if a URI is registered as both a named and unnamed member, the unnamed member will be removed successfully using the URI.

inline bool is_relative(std::string name) const

Retrieves the relative attribute for a named member.

Parameters:

name – Name of the member whose relative indicator is retrieved.

Public Static Functions

static inline void create(const tiledb::Context &ctx, const std::string &uri)

Create a TileDB Group

Example:

tiledb::Group::create(ctx, "s3://bucket-name/group-name");

Parameters:
  • ctx – TileDB context

  • uri – URI where group will be created.

static inline void consolidate_metadata(const Context &ctx, const std::string &uri, Config *const config = nullptr)

Consolidates the group metadata into a single group metadata file.

Example:

tiledb::Group::consolidate_metadata(ctx, "s3://bucket-name/group-name");

Parameters:
  • ctx – TileDB context

  • uri – The URI of the TileDB group to be consolidated.

  • config – Configuration parameters for the consolidation.

static inline void vacuum_metadata(const Context &ctx, const std::string &uri, Config *const config = nullptr)

Cleans up the group metadata.

Example:

tiledb::Group::vacuum_metadata(ctx, "s3://bucket-name/group-name");

Parameters:
  • ctx – TileDB context

  • uri – The URI of the TileDB group to vacuum.

  • config – Configuration parameters for the vacuuming.