TileDB C++ API Reference¶
Context¶
-
class Context¶
A TileDB context wraps a TileDB storage manager “instance.” Most objects and functions will require a Context.
Internal error handling is also defined by the Context; the default error handler throws a TileDBError with a specific message.
Example:
tiledb::Context ctx;
// Use ctx when creating other objects:
tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
// Set a custom error handler:
ctx.set_error_handler([](const std::string &msg) {
  std::cerr << msg << std::endl;
});
Public Functions
-
inline Context()¶
Constructor. Creates a TileDB Context with default configuration.
- Throws:
TileDBError – if construction fails
-
inline explicit Context(const Config &config)¶
Constructor. Creates a TileDB context with the given configuration.
- Throws:
TileDBError – if construction fails
-
inline Context(tiledb_ctx_t *ctx, bool own = true)¶
Constructor. Creates a TileDB context from the given pointer.
- Parameters:
own=true – If false, disables underlying cleanup upon destruction.
- Throws:
TileDBError – if construction fails
-
inline void handle_error(int rc) const¶
Error handler for the TileDB C API calls. Throws an exception in case of error.
- Parameters:
rc – If != TILEDB_OK, calls error handler
-
inline std::string get_last_error_message() const noexcept¶
Get the message of the last error that occurred.
- Returns:
The last error message
-
inline std::shared_ptr<tiledb_ctx_t> ptr() const¶
Returns the C TileDB context object.
-
inline Context &set_error_handler(const std::function<void(const std::string&)> &fn)¶
Sets the error handler callback. If none is set, the default_error_handler is used. The callback accepts an error message.
- Parameters:
fn – Error handler callback function
- Returns:
Reference to this Context
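The error-handling flow described for handle_error, set_error_handler and default_error_handler can be sketched with standard-library types only. MiniContext and kOk below are hypothetical stand-ins (not TileDB types), and the sketch passes the message in directly, whereas the real Context looks it up itself.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for tiledb::Context; only the callback flow is modelled.
class MiniContext {
 public:
  // Until a custom callback is set, errors are reported by throwing.
  MiniContext() : handler_(default_error_handler) {}

  // Replaces the callback and returns *this so that calls can be chained.
  MiniContext& set_error_handler(
      const std::function<void(const std::string&)>& fn) {
    handler_ = fn;
    return *this;
  }

  // Invokes the callback only when the return code signals a failure.
  void handle_error(int rc, const std::string& msg) const {
    if (rc != kOk) handler_(msg);
  }

  // Default callback: throw an exception carrying the message.
  static void default_error_handler(const std::string& msg) {
    throw std::runtime_error(msg);
  }

 private:
  static constexpr int kOk = 0;  // stand-in for TILEDB_OK (assumption)
  std::function<void(const std::string&)> handler_;
};
```

Swapping in a logging callback turns a throwing failure into a recorded message, which is the trade-off set_error_handler offers.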
-
inline bool is_supported_fs(tiledb_filesystem_t fs) const¶
Return true if the given filesystem backend is supported.
Example:
tiledb::Context ctx;
bool s3_supported = ctx.is_supported_fs(TILEDB_S3);
- Parameters:
fs – Filesystem to check
-
inline void cancel_tasks() const¶
Cancels all background or async tasks associated with this context.
-
inline void set_tag(const std::string &key, const std::string &value)¶
Sets a string/string KV tag on the context.
-
inline std::string stats()¶
Returns a JSON-formatted string of the stats.
Public Static Functions
-
static inline void default_error_handler(const std::string &msg)¶
The default error handler callback.
- Throws:
TileDBError – with the error message
Config¶
-
class Config¶
Carries configuration parameters for a context.
Example:
Config conf;
conf["vfs.s3.region"] = "us-east-1";
conf["vfs.s3.use_virtual_addressing"] = "true";
Context ctx(conf);
// array/kv operations with ctx
Public Functions
-
inline explicit Config(const std::string &filename)¶
Constructor that takes as input a filename (URI) that stores the config parameters. The file must have the following (text) format:
{parameter} {value}
Anything following a # character is considered a comment and, thus, is ignored.
See Config::set for the various TileDB config parameters and allowed values.
- Parameters:
filename – The name of the file where the parameters will be read from.
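To make the file format concrete, here is a minimal sketch of a reader for "{parameter} {value}" lines with # comments. parse_config_text is a hypothetical helper written for illustration; it is not the parser TileDB uses and it ignores edge cases such as values containing spaces.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper, not part of the TileDB API: parses text in the
// "{parameter} {value}" format into a map, dropping "#" comments.
inline std::map<std::string, std::string> parse_config_text(
    const std::string& text) {
  std::map<std::string, std::string> params;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    // Everything from the first '#' to the end of the line is a comment.
    const auto hash = line.find('#');
    if (hash != std::string::npos) line.erase(hash);

    // The first two whitespace-separated tokens are parameter and value;
    // blank or comment-only lines yield no tokens and are skipped.
    std::istringstream fields(line);
    std::string param, value;
    if (fields >> param >> value) params[param] = value;
  }
  return params;
}
```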
-
inline explicit Config(tiledb_config_t **config)¶
Constructor from a C config object.
-
inline explicit Config(const std::map<std::string, std::string> &config)¶
Constructor that takes as input an STL map that stores the config parameters.
- Parameters:
config – Map from config parameter names to values.
-
inline explicit Config(const std::unordered_map<std::string, std::string> &config)¶
Constructor that takes as input an STL unordered_map that stores the config parameters.
- Parameters:
config – Unordered map from config parameter names to values.
-
inline void save_to_file(const std::string filename)¶
Saves the config parameters to a (local) text file.
-
inline std::shared_ptr<tiledb_config_t> ptr() const¶
Returns the pointer to the TileDB C config object.
-
inline Config &set(const std::string &param, const std::string &value)¶
Sets a config parameter.
sm.allow_separate_attribute_writes
Experimental. Allow separate attribute write queries. Default: false
sm.allow_updates_experimental
Experimental. Allow update queries. Experimental for testing purposes, do not use. Default: false
sm.dedup_coords
If true, cells with duplicate coordinates will be removed during sparse fragment writes. Note that ties during deduplication are broken arbitrarily. Also note that this check means that it will take longer to perform the write operation. Default: false
sm.check_coord_dups
This is applicable only if sm.dedup_coords is false. If true, an error will be thrown if there are cells with duplicate coordinates during sparse fragment writes. If false and there are duplicates, the duplicates will be written without errors. Note that this check is much lighter weight than the coordinate deduplication check enabled by sm.dedup_coords. Default: true
sm.check_coord_oob
If true, an error will be thrown if there are cells with coordinates lying outside the domain during sparse fragment writes. Default: true
sm.read_range_oob
If error, read ranges are checked against the dimension’s domain and an out-of-bounds range raises an error. If warn, the ranges will be capped at the dimension’s domain and a warning logged. Default: warn
sm.check_global_order
Checks if the coordinates obey the global array order. Applicable only to sparse writes in global order. Default: true
sm.merge_overlapping_ranges_experimental
If true, merge overlapping Subarray ranges. Else, overlapping ranges will not be merged and multiplicities will be returned. Experimental for testing purposes, do not use. Default: true
sm.enable_signal_handlers
Determines whether or not TileDB will install signal handlers. Default: true
sm.compute_concurrency_level
Upper-bound on number of threads to allocate for compute-bound tasks. Default: # cores
sm.io_concurrency_level
Upper-bound on number of threads to allocate for IO-bound tasks. Default: # cores
sm.vacuum.mode
The vacuuming mode, one of commits (remove only consolidated commit files), fragments (remove only consolidated fragments), fragment_meta (remove only consolidated fragment metadata), array_meta (remove only consolidated array metadata files), or group_meta (remove only consolidated group metadata). Default: fragments
sm.consolidation.mode
The consolidation mode, one of commits (consolidate all commit files), fragments (consolidate all fragments), fragment_meta (consolidate only fragment metadata footers to a single file), array_meta (consolidate array metadata only), or group_meta (consolidate group metadata only). Default: “fragments”
sm.consolidation.amplification
The factor by which the size of the dense fragment resulting from consolidating a set of fragments (containing at least one dense fragment) can be amplified. This is important when the union of the non-empty domains of the fragments to be consolidated has a lot of empty cells, which the consolidated fragment will have to fill with the special fill value (since the resulting fragment is dense). Default: 1.0
sm.consolidation.buffer_size
Deprecated. The size (in bytes) of the attribute buffers used during consolidation. Default: 50,000,000
sm.consolidation.max_fragment_size
Experimental. The maximum on-disk size (in bytes) of a fragment created by consolidation. When it is reached, consolidation will continue the operation in a new fragment. The result will be multiple fragments, but with separate MBRs.
sm.consolidation.steps
The number of consolidation steps to be performed when executing the consolidation algorithm. Default: UINT32_MAX
sm.consolidation.purge_deleted_cells
Experimental. Purge deleted cells from the consolidated fragment or not. Default: false
sm.consolidation.step_min_frags
The minimum number of fragments to consolidate in a single step. Default: UINT32_MAX
sm.consolidation.step_max_frags
The maximum number of fragments to consolidate in a single step. Default: UINT32_MAX
sm.consolidation.step_size_ratio
The size ratio that two (“adjacent”) fragments must satisfy to be considered for consolidation in a single step. Default: 0.0
sm.consolidation.timestamp_start
Experimental. When set, an array will be consolidated between this value and sm.consolidation.timestamp_end (inclusive). Only for fragments and array_meta consolidation mode. Default: 0
sm.consolidation.timestamp_end
Experimental. When set, an array will be consolidated between sm.consolidation.timestamp_start and this value (inclusive). Only for fragments and array_meta consolidation mode. Default: UINT64_MAX
sm.encryption_key
The key for encrypted arrays. Default: “”
sm.encryption_type
The type of encryption used for encrypted arrays. Default: “NO_ENCRYPTION”
sm.enumerations_max_size
Maximum in-memory size for an enumeration. If the enumeration is var-sized, the size will include the data and the offsets. Default: 10MB
sm.enumerations_max_total_size
Maximum in-memory size for all enumerations. If the enumeration is var-sized, the size will include the data and the offsets. Default: 50MB
sm.max_tile_overlap_size
Maximum size for the tile overlap structure which holds information about which tiles are covered by ranges. Only used in dense reads and legacy reads. Default: 300MB
sm.memory_budget
The memory budget for tiles of fixed-sized attributes (or offsets for var-sized attributes) to be fetched during reads. Default: 5GB
sm.memory_budget_var
The memory budget for tiles of var-sized attributes to be fetched during reads. Default: 10GB
sm.var_offsets.bitsize
The size of offsets in bits to be used for offset buffers of var-sized attributes. Default: 64
sm.var_offsets.extra_element
Add an extra element to the end of the offsets buffer of var-sized attributes which will point to the end of the values buffer. Default: false
sm.var_offsets.mode
The offsets format (bytes or elements) to be used for var-sized attributes. Default: bytes
sm.query.dense.reader
Which reader to use for dense queries. “refactored” or “legacy”. Default: refactored
sm.query.dense.qc_coords_mode
Dense configuration that allows returning only the coordinates of the cells that match a query condition, without any attribute data. Default: “false”
sm.query.sparse_global_order.reader
Which reader to use for sparse global order queries. “refactored” or “legacy”. Default: refactored
sm.query.sparse_unordered_with_dups.reader
Which reader to use for sparse unordered with dups queries. “refactored” or “legacy”. Default: refactored
sm.skip_checksum_validation
Skip checksum validation on reads for the md5 and sha256 filters. Default: “false”
sm.mem.malloc_trim
Should malloc_trim be called on context and query destruction? This might reduce residual memory usage. Default: true
sm.mem.tile_upper_memory_limit
Experimental. This is the upper memory limit that is used when loading tiles. For now it is only used in the dense reader but will eventually be used by all readers. The readers using this value will use it as a way to limit the amount of tile data that is brought into memory at once so that we don’t incur performance penalties during memory movement operations. It is a soft limit: if a single tile doesn’t fit within it, that tile may still be loaded as long as it fits within sm.mem.total_budget. Default: 1GB
sm.mem.total_budget
Memory budget for readers and writers. Default: 10GB
sm.mem.consolidation.buffers_weight
Weight used to split sm.mem.total_budget and assign to the consolidation buffers. The budget is split across 3 values, sm.mem.consolidation.buffers_weight, sm.mem.consolidation.reader_weight and sm.mem.consolidation.writer_weight. Default: 1
sm.mem.consolidation.reader_weight
Weight used to split sm.mem.total_budget and assign to the reader query. The budget is split across 3 values, sm.mem.consolidation.buffers_weight, sm.mem.consolidation.reader_weight and sm.mem.consolidation.writer_weight. Default: 3
sm.mem.consolidation.writer_weight
Weight used to split sm.mem.total_budget and assign to the writer query. The budget is split across 3 values, sm.mem.consolidation.buffers_weight, sm.mem.consolidation.reader_weight and sm.mem.consolidation.writer_weight. Default: 2
sm.mem.reader.sparse_global_order.ratio_coords
Ratio of the budget allocated for coordinates in the sparse global order reader. Default: 0.5
sm.mem.reader.sparse_global_order.ratio_tile_ranges
Ratio of the budget allocated for tile ranges in the sparse global order reader. Default: 0.1
sm.mem.reader.sparse_global_order.ratio_array_data
Ratio of the budget allocated for array data in the sparse global order reader. Default: 0.1
sm.mem.reader.sparse_unordered_with_dups.ratio_coords
Ratio of the budget allocated for coordinates in the sparse unordered with duplicates reader. Default: 0.5
sm.mem.reader.sparse_unordered_with_dups.ratio_tile_ranges
Ratio of the budget allocated for tile ranges in the sparse unordered with duplicates reader. Default: 0.1
sm.mem.reader.sparse_unordered_with_dups.ratio_array_data
Ratio of the budget allocated for array data in the sparse unordered with duplicates reader. Default: 0.1
vfs.read_ahead_size
The maximum byte size to read-ahead from the backend. Default: 102400
sm.group.timestamp_start
The start timestamp used for opening the group. Default: 0
sm.group.timestamp_end
The end timestamp used for opening the group. Also used for the write timestamp if set. Default: UINT64_MAX
sm.partial_tile_offsets_loading
Experimental. If true, tile offsets can be partially loaded and unloaded by the readers. Default: false
sm.fragment_info.preload_mbrs
If true, MBRs will be loaded at the same time as the rest of fragment info, otherwise they will be loaded lazily when some info related to MBRs is requested by the user. Default: false
sm.partial_tile_offset_loading
Experimental. If true, tile offsets can be partially loaded and unloaded by the readers. Default: false
ssl.ca_file
The path to CA certificate to use when validating server certificates. Applies to all SSL/TLS connections. This option might be ignored on platforms that have native certificate stores like Windows. Default: “”
ssl.ca_path
The path to a directory with CA certificates to use when validating server certificates. Applies to all SSL/TLS connections. This option might be ignored on platforms that have native certificate stores like Windows. Default: “”
ssl.verify
Whether to verify the server’s certificate. Applies to all SSL/TLS connections. Disabling verification is insecure and should only be used for testing purposes. Default: true
vfs.read_ahead_cache_size
The total maximum size of the read-ahead cache, which is an LRU. Default: 10485760
vfs.min_parallel_size
The minimum number of bytes in a parallel VFS operation (except parallel S3 writes, which are controlled by vfs.s3.multipart_part_size). Default: 10MB
vfs.max_batch_size
The maximum number of bytes in a VFS read operation. Default: 100MB
vfs.min_batch_size
The minimum number of bytes in a VFS read operation. Default: 20MB
vfs.min_batch_gap
The minimum number of bytes between two VFS read batches. Default: 500KB
vfs.read_logging_mode
Log read operations at varying levels of verbosity. Default: “”
Possible values:
An empty string disables read logging.
Log each fragment read.
Log each individual fragment file read.
Log all files read.
Log all files with offset and length parameters.
Log all files with offset and length parameters on every read, not just the first read. On large arrays the read cache may get large, so this trades off RAM usage against increased log verbosity.
vfs.file.posix_file_permissions
Permissions to use for posix file system with file creation. Default: 644
vfs.file.posix_directory_permissions
Permissions to use for posix file system with directory creation. Default: 755
vfs.azure.storage_account_name
Set the name of the Azure Storage account to use. Default: “”
vfs.azure.storage_account_key
Set the Shared Key to authenticate to Azure Storage. Default: “”
vfs.azure.storage_sas_token
Set the Azure Storage SAS (shared access signature) token to use. If this option is set along with vfs.azure.blob_endpoint, the latter must not include a SAS token. Default: “”
vfs.azure.blob_endpoint
Set the default Azure Storage Blob endpoint. If not specified, it will take a value of <account-name>.blob.core.windows.net, where <account-name> is the value of the vfs.azure.storage_account_name option. This means that at least one of these two options must be set (or both if shared key authentication is used). Default: “”
vfs.azure.block_list_block_size
The block size (in bytes) used in Azure blob block list writes. Any uint64_t value is acceptable. Note: vfs.azure.block_list_block_size * vfs.azure.max_parallel_ops bytes will be buffered before issuing block uploads in parallel. Default: “5242880”
vfs.azure.max_parallel_ops
The maximum number of Azure backend parallel operations. Default: sm.io_concurrency_level
vfs.azure.use_block_list_upload
Determines if the Azure backend can use chunked block uploads. Default: “true”
vfs.azure.max_retries
The maximum number of times to retry an Azure network request. Default: 5
vfs.azure.retry_delay_ms
The minimum permissible delay between Azure network request retry attempts, in milliseconds. Default: 800
vfs.azure.max_retry_delay_ms
The maximum permissible delay between Azure network request retry attempts, in milliseconds. Default: 60000
vfs.gcs.project_id
Set the GCS project ID to create new buckets in. Not required unless you are going to use the VFS to create buckets. Default: “”
vfs.gcs.service_account_key
Experimental. Set the JSON string with GCS service account key. Takes precedence over vfs.gcs.workload_identity_configuration if both are specified. If neither is specified, Application Default Credentials will be used. Default: “”
vfs.gcs.workload_identity_configuration
Experimental. Set the JSON string with Workload Identity Federation configuration. vfs.gcs.service_account_key takes precedence over this if both are specified. If neither is specified, Application Default Credentials will be used. Default: “”
vfs.gcs.impersonate_service_account
Experimental. Set the GCS service account to impersonate. A chain of impersonated accounts can be formed by specifying many service accounts, separated by a comma. Default: “”
vfs.gcs.multi_part_size
The part size (in bytes) used in GCS multi part writes. Any uint64_t value is acceptable. Note: vfs.gcs.multi_part_size * vfs.gcs.max_parallel_ops bytes will be buffered before issuing part uploads in parallel. Default: “5242880”
vfs.gcs.max_parallel_ops
The maximum number of GCS backend parallel operations. Default: sm.io_concurrency_level
vfs.gcs.use_multi_part_upload
Determines if the GCS backend can use chunked part uploads. Default: “true”
vfs.gcs.request_timeout_ms
The maximum amount of time to retry network requests to GCS. Default: “3000”
vfs.gcs.max_direct_upload_size
The maximum size in bytes of a direct upload to GCS. Ignored if vfs.gcs.use_multi_part_upload is set to true. Default: “10737418240”
vfs.s3.region
The S3 region, if S3 is enabled. Default: us-east-1
vfs.s3.aws_access_key_id
Set the AWS_ACCESS_KEY_ID. Default: “”
vfs.s3.aws_secret_access_key
Set the AWS_SECRET_ACCESS_KEY. Default: “”
vfs.s3.aws_session_token
Set the AWS_SESSION_TOKEN. Default: “”
vfs.s3.aws_role_arn
Determines the role that we want to assume. Set the AWS_ROLE_ARN. Default: “”
vfs.s3.aws_external_id
Third party access ID to your resources when assuming a role. Set the AWS_EXTERNAL_ID. Default: “”
vfs.s3.aws_load_frequency
Session time limit when assuming a role. Set the AWS_LOAD_FREQUENCY. Default: “”
vfs.s3.aws_session_name
(Optional) session name when assuming a role. Can be used for tracing and bookkeeping. Set the AWS_SESSION_NAME. Default: “”
vfs.s3.scheme
The S3 scheme (http or https), if S3 is enabled. Default: https
vfs.s3.endpoint_override
The S3 endpoint, if S3 is enabled. Default: “”
vfs.s3.use_virtual_addressing
The S3 use of virtual addressing (true or false), if S3 is enabled. Default: true
vfs.s3.skip_init
Skip Aws::InitAPI for the S3 layer (true or false). Default: false
vfs.s3.use_multipart_upload
The S3 use of multi-part upload requests (true or false), if S3 is enabled. Default: true
vfs.s3.max_parallel_ops
The maximum number of S3 backend parallel operations. Default: sm.io_concurrency_level
vfs.s3.multipart_part_size
The part size (in bytes) used in S3 multipart writes. Any uint64_t value is acceptable. Note: vfs.s3.multipart_part_size * vfs.s3.max_parallel_ops bytes will be buffered before issuing multipart uploads in parallel. Default: 5MB
vfs.s3.ca_file
Path to SSL/TLS certificate file to be used by cURL for S3 HTTPS encryption. Follows cURL conventions: https://curl.haxx.se/docs/manpage.html Default: “”
vfs.s3.ca_path
Path to SSL/TLS certificate directory to be used by cURL for S3 HTTPS encryption. Follows cURL conventions: https://curl.haxx.se/docs/manpage.html Default: “”
vfs.s3.connect_timeout_ms
The connection timeout in ms. Any long value is acceptable. Default: 10800
vfs.s3.connect_max_tries
The maximum tries for a connection. Any long value is acceptable. Default: 5
vfs.s3.connect_scale_factor
The scale factor for exponential backoff when connecting to S3. Any long value is acceptable. Default: 25
vfs.s3.custom_headers.*
(Optional) Prefix for custom headers on S3 requests. For each custom header, use “vfs.s3.custom_headers.header_key” = “header_value”. No default.
vfs.s3.logging_level
The AWS SDK logging level. This is a process-global setting. The configuration of the most recently constructed context will set process state. Log files are written to the process working directory. Default: “Off”
vfs.s3.request_timeout_ms
The request timeout in ms. Any long value is acceptable. Default: 3000
vfs.s3.requester_pays
The requester pays for the S3 access charges. Default: false
vfs.s3.proxy_host
The S3 proxy host. Default: “”
vfs.s3.proxy_port
The S3 proxy port. Default: 0
vfs.s3.proxy_scheme
The S3 proxy scheme. Default: “http”
vfs.s3.proxy_username
The S3 proxy username. Note: this parameter is not serialized by tiledb_config_save_to_file. Default: “”
vfs.s3.proxy_password
The S3 proxy password. Note: this parameter is not serialized by tiledb_config_save_to_file. Default: “”
vfs.s3.verify_ssl
Enable HTTPS certificate verification. Default: true
vfs.s3.no_sign_request
Make unauthenticated requests to S3. Default: false
vfs.s3.sse
The server-side encryption algorithm to use. Supported non-empty values are “aes256” and “kms” (AWS key management service). Default: “”
vfs.s3.sse_kms_key_id
The server-side encryption key to use if vfs.s3.sse == “kms” (AWS key management service). Default: “”
vfs.s3.storage_class
The storage class to use for the newly uploaded S3 objects. The set of accepted values is found in the Aws::S3::Model::StorageClass enumeration: “NOT_SET”, “STANDARD”, “REDUCED_REDUNDANCY”, “STANDARD_IA”, “ONEZONE_IA”, “INTELLIGENT_TIERING”, “GLACIER”, “DEEP_ARCHIVE”, “OUTPOSTS”, “GLACIER_IR”, “SNOW”, “EXPRESS_ONEZONE”. Default: “NOT_SET”
vfs.s3.bucket_canned_acl
Names of values found in the Aws::S3::Model::BucketCannedACL enumeration: “NOT_SET”, “private_”, “public_read”, “public_read_write”, “authenticated_read”. Default: “NOT_SET”
vfs.s3.object_canned_acl
Names of values found in the Aws::S3::Model::ObjectCannedACL enumeration. The first 5 are the same as for “vfs.s3.bucket_canned_acl”: “NOT_SET”, “private_”, “public_read”, “public_read_write”, “authenticated_read”. The following three are found only in Aws::S3::Model::ObjectCannedACL: “aws_exec_read”, “owner_read”, “bucket_owner_full_control”. Default: “NOT_SET”
vfs.s3.config_source
Force S3 SDK to only load config options from a set source. The supported options are auto (TileDB config options are considered first, then SDK-defined precedence: env vars, config files, ec2 metadata), config_files (forces SDK to only consider options found in aws config files), sts_profile_with_web_identity (force SDK to consider assume roles/sts from config files with support for web tokens, commonly used by EKS/ECS). Default: auto
vfs.s3.install_sigpipe_handler
When set to true, the S3 SDK uses a handler that ignores SIGPIPE signals. Default: “true”
vfs.hdfs.name_node_uri
Name node for HDFS. Default: “”
vfs.hdfs.username
HDFS username. Default: “”
vfs.hdfs.kerb_ticket_cache_path
HDFS kerb ticket cache path. Default: “”
config.env_var_prefix
Prefix of environmental variables for reading configuration parameters. Default: “TILEDB_”
config.logging_level
The logging level configured, possible values: “0”: fatal, “1”: error, “2”: warn, “3”: info, “4”: debug, “5”: trace. Default: “1” if the --enable-verbose bootstrap flag is provided, “0” otherwise
config.logging_format
The logging format configured (DEFAULT or JSON). Default: “DEFAULT”
rest.server_address
URL for REST server to use for remote arrays. Default: “https://api.tiledb.com”
rest.server_serialization_format
Serialization format to use for remote array requests (CAPNP or JSON). Default: “CAPNP”
rest.username
Username for login to REST server. Default: “”
rest.password
Password for login to REST server. Default: “”
rest.token
Authentication token for REST server (used instead of username/password). Default: “”
rest.resubmit_incomplete
If true, incomplete queries received from server are automatically resubmitted before returning to user control. Default: “true”
rest.ignore_ssl_validation
Have curl ignore ssl peer and host validation for REST server. Default: false
rest.creation_access_credentials_name
The name of the registered access key to use for creation of the REST server. Default: no default set
rest.retry_http_codes
CSV list of http status codes to automatically retry a REST request for. Default: “503”
rest.retry_count
Number of times to retry failed REST requests. Default: 25
rest.retry_initial_delay_ms
Initial delay in milliseconds to wait until retrying a REST request. Default: 500
rest.retry_delay_factor
The delay factor to exponentially wait until further retries of a failed REST request. Default: 1.25
rest.curl.retry_errors
If true, any curl requests that returned an error will be retried. Default: true
rest.curl.verbose
Set curl to run in verbose mode for REST requests. curl will print to stdout with this option. Default: false
rest.load_metadata_on_array_open
If true, array metadata will be loaded and sent to server together with the open array. Default: true
rest.load_non_empty_domain_on_array_open
If true, array non empty domain will be loaded and sent to server together with the open array. Default: true
rest.use_refactored_array_open
If true, the new REST routes and APIs for opening an array will be used. Default: true
rest.use_refactored_array_open_and_query_submit
If true, the new REST routes and APIs for opening an array and submitting a query will be used. Default: true
rest.curl.buffer_size
Set curl buffer size for REST requests. Default: 524288 (512KB)
rest.capnp_traversal_limit
CAPNP traversal limit used in the deserialization of messages (bytes). Default: 2147483648 (2GB)
rest.custom_headers.*
(Optional) Prefix for custom headers on REST requests. For each custom header, use “rest.custom_headers.header_key” = “header_value”. No default.
rest.payer_namespace
The namespace that should be charged for the request. Default: no default set
filestore.buffer_size
Specifies the size in bytes of the internal buffers used in the filestore API. The size should be bigger than the minimum tile size filestore currently supports, which is currently 1024 bytes. Default: 100MB
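Two of the parameter descriptions above involve simple arithmetic, sketched here. weighted_share and buffered_upload_bytes are hypothetical helpers, not TileDB functions. The first assumes that sm.mem.total_budget is divided in direct proportion to the three consolidation weights, which the descriptions suggest but do not state outright. The second restates the part-size times parallel-ops note given for the Azure, GCS and S3 upload parameters.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the portion of `total_budget` given to a component
// with `weight`, assuming a proportional split over `weight_sum`.
inline std::uint64_t weighted_share(
    std::uint64_t total_budget, std::uint64_t weight, std::uint64_t weight_sum) {
  assert(weight_sum > 0 && weight <= weight_sum);
  return total_budget * weight / weight_sum;
}

// Bytes buffered before part (or block) uploads are issued in parallel:
// the part size multiplied by the maximum number of parallel operations.
inline std::uint64_t buffered_upload_bytes(
    std::uint64_t part_size, std::uint64_t max_parallel_ops) {
  return part_size * max_parallel_ops;
}
```

Under the proportional assumption, the default weights 1, 3 and 2 would give the reader half of the budget, the writer a third, and the consolidation buffers a sixth.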
-
inline std::string get(const std::string &param) const¶
Get a parameter from the configuration by key.
- Parameters:
param – Name of configuration parameter
- Throws:
TileDBError – if the parameter does not exist
- Returns:
Value of configuration parameter
-
inline bool contains(const std::string_view &param) const¶
Check if a configuration parameter exists.
- Parameters:
param – Name of configuration parameter
- Returns:
true if the parameter exists, false otherwise
-
inline impl::ConfigProxy operator[](const std::string &param)¶
Operator that enables setting parameters with [].
Example:
Config conf;
conf["vfs.s3.region"] = "us-east-1";
conf["vfs.s3.use_virtual_addressing"] = "true";
Context ctx(conf);
- Parameters:
param – Name of parameter to set
- Returns:
“Proxy” object supporting assignment.
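How a proxy object makes conf["key"] = "value" work can be shown in miniature. MiniConfig and MiniProxy are hypothetical stand-ins written for this sketch; the real impl::ConfigProxy may be implemented differently.

```cpp
#include <map>
#include <string>
#include <utility>

class MiniConfig;

// Hypothetical proxy: remembers which parameter was indexed and forwards
// assignment and reads back to the owning config.
class MiniProxy {
 public:
  MiniProxy(MiniConfig& conf, std::string param)
      : conf_(conf), param_(std::move(param)) {}
  MiniProxy& operator=(const std::string& value);  // forwards to set()
  operator std::string() const;                    // forwards to get()

 private:
  MiniConfig& conf_;
  std::string param_;
};

// Hypothetical stand-in for tiledb::Config backed by a std::map.
class MiniConfig {
 public:
  MiniConfig& set(const std::string& param, const std::string& value) {
    params_[param] = value;
    return *this;
  }
  std::string get(const std::string& param) const { return params_.at(param); }
  // Returns a proxy by value; the assignment happens on the proxy.
  MiniProxy operator[](const std::string& param) {
    return MiniProxy(*this, param);
  }

 private:
  std::map<std::string, std::string> params_;
};

inline MiniProxy& MiniProxy::operator=(const std::string& value) {
  conf_.set(param_, value);
  return *this;
}

inline MiniProxy::operator std::string() const { return conf_.get(param_); }
```

Returning a proxy rather than a reference lets the class route every write through set(), so any validation done there also applies to the bracket syntax.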
-
inline Config &unset(const std::string &param)¶
Resets a config parameter to its default value.
- Parameters:
param – Name of parameter
- Returns:
Reference to this Config instance
-
inline iterator begin(const std::string &prefix)¶
Iterate over params starting with a prefix.
Example:
tiledb::Config config;
for (auto it = config.begin("vfs"), ite = config.end(); it != ite; ++it) {
  std::string name = it->first, value = it->second;
}
- Parameters:
prefix – Prefix to iterate over
- Returns:
iterator
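The idea of iterating only over parameters that share a prefix can be sketched over a plain sorted map. params_with_prefix is a hypothetical helper and says nothing about how TileDB's iterator is implemented. It returns full parameter names; whether the library reports names with or without the prefix is not stated here.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: collects the names in `params` that start with `prefix`.
inline std::vector<std::string> params_with_prefix(
    const std::map<std::string, std::string>& params,
    const std::string& prefix) {
  std::vector<std::string> names;
  // In a sorted map, keys sharing a prefix are contiguous and the first of
  // them is found by lower_bound(prefix); stop at the first non-match.
  for (auto it = params.lower_bound(prefix);
       it != params.end() &&
       it->first.compare(0, prefix.size(), prefix) == 0;
       ++it) {
    names.push_back(it->first);
  }
  return names;
}
```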
-
inline iterator begin()¶
Iterate over all params.
Example:
tiledb::Config config;
for (auto it = config.begin(), ite = config.end(); it != ite; ++it) {
  std::string name = it->first, value = it->second;
}
- Returns:
iterator
-
inline iterator end()¶
End iterator.
Public Static Functions
-
static inline void free(tiledb_config_t *config)¶
Wrapper function for freeing a config C object.
Exceptions¶
-
struct TileDBError : public std::runtime_error¶
Exception indicating a TileDB error.
Subclassed by tiledb::AttributeError, tiledb::SchemaMismatch, tiledb::TypeError
-
struct TypeError : public tiledb::TileDBError¶
Exception indicating a mismatch between a static and runtime type
Subclassed by tiledb::FilterOptionTypeError< Expected, Actual >
-
struct SchemaMismatch : public tiledb::TileDBError¶
Exception indicating the requested operation does not match array schema
-
struct AttributeError : public tiledb::TileDBError¶
Error related to attributes
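Because every exception listed here derives from TileDBError, which in turn derives from std::runtime_error, handlers can be ordered from most to least specific. The sketch below uses local stand-in types with the same inheritance shape so that it compiles without TileDB; classify is a hypothetical helper.

```cpp
#include <stdexcept>
#include <string>

// Stand-ins mirroring the documented hierarchy; real code would use the
// types from the TileDB headers instead.
struct TileDBError : std::runtime_error {
  explicit TileDBError(const std::string& msg) : std::runtime_error(msg) {}
};
struct TypeError : TileDBError {
  explicit TypeError(const std::string& msg) : TileDBError(msg) {}
};

// Hypothetical helper: runs `fn` and reports which handler caught its exception.
template <typename Fn>
std::string classify(Fn fn) {
  try {
    fn();
  } catch (const TypeError&) {
    return "TypeError";  // most derived type first
  } catch (const TileDBError&) {
    return "TileDBError";  // any other library error
  } catch (const std::exception&) {
    return "std::exception";  // unrelated standard exceptions
  }
  return "none";
}
```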
Dimension¶
-
class Dimension¶
Describes one dimension of an Array. The dimension consists of a type, lower and upper bound, and tile-extent describing the memory ordering. Dimensions are added to a Domain.
Example:
tiledb::Context ctx;
tiledb::Domain domain(ctx);
// Create a dimension with inclusive domain [0,1000] and tile extent 100.
domain.add_dimension(Dimension::create<int32_t>(ctx, "d", {{0, 1000}}, 100));
Note: as laid out in the Storage Format, the following Datatypes are not valid for Dimension: TILEDB_CHAR, TILEDB_BLOB, TILEDB_GEOM_WKB, TILEDB_GEOM_WKT, TILEDB_BOOL, TILEDB_STRING_UTF8, TILEDB_STRING_UTF16, TILEDB_STRING_UTF32, TILEDB_STRING_UCS2, TILEDB_STRING_UCS4, TILEDB_ANY
Public Functions
-
inline unsigned cell_val_num() const¶
Returns number of values of one cell on this dimension. For variable-sized dimensions returns TILEDB_VAR_NUM.
-
inline FilterList filter_list() const¶
Returns a copy of the FilterList of the dimension. To change the filter list, use set_filter_list().
-
inline Dimension &set_filter_list(const FilterList &filter_list)¶
Sets the dimension filter list, which is an ordered list of filters that will be used to process and/or transform the coordinate data (such as compression).
-
inline const std::string name() const¶
Returns the name of the dimension.
-
inline tiledb_datatype_t type() const¶
Returns the dimension datatype.
-
template<typename T>
inline std::pair<T, T> domain() const¶
Returns the domain of the dimension.
- Template Parameters:
T – Domain datatype
- Returns:
Pair of [lower, upper] inclusive bounds.
-
inline std::string domain_to_str() const¶
Returns a string representation of the domain.
- Throws:
TileDBError – if the domain cannot be stringified (TILEDB_ANY)
-
inline std::string tile_extent_to_str() const¶
Returns a string representation of the extent.
- Throws:
TileDBError – if the domain cannot be stringified (TILEDB_ANY)
-
inline std::shared_ptr<tiledb_dimension_t> ptr() const¶
Returns a shared pointer to the C TileDB dimension object.
-
TILEDB_DEPRECATED inline void dump(FILE *out = stdout) const¶
Dumps information about the dimension in an ASCII representation to an output.
- Parameters:
out – (Optional) File to dump output to. Defaults to stdout.
Public Static Functions
-
template<typename T>
static inline Dimension create(const Context &ctx, const std::string &name, const std::array<T, 2> &domain, T extent)¶ Factory function for creating a new dimension with datatype T.
Example:
    tiledb::Context ctx;
    // Create a dimension with inclusive domain [0,1000] and tile extent 100.
    auto dim = Dimension::create<int32_t>(ctx, "d", {{0, 1000}}, 100);
- Template Parameters:
T – int, char, etc…
- Parameters:
ctx – The TileDB context.
name – The dimension name.
domain – The dimension domain. A pair [lower,upper] of inclusive bounds.
extent – The tile extent on the dimension.
- Returns:
A new
Dimension
object.
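To make the relationship between the inclusive domain and the tile extent concrete, here is a minimal standalone sketch in plain C++ with no TileDB calls. The helper names `tile_count` and `tile_index` are hypothetical, and the arithmetic assumes an integer dimension whose tiles are laid out back to back starting at the lower bound.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of tiles needed to cover the inclusive
// domain [lower, upper] with tiles of `extent` cells each.
inline std::uint64_t tile_count(std::int64_t lower, std::int64_t upper,
                                std::uint64_t extent) {
  assert(lower <= upper && extent > 0);
  const std::uint64_t cells = static_cast<std::uint64_t>(upper - lower) + 1;
  return (cells + extent - 1) / extent;  // ceiling division
}

// Hypothetical helper: zero-based index of the tile that contains `coord`.
inline std::uint64_t tile_index(std::int64_t lower, std::uint64_t extent,
                                std::int64_t coord) {
  assert(coord >= lower && extent > 0);
  return static_cast<std::uint64_t>(coord - lower) / extent;
}
```

Under these assumptions, a domain of [0, 1000] with extent 100 holds 1001 cells, so `tile_count(0, 1000, 100)` yields 11 and `tile_index(0, 100, 250)` yields 2.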
-
template<typename T>
static inline Dimension create(const Context &ctx, const std::string &name, const std::array<T, 2> &domain)¶
Factory function for creating a new dimension with datatype T and without specifying a tile extent.
Example:
    tiledb::Context ctx;
    // Create a dimension with inclusive domain [0,1000] and no tile extent.
    auto dim = Dimension::create<int32_t>(ctx, "d", {{0, 1000}});
- Template Parameters:
T – int, char, etc…
- Parameters:
ctx – The TileDB context.
name – The dimension name.
domain – The dimension domain. A pair [lower,upper] of inclusive bounds.
- Returns:
A new
Dimension
object.
-
static inline Dimension create(const Context &ctx, const std::string &name, tiledb_datatype_t datatype, const void *domain, const void *extent)¶
Factory function for creating a new dimension (non typechecked).
- Parameters:
ctx – The TileDB context.
name – The dimension name.
datatype – The dimension datatype.
domain – The dimension domain. A pair [lower,upper] of inclusive bounds.
extent – The tile extent on the dimension.
- Returns:
A new
Dimension
object.
Domain¶
-
class Domain¶
Represents the domain of an array.
A Domain defines the set of Dimension objects for a given array. The properties of a Domain derive from the underlying dimensions. A Domain is a component of an ArraySchema.
Example:
    tiledb::Context ctx;
    tiledb::Domain domain(ctx);
    // Note the dimension bounds are inclusive.
    auto d1 = tiledb::Dimension::create<int>(ctx, "d1", {-10, 10});
    auto d2 = tiledb::Dimension::create<uint64_t>(ctx, "d2", {1, 10});
    auto d3 = tiledb::Dimension::create<int>(ctx, "d3", {-100, 100});
    domain.add_dimension(d1);
    domain.add_dimension(d2); // Throws error, all dims must be same type
    domain.add_dimension(d3);
    domain.cell_num(); // (10 - -10 + 1) * (10 - 1 + 1) = 210 max cells
    domain.type(); // TILEDB_INT32, determined from the dimensions
    domain.ndim(); // 2, d1 and d2
    tiledb::ArraySchema schema(ctx, TILEDB_DENSE);
    schema.set_domain(domain); // Set the array's domain
Note
Dimensions can only have signed or unsigned integral types. Floating-point types are additionally allowed for sparse array domains.
Public Functions
-
inline uint64_t cell_num() const¶
Returns the total number of cells in the domain. Throws an exception if the domain type is float32 or float64.
- Throws:
TileDBError – if cell_num cannot be computed.
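The cell count of an integer domain is the product of the inclusive range sizes of its dimensions. The following standalone sketch illustrates that arithmetic only; `Bounds` and `cell_count` are hypothetical names, no TileDB calls are made, and overflow of the 64-bit product is not handled.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for one dimension's inclusive [lower, upper] bounds.
using Bounds = std::pair<std::int64_t, std::int64_t>;

// Product of the inclusive range sizes of all dimensions.
inline std::uint64_t cell_count(const std::vector<Bounds>& dims) {
  std::uint64_t total = 1;
  for (const Bounds& b : dims) {
    assert(b.first <= b.second);
    total *= static_cast<std::uint64_t>(b.second - b.first) + 1;
  }
  return total;
}
```

For bounds [-10, 10] and [1, 10] this gives 21 * 10 = 210, matching the figure quoted in the Domain example.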
-
TILEDB_DEPRECATED inline void dump(FILE *out = stdout) const¶
Dumps the domain in an ASCII representation to an output.
- Parameters:
out – (Optional) File to dump output to. Defaults to
stdout
.
-
inline tiledb_datatype_t type() const¶
Returns the domain type.
-
inline unsigned ndim() const¶
Returns the number of dimensions.
-
inline std::vector<Dimension> dimensions() const¶
Returns the current set of dimensions in the domain.
-
inline Dimension dimension(const std::string &name) const¶
Returns the dimension with the given name.
-
inline Domain &add_dimension(const Dimension &d)¶
Adds a new dimension to the domain.
Example:
    tiledb::Context ctx;
    tiledb::Domain domain(ctx);
    auto d1 = tiledb::Dimension::create<int>(ctx, "d1", {-10, 10});
    domain.add_dimension(d1);
-
template<typename ...Args>
inline Domain &add_dimensions(Args... dims)¶
Adds multiple dimensions to the domain.
Example:
    tiledb::Context ctx;
    tiledb::Domain domain(ctx);
    auto d1 = tiledb::Dimension::create<int>(ctx, "d1", {-10, 10});
    auto d2 = tiledb::Dimension::create<int>(ctx, "d2", {1, 10});
    auto d3 = tiledb::Dimension::create<int>(ctx, "d3", {-100, 100});
    domain.add_dimensions(d1, d2, d3);
- Template Parameters:
Args – Variadic dimension datatype
- Parameters:
dims – Dimensions to add
- Returns:
Reference to this Domain.
-
inline bool has_dimension(const std::string &name) const¶
Checks if the domain has a dimension of the given name.
- Parameters:
name – Name of dimension to check for
- Returns:
True if the domain has a dimension of the given name.
-
inline std::shared_ptr<tiledb_domain_t> ptr() const¶
Returns a shared pointer to the C TileDB domain object.
Attribute¶
-
class Attribute¶
Describes an attribute of an Array cell.
An attribute specifies a name and datatype for a particular value in each array cell. There are 3 supported attribute types:
Fundamental types, such as char, int, double, uint64_t, etc.
Fixed-size arrays: T[N] or std::array<T, N>, where T is a fundamental type
Variable-length data: std::string, std::vector<T> where T is a fundamental type
Fixed-size array types using POD types like std::array<T, N> are internally converted to byte-array attributes. E.g. an attribute of type std::array<float, 3> will be created as an attribute of type TILEDB_CHAR with cell_val_num sizeof(std::array<float, 3>).
Therefore, for fixed-length attributes it is recommended to use C-style arrays instead, e.g. float[3] instead of std::array<float, 3>.
Example:
    tiledb::Context ctx;
    auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
    auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
    auto a3 = tiledb::Attribute::create<float[3]>(ctx, "a3");
    // Change compression scheme
    tiledb::FilterList filters(ctx);
    filters.add_filter({ctx, TILEDB_FILTER_BZIP2});
    a1.set_filter_list(filters);
    // Add attributes to a schema
    tiledb::ArraySchema schema(ctx, TILEDB_DENSE);
    schema.add_attributes(a1, a2, a3);
Public Functions
-
inline Attribute(const Context &ctx, const std::string &name, tiledb_datatype_t type)¶
Construct an attribute with a name and enumerated type. cell_val_num will be set to 1.
- Parameters:
ctx – TileDB context
name – Name of attribute
type – Enumerated type of attribute
-
inline Attribute(const Context &ctx, const std::string &name, tiledb_datatype_t type, const FilterList &filter_list)¶
Construct an attribute with an enumerated type and given filter list.
-
inline std::string name() const¶
Returns the name of the attribute.
-
inline tiledb_datatype_t type() const¶
Returns the attribute datatype.
-
inline uint64_t cell_size() const¶
Returns the size (in bytes) of one cell on this attribute. For variable-sized attributes returns TILEDB_VAR_NUM.
Example:
    tiledb::Context ctx;
    auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
    auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
    auto a3 = tiledb::Attribute::create<float[3]>(ctx, "a3");
    auto a4 = tiledb::Attribute::create<std::array<float, 3>>(ctx, "a4");
    a1.cell_size(); // Returns sizeof(int)
    a2.cell_size(); // Variable sized attribute, returns TILEDB_VAR_NUM
    a3.cell_size(); // Returns 3 * sizeof(float)
    a4.cell_size(); // Stored as byte array, returns sizeof(char).
-
inline unsigned cell_val_num() const¶
Returns number of values of one cell on this attribute. For variable-sized attributes returns TILEDB_VAR_NUM.
Example:
    tiledb::Context ctx;
    auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
    auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
    auto a3 = tiledb::Attribute::create<float[3]>(ctx, "a3");
    auto a4 = tiledb::Attribute::create<std::array<float, 3>>(ctx, "a4");
    a1.cell_val_num(); // Returns 1
    a2.cell_val_num(); // Variable sized attribute, returns TILEDB_VAR_NUM
    a3.cell_val_num(); // Returns 3
    a4.cell_val_num(); // Stored as byte array, returns sizeof(std::array<float, 3>).
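The mapping from a C++ type to a number of values per cell can be pictured with a small type trait. This is only an illustrative sketch of the rules listed above, not TileDB's implementation: `ValuesPerCell`, `values_per_cell` and `kVarNum` are hypothetical names, and `kVarNum` merely stands in for TILEDB_VAR_NUM.

```cpp
#include <array>
#include <cstddef>
#include <limits>
#include <string>
#include <type_traits>
#include <vector>

// Stand-in for TILEDB_VAR_NUM; the real constant comes from the TileDB headers.
constexpr unsigned kVarNum = std::numeric_limits<unsigned>::max();

// Fallback: any other trivially copyable type is treated as a run of bytes.
template <typename T, typename = void>
struct ValuesPerCell {
  static_assert(std::is_trivially_copyable<T>::value, "unsupported type");
  static constexpr unsigned value = sizeof(T);
};

// Arithmetic types: one value per cell.
template <typename T>
struct ValuesPerCell<
    T, typename std::enable_if<std::is_arithmetic<T>::value>::type> {
  static constexpr unsigned value = 1;
};

// C-style arrays T[N]: N values per cell.
template <typename T, std::size_t N>
struct ValuesPerCell<T[N], void> {
  static constexpr unsigned value = N;
};

// std::string and std::vector<T>: variable number of values per cell.
template <>
struct ValuesPerCell<std::string, void> {
  static constexpr unsigned value = kVarNum;
};
template <typename T>
struct ValuesPerCell<std::vector<T>, void> {
  static constexpr unsigned value = kVarNum;
};

// Convenience wrapper so callers can write values_per_cell<T>().
template <typename T>
constexpr unsigned values_per_cell() {
  return ValuesPerCell<T>::value;
}
```

In this sketch `values_per_cell<float[3]>()` is 3, while `values_per_cell<std::array<float, 3>>()` falls through to the byte-array case and equals `sizeof(std::array<float, 3>)`, which is why the C-style array form is the recommended one.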
-
inline Attribute &set_cell_val_num(unsigned num)¶
Sets the number of attribute values per cell. This is inferred from the type parameter of the Attribute::create<T>() function, but can also be set manually.
Example:
    // a1 and a2 are equivalent:
    auto a1 = Attribute::create<std::vector<int>>(...);
    auto a2 = Attribute::create<int>(...);
    a2.set_cell_val_num(TILEDB_VAR_NUM);
- Parameters:
num – Cell val number to set.
- Returns:
Reference to this Attribute
-
inline Attribute &set_fill_value(const void *value, uint64_t size)¶
Sets the default fill value for the input attribute. This value will be used for the input attribute whenever querying (1) an empty cell in a dense array, or (2) a non-empty cell (in either dense or sparse array) when values on the input attribute are missing (e.g., if the user writes a subset of the attributes in a write operation).
Applicable to both fixed-sized and var-sized attributes.
Example:
    tiledb::Context ctx;
    // Fixed-sized attribute
    auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
    int32_t int_value = 0;
    uint64_t size = sizeof(int_value);
    a1.set_fill_value(&int_value, size);
    // Var-sized attribute
    auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
    std::string str_value("null");
    a2.set_fill_value(str_value.c_str(), str_value.size());
Note
A call to set_cell_val_num sets the fill value of the attribute to its default. Therefore, make sure you invoke set_fill_value after deciding on the number of values this attribute will hold in each cell.
Note
For fixed-sized attributes, the input size should be equal to the cell size.
- Parameters:
value – The fill value to set.
size – The fill value size in bytes.
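The effect of a fill value on a dense read can be modelled in a few lines of plain C++. This is a conceptual sketch only: `read_dense` is a hypothetical helper, the `written` mask is an assumption standing in for whatever bookkeeping the storage engine actually does, and no TileDB calls are involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the values a reader would see for a dense 1D attribute:
// written cells keep their value, unwritten cells yield the fill value.
inline std::vector<int> read_dense(const std::vector<int>& values,
                                   const std::vector<bool>& written,
                                   int fill_value) {
  assert(values.size() == written.size());
  std::vector<int> out(values.size());
  for (std::size_t i = 0; i < values.size(); ++i)
    out[i] = written[i] ? values[i] : fill_value;
  return out;
}
```

Calling `read_dense({7, 0, 9}, {true, false, true}, -1)` produces 7, -1, 9: the middle cell was never written, so the fill value is substituted.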
-
inline void get_fill_value(const void **value, uint64_t *size)¶
Gets the default fill value for the input attribute. This value will be used for the input attribute whenever querying (1) an empty cell in a dense array, or (2) a non-empty cell (in either dense or sparse array) when values on the input attribute are missing (e.g., if the user writes a subset of the attributes in a write operation).
Applicable to both fixed-sized and var-sized attributes.
Example:
    // Fixed-sized attribute
    auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
    const void* value1;
    uint64_t size1;
    a1.get_fill_value(&value1, &size1);
    // Var-sized attribute
    auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
    const void* value2;
    uint64_t size2;
    a2.get_fill_value(&value2, &size2);
- Parameters:
value – A pointer to the fill value to get.
size – The size of the fill value to get.
-
inline Attribute &set_fill_value(const void *value, uint64_t size, uint8_t valid)¶
Sets the default fill value for the input, nullable attribute. This value will be used for the input attribute whenever querying (1) an empty cell in a dense array, or (2) a non-empty cell (in either dense or sparse array) when values on the input attribute are missing (e.g., if the user writes a subset of the attributes in a write operation).
Applicable to both fixed-sized and var-sized attributes.
Example:
    tiledb::Context ctx;
    // Fixed-sized attribute
    auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
    a1.set_nullable(true);
    int32_t int_value = 0;
    uint64_t size = sizeof(int_value);
    uint8_t valid = 0;
    a1.set_fill_value(&int_value, size, valid);
    // Var-sized attribute
    auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
    a2.set_nullable(true);
    std::string str_value("null");
    a2.set_fill_value(str_value.c_str(), str_value.size(), valid);
Note
A call to set_cell_val_num sets the fill value of the attribute to its default. Therefore, make sure you invoke set_fill_value after deciding on the number of values this attribute will hold in each cell.
Note
For fixed-sized attributes, the input size should be equal to the cell size.
- Parameters:
value – The fill value to set.
size – The fill value size in bytes.
valid – The validity fill value, zero for a null value and non-zero for a valid attribute.
-
inline void get_fill_value(const void **value, uint64_t *size, uint8_t *valid)¶
Gets the default fill value for the input attribute. This value will be used for the input attribute whenever querying (1) an empty cell in a dense array, or (2) a non-empty cell (in either dense or sparse array) when values on the input attribute are missing (e.g., if the user writes a subset of the attributes in a write operation).
Applicable to both fixed-sized and var-sized attributes.
Example:
    // Fixed-sized attribute
    auto a1 = tiledb::Attribute::create<int>(ctx, "a1");
    a1.set_nullable(true);
    const void* value1;
    uint64_t size1;
    uint8_t valid1;
    a1.get_fill_value(&value1, &size1, &valid1);
    // Var-sized attribute
    auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2");
    a2.set_nullable(true);
    const void* value2;
    uint64_t size2;
    uint8_t valid2;
    a2.get_fill_value(&value2, &size2, &valid2);
- Parameters:
value – A pointer to the fill value to get.
size – The size of the fill value to get.
valid – The fill value validity to get.
-
inline bool variable_sized() const¶
Check if attribute is variable sized.
-
inline FilterList filter_list() const¶
Returns a copy of the FilterList of the attribute. To change the filter list, use set_filter_list().
- Returns:
Copy of the attribute FilterList.
-
inline Attribute &set_filter_list(const FilterList &filter_list)¶
Sets the attribute filter list, which is an ordered list of filters that will be used to process and/or transform the attribute data (such as compression).
-
inline Attribute &set_nullable(bool nullable)¶
Sets the nullability of an attribute.
Example:
auto a1 = Attribute::create<int>(...); a1.set_nullable(true);
- Parameters:
nullable – Whether the attribute is nullable.
- Returns:
Reference to this Attribute
-
inline bool nullable() const¶
Gets the nullability of an attribute.
Example:
auto a1 = Attribute::create<int>(...); auto nullable = a1.nullable();
- Returns:
Whether the attribute is nullable.
-
inline std::shared_ptr<tiledb_attribute_t> ptr() const¶
Returns the C TileDB attribute object pointer.
-
TILEDB_DEPRECATED inline void dump(FILE *out = stdout) const¶
Dumps information about the attribute in an ASCII representation to an output.
- Parameters:
out – (Optional) File to dump output to. Defaults to
stdout
.
Public Static Functions
-
template<typename T>
static inline Attribute create(const Context &ctx, const std::string &name)¶
Factory function for creating a new attribute with datatype T.
Example:
tiledb::Context ctx; auto a1 = tiledb::Attribute::create<int>(ctx, "a1"); auto a2 = tiledb::Attribute::create<std::string>(ctx, "a2"); auto a3 = tiledb::Attribute::create<std::array<float, 3>>(ctx, "a3"); auto a4 = tiledb::Attribute::create<std::vector<double>>(ctx, "a4"); auto a5 = tiledb::Attribute::create<char[8]>(ctx, "a5");
- Template Parameters:
T – Datatype of the attribute. Can either be arithmetic type, C-style array, std::string, std::vector, or any trivially copyable classes (defined by std::is_trivially_copyable).
- Parameters:
ctx – The TileDB context.
name – The attribute name.
- Returns:
A new Attribute object.
-
static inline Attribute create(const Context &ctx, const std::string &name, tiledb_datatype_t type)¶
Factory function taking the type as a tiledb_datatype_t variable.
-
template<typename T>
static inline Attribute create(const Context &ctx, const std::string &name, const FilterList &filter_list)¶
Factory function for creating a new attribute with datatype T and a FilterList.
Example:
tiledb::Context ctx; tiledb::FilterList filter_list(ctx); filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE}) .add_filter({ctx, TILEDB_FILTER_BZIP2}); auto a1 = tiledb::Attribute::create<int>(ctx, "a1", filter_list);
- Template Parameters:
T – Datatype of the attribute. Can either be arithmetic type, C-style array, std::string, std::vector, or any trivially copyable classes (defined by std::is_trivially_copyable).
- Parameters:
ctx – The TileDB context.
name – The attribute name.
filter_list – FilterList to use for attribute
- Returns:
A new Attribute object.
Array Schema¶
-
class ArraySchema : public tiledb::Schema¶
Schema describing an array.
The schema is an independent description of an array. A schema can be used to create multiple arrays, and stores information about the array domain, cell types, and compression details. An array schema is composed of:
A Domain
A set of Attributes
Memory layout definitions: tile and cell
Compression details for Array level factors like offsets and coordinates
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE); // Or TILEDB_DENSE
    // Create a Domain
    tiledb::Domain domain(...);
    // Create Attributes
    auto a1 = tiledb::Attribute::create(...);
    schema.set_domain(domain);
    schema.add_attribute(a1);
    // Specify tile memory layout
    schema.set_tile_order(TILEDB_ROW_MAJOR);
    // Specify cell memory layout within each tile
    schema.set_cell_order(TILEDB_ROW_MAJOR);
    schema.set_capacity(10); // For sparse, set capacity of each tile
    // Create the array on persistent storage with the schema.
    tiledb::Array::create("my_array", schema);
Public Functions
-
inline explicit ArraySchema(const Context &ctx, tiledb_array_type_t type)¶
Creates a new array schema.
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
- Parameters:
ctx – TileDB context
type – Array type, sparse or dense.
-
inline ArraySchema(const Context &ctx, const std::string &uri)¶
Loads the schema of an existing array.
Example:
tiledb::Context ctx; tiledb::ArraySchema schema(ctx, "s3://bucket-name/array-name");
- Parameters:
ctx – TileDB context
uri – URI of array
-
inline ArraySchema(const Context &ctx, tiledb_array_schema_t *schema)¶
Loads the schema of an existing array with the input C array schema object.
- Parameters:
ctx – TileDB context
schema – C API array schema object
-
TILEDB_DEPRECATED inline virtual void dump(FILE *out = stdout) const override¶
Dumps the array schema in an ASCII representation to an output.
- Parameters:
out – (Optional) File to dump output to. Defaults to
stdout
.
-
inline tiledb_array_type_t array_type() const¶
Returns the array type.
-
inline uint64_t capacity() const¶
Returns the tile capacity.
-
inline ArraySchema &set_capacity(uint64_t capacity)¶
Sets the tile capacity.
- Parameters:
capacity – The capacity of a sparse data tile. Note that sparse data tiles exist in sparse fragments, which can be created in sparse arrays only. For more details, see tutorials/tiling-sparse.html.
- Returns:
Reference to this
ArraySchema
instance.
-
inline bool allows_dups() const¶
Returns
true
if the array allows coordinate duplicates.
-
inline ArraySchema &set_allows_dups(bool allows_dups)¶
Sets whether the array allows coordinate duplicates. Throws an exception if set to true for a dense array.
-
inline uint32_t version() const¶
Returns the version of the array schema object.
-
inline tiledb_layout_t tile_order() const¶
Returns the tile order.
-
inline ArraySchema &set_tile_order(tiledb_layout_t layout)¶
Sets the tile order.
- Parameters:
layout – Tile order to set.
- Returns:
Reference to this
ArraySchema
instance.
-
inline ArraySchema &set_order(const std::array<tiledb_layout_t, 2> &p)¶
Sets both the tile and cell orders.
- Parameters:
p – Pair of {tile order, cell order}
- Returns:
Reference to this
ArraySchema
instance.
-
inline tiledb_layout_t cell_order() const¶
Returns the cell order.
-
inline ArraySchema &set_cell_order(tiledb_layout_t layout)¶
Sets the cell order.
- Parameters:
layout – Cell order to set.
- Returns:
Reference to this
ArraySchema
instance.
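As a rough illustration of what a row-major or column-major cell order means inside a single 2D tile, the sketch below maps a cell's (row, col) position to its linear position. `Order` and `cell_position` are hypothetical names introduced for this example, standing in for the layout constants. The same indexing idea applies to ordering whole tiles.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical enum standing in for the row-major / column-major constants.
enum class Order { RowMajor, ColMajor };

// Linear position of cell (row, col) inside a rows x cols tile.
inline std::size_t cell_position(Order order, std::size_t rows,
                                 std::size_t cols, std::size_t row,
                                 std::size_t col) {
  assert(row < rows && col < cols);
  return order == Order::RowMajor ? row * cols + col   // rows are contiguous
                                  : col * rows + row;  // columns are contiguous
}
```

In a 2 x 3 tile, cell (1, 0) sits at position 3 in row-major order but at position 1 in column-major order.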
-
inline FilterList coords_filter_list() const¶
Returns a copy of the FilterList of the coordinates. To change the coordinate compressor, use set_coords_filter_list().
- Returns:
Copy of the coordinates FilterList.
-
inline ArraySchema &set_coords_filter_list(const FilterList &filter_list)¶
Sets the FilterList for the coordinates, which is an ordered list of filters that will be used to process and/or transform the coordinate data (such as compression).
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    tiledb::FilterList filter_list(ctx);
    filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE})
        .add_filter({ctx, TILEDB_FILTER_BZIP2});
    schema.set_coords_filter_list(filter_list);
- Parameters:
filter_list – FilterList to use
- Returns:
Reference to this
ArraySchema
instance.
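Because a filter list is ordered, the order in which filters are added matters. The standalone sketch below models a filter as a forward transform plus its inverse, and assumes that a read undoes the filters in reverse order of the write. `Filter`, `delta_filter`, `offset_filter`, `apply_on_write` and `undo_on_read` are hypothetical names, and the two toy filters are not TileDB filters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using Data = std::vector<std::int64_t>;

// Hypothetical filter: a forward transform for writes, its inverse for reads.
struct Filter {
  std::function<Data(const Data&)> forward;
  std::function<Data(const Data&)> inverse;
};

// Toy filter: replace each value with the difference from its predecessor.
inline Filter delta_filter() {
  Filter f;
  f.forward = [](const Data& in) {
    Data out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
      out[i] = in[i] - (i > 0 ? in[i - 1] : 0);
    return out;
  };
  f.inverse = [](const Data& in) {
    Data out(in.size());
    std::int64_t running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
      running += in[i];
      out[i] = running;
    }
    return out;
  };
  return f;
}

// Toy filter: add a constant to every value.
inline Filter offset_filter(std::int64_t k) {
  Filter f;
  f.forward = [k](const Data& in) {
    Data out(in);
    for (auto& v : out) v += k;
    return out;
  };
  f.inverse = [k](const Data& in) {
    Data out(in);
    for (auto& v : out) v -= k;
    return out;
  };
  return f;
}

// Write path: apply every filter in list order.
inline Data apply_on_write(const std::vector<Filter>& filters, Data data) {
  for (const auto& f : filters) {
    assert(static_cast<bool>(f.forward));
    data = f.forward(data);
  }
  return data;
}

// Read path: undo the filters in reverse list order.
inline Data undo_on_read(const std::vector<Filter>& filters, Data data) {
  for (auto it = filters.rbegin(); it != filters.rend(); ++it) {
    assert(static_cast<bool>(it->inverse));
    data = it->inverse(data);
  }
  return data;
}
```

With the list {delta, offset by 1}, the input 10, 12, 15 is stored as 11, 3, 4, and the read path recovers the original. Swapping the two filters would store different bytes, which is the point of the list being ordered.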
-
inline FilterList offsets_filter_list() const¶
Returns a copy of the FilterList of the offsets. To change the offsets compressor, use set_offsets_filter_list().
- Returns:
Copy of the offsets FilterList.
-
inline FilterList validity_filter_list() const¶
Returns a copy of the FilterList of the validity arrays. To change the validity compressor, use set_validity_filter_list().
- Returns:
Copy of the validity FilterList.
-
inline ArraySchema &set_offsets_filter_list(const FilterList &filter_list)¶
Sets the FilterList for the offsets, which is an ordered list of filters that will be used to process and/or transform the offsets data (such as compression).
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    tiledb::FilterList filter_list(ctx);
    filter_list.add_filter({ctx, TILEDB_FILTER_POSITIVE_DELTA})
        .add_filter({ctx, TILEDB_FILTER_LZ4});
    schema.set_offsets_filter_list(filter_list);
- Parameters:
filter_list – FilterList to use
- Returns:
Reference to this
ArraySchema
instance.
-
inline ArraySchema &set_validity_filter_list(const FilterList &filter_list)¶
Sets the FilterList for the validity arrays, which is an ordered list of filters that will be used to process and/or transform the validity data (such as compression).
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    tiledb::FilterList filter_list(ctx);
    filter_list.add_filter({ctx, TILEDB_FILTER_POSITIVE_DELTA})
        .add_filter({ctx, TILEDB_FILTER_LZ4});
    schema.set_validity_filter_list(filter_list);
- Parameters:
filter_list – FilterList to use
- Returns:
Reference to this
ArraySchema
instance.
-
inline Domain domain() const¶
Returns a copy of the schema’s array Domain. To change the domain, use set_domain().
- Returns:
Copy of the array Domain
-
inline ArraySchema &set_domain(const Domain &domain)¶
Sets the array domain.
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    // Create a Domain
    tiledb::Domain domain(...);
    schema.set_domain(domain);
- Parameters:
domain – Domain to use
- Returns:
Reference to this
ArraySchema
instance.
-
inline std::pair<uint64_t, uint64_t> timestamp_range()¶
Get timestamp range of schema.
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    std::pair<uint64_t, uint64_t> timestamp_range = schema.timestamp_range();
- Returns:
Timestamp range of this
ArraySchema
instance.
-
inline virtual ArraySchema &add_attribute(const Attribute &attr) override¶
Adds an Attribute to the array.
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    schema.add_attribute(tiledb::Attribute::create<int32_t>(ctx, "attr_name"));
- Parameters:
attr – The Attribute to add
- Returns:
Reference to this
ArraySchema
instance.
-
inline std::shared_ptr<tiledb_array_schema_t> ptr() const¶
Returns a shared pointer to the C TileDB array schema object.
-
inline virtual void check() const override¶
Validates the schema.
Example:
    tiledb::Context ctx;
    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    // Add domain, attributes, etc...
    try {
      schema.check();
    } catch (const tiledb::TileDBError& e) {
      std::cout << e.what() << "\n";
      exit(1);
    }
- Throws:
TileDBError – if the schema is incorrect or invalid.
-
inline virtual std::unordered_map<std::string, Attribute> attributes() const override¶
Gets all attributes in the array.
- Returns:
Map of attribute name to copy of Attribute instance.
-
inline virtual Attribute attribute(const std::string &name) const override¶
Get a copy of an Attribute in the schema by name.
- Parameters:
name – Name of attribute
- Returns:
Copy of the Attribute with the given name.
-
inline virtual unsigned attribute_num() const override¶
Returns the number of attributes in the schema.
-
inline virtual Attribute attribute(unsigned int i) const override¶
Get a copy of an Attribute in the schema by index. Attributes are ordered the same way they were defined when constructing the array schema.
- Parameters:
i – Index of attribute
- Returns:
Copy of the Attribute at the given index.
-
inline bool has_attribute(const std::string &name) const¶
Checks if the schema has an attribute of the given name.
- Parameters:
name – Name of attribute to check for
- Returns:
True if the schema has an attribute of the given name.
Array¶
-
class Array¶
Class representing a TileDB array object.
An Array object represents array data in TileDB at some persisted location, e.g. on disk, in an S3 bucket, etc. Once an array has been opened for reading or writing, interact with the data through Query objects.
Example:
    tiledb::Context ctx;
    // Create an ArraySchema, add attributes, domain, etc.
    tiledb::ArraySchema schema(...);
    // Create empty array named "my_array" on persistent storage.
    tiledb::Array::create("my_array", schema);
Public Functions
-
inline Array(const Context &ctx, const std::string &array_uri, tiledb_query_type_t query_type, const TemporalPolicy temporal_policy = {}, const EncryptionAlgorithm encryption_algorithm = {})¶
Constructor. This opens the array for the given query type. The destructor calls the close() method.
Example:
    // Open the array for reading
    tiledb::Context ctx;
    tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
- Parameters:
ctx – TileDB context.
array_uri – The array URI.
query_type – Query type to open the array for.
temporal_policy – The TemporalPolicy with which to open the array.
encryption_algorithm – The EncryptionAlgorithm to set on the array.
-
inline Array(const Context &ctx, tiledb_array_t *carray, tiledb_config_t *config)¶
Constructor. This sets the array config.
Example:
tiledb::Context ctx; tiledb_config_t* config;
- Parameters:
ctx – TileDB context.
carray – The array.
config – The array’s config.
-
inline Array(const Context &ctx, tiledb_array_t *carray, bool own = true)¶
Constructor. Creates a TileDB Array instance wrapping the given pointer.
- Parameters:
ctx – tiledb::Context
own=true – If false, disables underlying cleanup upon destruction.
- Throws:
TileDBError – if construction fails
-
inline bool is_open() const¶
Checks if the array is open.
-
inline std::string uri() const¶
Returns the array URI.
-
inline ArraySchema schema() const¶
Get the ArraySchema for the array.
-
inline std::shared_ptr<tiledb_array_t> ptr() const¶
Returns a shared pointer to the C TileDB array object.
-
inline void open(tiledb_query_type_t query_type)¶
Opens the array. The array is opened using a query type as input.
This indicates that queries created for this Array object will inherit the query type. In other words, an Array object is opened to receive only one type of query. It can always be closed and re-opened with another query type. There may also be many different Array objects created and opened with different query types. For instance, one may create and open an array object array_read for reads and another one array_write for writes, and interleave creation and submission of queries for both these array objects.
Example:
    // Open the array for writing
    tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_WRITE);
    // Close and open again for reading.
    array.close();
    array.open(TILEDB_READ);
- Parameters:
query_type – The type of queries the array object will be receiving.
- Throws:
TileDBError – if the array is already open or other error occurred.
-
inline void open(tiledb_query_type_t query_type, uint64_t timestamp)¶
Opens the array. The array is opened using a query type as input.
See Array::open
-
inline void open(tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const std::string &encryption_key)¶
Opens the array. The array is opened using a query type as input.
See Array::open
-
inline void open(tiledb_query_type_t query_type, tiledb_encryption_type_t encryption_type, const std::string &encryption_key, uint64_t timestamp)¶
Opens the array. The array is opened using a query type as input.
See Array::open
-
inline void reopen()¶
Reopens the array (the array must already be open). This is useful when the array was updated after the Array object was created and opened. To sync up with the updates, either close the array and open it again with open(), or call reopen() without closing. Calling reopen() is generally faster than closing and opening again.
Note: reopening encrypted arrays does not require the encryption key.
Example:
    // Open the array for reading
    tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
    array.reopen();
- Throws:
TileDBError – if the array was not already open or other error occurred.
-
inline void set_open_timestamp_start(uint64_t timestamp_start) const¶
Sets the inclusive starting timestamp when opening this array.
-
inline void set_open_timestamp_end(uint64_t timestamp_end) const¶
Sets the inclusive ending timestamp when opening this array.
-
inline uint64_t open_timestamp_start() const¶
Retrieves the inclusive starting timestamp.
-
inline uint64_t open_timestamp_end() const¶
Retrieves the inclusive ending timestamp.
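One way to picture the inclusive [start, end] open window is as a filter over timestamped writes. The sketch below is a conceptual model only. It assumes each write carries a single timestamp and that an opened array sees exactly the writes whose timestamp lies inside the window. `Fragment` and `visible_fragments` are hypothetical names, not TileDB types.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record of one write and the timestamp it was made at.
struct Fragment {
  std::string name;
  std::uint64_t timestamp;
};

// Names of the fragments whose timestamp lies in the inclusive [start, end].
inline std::vector<std::string> visible_fragments(
    const std::vector<Fragment>& all, std::uint64_t start, std::uint64_t end) {
  assert(start <= end);
  std::vector<std::string> out;
  for (const Fragment& f : all)
    if (f.timestamp >= start && f.timestamp <= end) out.push_back(f.name);
  return out;
}
```

Because both bounds are inclusive in this model, a write stamped exactly at `start` or exactly at `end` is visible.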
-
inline void set_config(const Config &config) const¶
Sets the array config.
- Pre:
The array must be closed.
-
inline void close()¶
Closes the array. The destructor calls this automatically if the underlying pointer is owned.
Example:
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ); array.close();
-
template<typename T>
inline std::vector<std::pair<std::string, std::pair<T, T>>> non_empty_domain()¶
Retrieves the non-empty domain from the array. This is the union of the non-empty domains of the array fragments.
Example:
    tiledb::Context ctx;
    tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
    // Specify the domain type (example uint32_t)
    auto non_empty = array.non_empty_domain<uint32_t>();
    std::cout << "Dimension named " << non_empty[0].first << " has cells in ["
              << non_empty[0].second.first << ", "
              << non_empty[0].second.second << "]" << std::endl;
- Template Parameters:
T – Domain datatype
- Returns:
Vector of dim names with a {lower, upper} pair. Inclusive. Empty vector if the array has no data.
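The "union of the non-empty domains of the array fragments" can be sketched per dimension as the tightest range that covers every fragment's range. This reading of "union" as an enclosing range is an assumption of the sketch, and `Range` and `covering_range` are hypothetical names. No TileDB calls are made.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Inclusive {lower, upper} range on one dimension.
using Range = std::pair<std::uint32_t, std::uint32_t>;

// Writes the tightest range covering every fragment's range into `out`.
// Returns false and leaves `out` untouched when there are no fragments.
inline bool covering_range(const std::vector<Range>& per_fragment, Range& out) {
  if (per_fragment.empty()) return false;
  Range r = per_fragment.front();
  for (const Range& f : per_fragment) {
    assert(f.first <= f.second);
    r.first = std::min(r.first, f.first);
    r.second = std::max(r.second, f.second);
  }
  out = r;
  return true;
}
```

For fragment ranges [3, 7], [10, 12] and [1, 4] the covering range is [1, 12]. The empty case mirrors the "empty vector if the array has no data" behaviour described above.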
-
template<typename T>
inline std::pair<T, T> non_empty_domain(unsigned idx)¶
Retrieves the non-empty domain from the array on the given dimension. This is the union of the non-empty domains of the array fragments.
Example:
    tiledb::Context ctx;
    tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
    // Specify the dimension type (example uint32_t)
    auto non_empty = array.non_empty_domain<uint32_t>(0);
- Template Parameters:
T – Dimension datatype
- Parameters:
idx – The dimension index.
- Returns:
The {lower, upper} pair of the non-empty domain (inclusive) on the input dimension.
-
template<typename T>
inline std::pair<T, T> non_empty_domain(const std::string &name)¶
Retrieves the non-empty domain from the array on the given dimension. This is the union of the non-empty domains of the array fragments.
Example:
    tiledb::Context ctx;
    tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
    // Specify the dimension type (example uint32_t)
    auto non_empty = array.non_empty_domain<uint32_t>("d1");
- Template Parameters:
T – Dimension datatype
- Parameters:
name – The dimension name.
- Returns:
The {lower, upper} pair of the non-empty domain (inclusive) on the input dimension.
-
inline std::pair<std::string, std::string> non_empty_domain_var(unsigned idx)¶
Retrieves the non-empty domain from the array on the given dimension. This is the union of the non-empty domains of the array fragments. Applicable only to var-sized dimensions.
Example:
    tiledb::Context ctx;
    tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
    // Bounds of a var-sized dimension are returned as strings; no type parameter is needed.
    auto non_empty = array.non_empty_domain_var(0);
- Parameters:
idx – The dimension index.
- Returns:
The {lower, upper} pair of the non-empty domain (inclusive) on the input dimension.
-
inline std::pair<std::string, std::string> non_empty_domain_var(const std::string &name)¶
Retrieves the non-empty domain from the array on the given dimension. This is the union of the non-empty domains of the array fragments. Applicable only to var-sized dimensions.
Example:
tiledb::Context ctx;
tiledb::Array array(ctx, "s3://bucket-name/array-name", TILEDB_READ);
// No dimension type is specified; the bounds are returned as std::string
auto non_empty = array.non_empty_domain_var("d1");
- Parameters:
name – The dimension name.
- Returns:
The {lower, upper} pair of the non-empty domain (inclusive) on the input dimension.
-
inline tiledb_query_type_t query_type() const¶
Returns the query type the array was opened with.
-
inline void put_metadata(const std::string &key, tiledb_datatype_t value_type, uint32_t value_num, const void *value)¶
Puts a metadata key-value item into an open array. The array must be opened in WRITE mode; otherwise the function will error out.
Note
The writes will take effect only upon closing the array.
- Parameters:
key – The key of the metadata item to be added. UTF-8 encodings are acceptable.
value_type – The datatype of the value.
value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata.
value – The metadata value in binary form.
-
inline void delete_metadata(const std::string &key)¶
Deletes a metadata key-value item from an open array. The array must be opened in WRITE mode; otherwise the function will error out.
Note
The deletion will take effect only upon closing the array.
Note
If the key does not exist, this has no effect (i.e., the function will not error out).
- Parameters:
key – The key of the metadata item to be deleted.
-
inline void get_metadata(const std::string &key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)¶
It gets a metadata key-value item from an open array. The array must be opened in READ mode, otherwise the function will error out.
Note
If the key does not exist, then value will be NULL.
- Parameters:
key – The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.
value_type – The datatype of the value.
value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.
value – The metadata value in binary form.
-
inline bool has_metadata(const std::string &key, tiledb_datatype_t *value_type)¶
Checks if key exists in metadata from an open array. The array must be opened in READ mode, otherwise the function will error out.
Note
If the key does not exist, then value_type will not be modified.
- Parameters:
key – The key of the metadata item to be checked. UTF-8 encodings are acceptable.
value_type – The datatype of the value associated with the key (if any).
- Returns:
true if the key exists, else false.
-
inline uint64_t metadata_num() const¶
Returns the number of metadata items in an open array. The array must be opened in READ mode; otherwise the function will error out.
-
inline void get_metadata_from_index(uint64_t index, std::string *key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)¶
It gets a metadata item from an open array using an index. The array must be opened in READ mode, otherwise the function will error out.
- Parameters:
index – The index used to get the metadata.
key – The metadata key.
value_type – The datatype of the value.
value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.
value – The metadata value in binary form.
Public Static Functions
-
static inline void delete_array(const Context &ctx, const std::string &uri)¶
Deletes all data written to the array with the input uri.
- Parameters:
ctx – TileDB context
uri – The Array’s URI
- Post:
This is destructive; the array may not be reopened after delete.
-
static inline void delete_fragments(const Context &ctx, const std::string &uri, uint64_t timestamp_start, uint64_t timestamp_end)¶
Deletes the fragments written between the input timestamps of an array with the input uri.
- Parameters:
ctx – TileDB context
uri – The URI of the fragments’ parent Array.
timestamp_start – The epoch start timestamp in milliseconds.
timestamp_end – The epoch end timestamp in milliseconds. Use UINT64_MAX for the current timestamp.
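Both timestamps are expressed in milliseconds since the epoch. The sketch below shows one way to derive such a value from `std::chrono`. `to_epoch_ms` is a hypothetical helper, not a TileDB function, and it assumes that `std::chrono::system_clock` counts from the Unix epoch (guaranteed since C++20, common practice before that).

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: converts a system_clock time point to milliseconds
// since the clock's epoch, the unit used by timestamp_start / timestamp_end.
inline uint64_t to_epoch_ms(std::chrono::system_clock::time_point tp) {
  const auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
      tp.time_since_epoch());
  assert(ms.count() >= 0);  // times before the epoch do not fit in uint64_t
  return static_cast<uint64_t>(ms.count());
}
```

For instance, a range covering the last hour could be built from `to_epoch_ms(now - std::chrono::hours(1))` and `to_epoch_ms(now)`, where `now` is `std::chrono::system_clock::now()`.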
-
static inline void delete_fragments_list(const Context &ctx, const std::string &uri, const char *fragment_uris[], const size_t num_fragments)¶
Deletes the fragments with the input uris on an array with the input uri.
-
static inline void consolidate(const Context &ctx, const std::string &uri, Config *const config = nullptr)¶
Consolidates the fragments of an array into a single fragment.
You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).
Example:
tiledb::Array::consolidate(ctx, "s3://bucket-name/array-name");
- Parameters:
ctx – TileDB context
uri – The URI of the TileDB array to be consolidated.
config – Configuration parameters for the consolidation.
-
static inline void consolidate(const Context &ctx, const std::string &array_uri, const char *fragment_uris[], const size_t num_fragments, Config *const config = nullptr)¶
Consolidates the fragments with the input uris into a single fragment.
You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).
Example:
tiledb::Config config;
const char* fragment_uris[2] = {
    "__1712657401931_1712657401931_285cf8a0eff4df875a04cfbea96d5c00_21",
    "__1712657401948_1712657401948_285cf8a0efdsafas6a5a04cfbesajads_21"};
tiledb::Array::consolidate(
    ctx, "s3://bucket-name/array-name", fragment_uris, 2, &config);
- Parameters:
ctx – TileDB context
array_uri – The URI of the TileDB array to be consolidated.
fragment_uris – Fragment names of the fragments to consolidate. The names can be recovered using tiledb_fragment_info_get_fragment_name_v2.
num_fragments – The number of fragments to consolidate.
config – Configuration parameters for the consolidation.
-
static inline void vacuum(const Context &ctx, const std::string &uri, Config *const config = nullptr)¶
Cleans up the array by removing, for example, fragments and array metadata that have already been consolidated. Note that this will coarsen the granularity of time traveling (see docs for more information).
Example:
tiledb::Array::vacuum(ctx, "s3://bucket-name/array-name");
- Parameters:
ctx – TileDB context
uri – The URI of the TileDB array to be vacuumed.
config – Configuration parameters for the vacuuming.
-
static inline void create(const std::string &uri, const ArraySchema &schema)¶
Creates a new TileDB array given an input schema.
Example:
tiledb::Array::create("s3://bucket-name/array-name", schema);
- Parameters:
uri – URI where array will be created.
schema – The array schema.
-
static inline ArraySchema load_schema(const Context &ctx, const std::string &uri)¶
Loads the array schema from an array.
Example:
auto schema = tiledb::Array::load_schema(ctx, "s3://bucket-name/array-name");
- Parameters:
ctx – The TileDB context.
uri – The array URI.
- Returns:
The loaded ArraySchema object.
-
static inline tiledb_encryption_type_t encryption_type(const Context &ctx, const std::string &array_uri)¶
Gets the encryption type the given array was created with.
Example:
tiledb_encryption_type_t enc_type = tiledb::Array::encryption_type(ctx, "s3://bucket-name/array-name");
- Parameters:
ctx – TileDB context
array_uri – The URI of the TileDB array.
- Returns:
The encryption type of the array.
-
static inline void consolidate_metadata(const Context &ctx, const std::string &uri, Config *const config = nullptr)¶
Consolidates the metadata of an array.
You must first finalize all queries to the array before consolidation can begin (as consolidation temporarily acquires an exclusive lock on the array).
Example:
tiledb::Array::consolidate_metadata(ctx, "s3://bucket-name/array-name");
- Parameters:
ctx – TileDB context
uri – The URI of the TileDB array whose metadata will be consolidated.
config – Configuration parameters for the consolidation.
-
static inline void upgrade_version(const Context &ctx, const std::string &array_uri, Config *const config = nullptr)¶
Upgrades an array to the latest format version.
Example:
tiledb::Array::upgrade_version(ctx, "array_name");
- Parameters:
ctx – TileDB context
array_uri – The URI of the TileDB array to be upgraded.
config – Configuration parameters for the upgrade.
Query¶
-
class Query¶
Construct and execute read/write queries on a tiledb::Array.
See examples for more usage details.
Example:
// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_dense_array", TILEDB_WRITE);
tiledb::Query query(ctx, array);
query.set_layout(TILEDB_GLOBAL_ORDER);
std::vector<int> a1_data = {1, 2, 3};
query.set_data_buffer("a1", a1_data);
query.submit();
query.finalize();
array.close();
Public Types
-
enum class Status¶
The query or query attribute status.
Values:
Public Functions
-
inline Query(const Context &ctx, const Array &array, tiledb_query_type_t type)¶
Creates a TileDB query object.
The query type (read or write) must be the same as the type used to open the array object.
The storage manager also acquires a shared lock on the array. This means multiple read and write queries to the same array can be made concurrently (in TileDB, only consolidation requires an exclusive lock for a short period of time).
Example:
// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_WRITE);
tiledb::Query query(ctx, array, TILEDB_WRITE);
- Parameters:
ctx – TileDB context
array – Open Array object
type – The TileDB query type
-
inline Query(const Context &ctx, const Array &array)¶
Creates a TileDB query object.
The query type (read or write) is inferred from the array object, which was opened with a specific query type.
The storage manager also acquires a shared lock on the array. This means multiple read and write queries to the same array can be made concurrently (in TileDB, only consolidation requires an exclusive lock for a short period of time).
Example:
// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_WRITE);
tiledb::Query query(ctx, array);
// Equivalent to:
// tiledb::Query query(ctx, array, TILEDB_WRITE);
- Parameters:
ctx – TileDB context
array – Open Array object
-
inline std::shared_ptr<tiledb_query_t> ptr() const¶
Returns a shared pointer to the C TileDB query object.
-
inline tiledb_query_type_t query_type() const¶
Returns the query type (read or write).
-
inline Query &set_layout(tiledb_layout_t layout)¶
Sets the layout of the cells to be written or read.
- Parameters:
layout – For a write query, this specifies the order of the cells provided by the user in the buffers. For a read query, this specifies the order of the cells that will be retrieved as results and stored in the user buffers. The layout can be one of the following:
TILEDB_COL_MAJOR: This means column-major order with respect to the subarray.
TILEDB_ROW_MAJOR: This means row-major order with respect to the subarray.
TILEDB_GLOBAL_ORDER: This means that cells are stored or retrieved in the array global cell order.
TILEDB_UNORDERED: This is applicable only to writes for sparse arrays, or for sparse writes to dense arrays. It specifies that the cells are unordered and, hence, TileDB must sort the cells in the global cell order prior to writing.
- Returns:
Reference to this Query
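To make the difference between the row-major and column-major layouts concrete, the sketch below enumerates the cells of a small 2D subarray in both orders. It is plain standard-library C++ and does not call TileDB. `cell_order` and `Cell` are hypothetical names introduced only for this illustration, and the global order is not modelled because it depends on the array's tiling.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Cell = std::pair<int64_t, int64_t>;  // {row, col}

// Hypothetical helper: lists the cells of the inclusive 2D subarray
// [r0, r1] x [c0, c1] in row-major or column-major order.
std::vector<Cell> cell_order(
    int64_t r0, int64_t r1, int64_t c0, int64_t c1, bool row_major) {
  assert(r0 <= r1 && c0 <= c1);
  std::vector<Cell> cells;
  if (row_major) {
    // Row-major: the column index varies fastest.
    for (int64_t r = r0; r <= r1; ++r)
      for (int64_t c = c0; c <= c1; ++c)
        cells.emplace_back(r, c);
  } else {
    // Column-major: the row index varies fastest.
    for (int64_t c = c0; c <= c1; ++c)
      for (int64_t r = r0; r <= r1; ++r)
        cells.emplace_back(r, c);
  }
  return cells;
}
```

For the subarray [1, 2] x [1, 2], row-major order visits (1,1), (1,2), (2,1), (2,2), while column-major order visits (1,1), (2,1), (1,2), (2,2). The values in a user buffer would be expected in, or returned in, the corresponding order.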
-
inline tiledb_layout_t query_layout() const¶
Returns the layout of the query.
-
inline Query &set_condition(const QueryCondition &condition)¶
Sets the read query condition.
Note that only one query condition may be set on a query at a time. This overwrites any previously set query condition. To apply more than one condition at a time, use the QueryCondition::combine API to construct a single object.
- Parameters:
condition – The query condition object.
- Returns:
Reference to this Query
-
inline bool has_results() const¶
Returns true if the query has results. Applicable only to read queries (it returns false for write queries).
-
inline Status submit()¶
Submits the query. The call blocks until the query is complete.
Note
finalize() must be invoked after all writes in global layout have finished (via repeated invocations of submit()), in order to flush any internal state. For reads, if the returned status is TILEDB_INCOMPLETE, TileDB could not fit the entire result in the user’s buffers. In this case, the user should consume the read results (if any), optionally reset the buffers with set_data_buffer(), and then resubmit the query until the status becomes TILEDB_COMPLETED. If all buffer sizes are 0 after this function returns, no useful data was read into the buffers, which implies that larger buffers are needed for the query to proceed. In this case, the user must reallocate the buffers (increasing their size), reset them with set_data_buffer(), and resubmit the query.
- Returns:
Query status
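The resubmission protocol described in the note can be sketched as a loop. The code below does not use TileDB: `MockReadQuery` is a hypothetical stand-in whose `submit()` copies as many pending results as fit into the caller's buffer, and `Status`, `result_count()` and `read_all` are likewise names invented for this illustration. It is meant only to show the control flow (consume, grow the buffer if nothing fit, resubmit), not the library's behavior in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum class Status { INCOMPLETE, COMPLETE };

// Hypothetical stand-in for a read query (not a TileDB class).
class MockReadQuery {
 public:
  explicit MockReadQuery(std::vector<int> results)
      : pending_(std::move(results)) {}

  void set_data_buffer(std::vector<int>& buf) { buf_ = &buf; }

  // Copies as many pending results as fit into the user buffer.
  Status submit() {
    assert(buf_ != nullptr);
    const std::size_t n = std::min(buf_->size(), pending_.size() - pos_);
    std::copy_n(pending_.begin() + static_cast<std::ptrdiff_t>(pos_), n,
                buf_->begin());
    pos_ += n;
    last_count_ = n;
    return pos_ == pending_.size() ? Status::COMPLETE : Status::INCOMPLETE;
  }

  // Number of elements written by the last submit().
  std::size_t result_count() const { return last_count_; }

 private:
  std::vector<int> pending_;
  std::vector<int>* buf_ = nullptr;
  std::size_t pos_ = 0;
  std::size_t last_count_ = 0;
};

// Drains the query: consume results, grow the buffer if nothing fit,
// then resubmit until the status is COMPLETE.
std::vector<int> read_all(MockReadQuery& query, std::size_t initial_capacity) {
  std::vector<int> buffer(initial_capacity);
  std::vector<int> out;
  query.set_data_buffer(buffer);
  Status status;
  do {
    status = query.submit();
    const std::size_t n = query.result_count();
    out.insert(out.end(), buffer.begin(),
               buffer.begin() + static_cast<std::ptrdiff_t>(n));
    if (status == Status::INCOMPLETE && n == 0) {
      // Nothing fit: reallocate a larger buffer and set it again.
      buffer.resize(std::max<std::size_t>(1, buffer.size() * 2));
      query.set_data_buffer(buffer);
    }
  } while (status != Status::COMPLETE);
  return out;
}
```

With a real query, the element counts would come from result_buffer_elements() and the buffers would be reset with set_data_buffer(), as the note describes.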
-
inline void finalize()¶
Flushes all internal state of a query object and finalizes the query. This is applicable only to global layout writes. It has no effect for any other query type.
-
inline void submit_and_finalize()¶
Submits and finalizes the last tile of a global order write. For remote TileDB arrays, this is optimized to use only one request to perform both the submit and finalize.
-
inline std::unordered_map<std::string, std::pair<uint64_t, uint64_t>> result_buffer_elements() const¶
Returns the number of elements in the result buffers from a read query. This is a map from the attribute name to a pair of values.
The first is the number of elements (offsets) for var-sized attributes, and the second is the number of elements in the data buffer. For fixed-sized attributes (and coordinates), the first is always 0.
For variable-sized attributes: the first value is the number of cells read, i.e. the number of offsets read for the attribute. The second value is the total number of elements in the data buffer. For example, a read query on a variable-length float attribute that reads three cells would return 3 for the first number in the pair. If the total number of floats read across the three cells was 10, then the second number in the pair would be 10.
For fixed-length attributes, the first value is always 0. The second value is the total number of elements in the data buffer. For example, a read query on a single float attribute that reads three cells would return 3 for the second value. A read query on a float attribute with cell_val_num 2 that reads three cells would return 3 * 2 = 6 for the second value.
If the query has not been submitted, an empty map is returned.
Example:
// Submit a read query.
query.submit();
auto result_el = query.result_buffer_elements();
// For fixed-sized attributes, `.second` is the number of elements
// that were read for the attribute across all cells. Note: number of
// elements and not number of bytes.
auto num_a1_elements = result_el["a1"].second;
// Coords are also fixed-sized.
auto num_coords = result_el["__coords"].second;
// In variable attributes, e.g. std::string type, need two buffers,
// one for offsets and one for cell data ("elements").
auto num_a2_offsets = result_el["a2"].first;
auto num_a2_elements = result_el["a2"].second;
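For a var-sized string attribute, the two counts are what is needed to cut the data buffer back into cells. The sketch below is plain standard-library C++. `split_var_cells` is a hypothetical helper, and it assumes that offsets are byte offsets into the data buffer (which coincide with element indices for char data) and that no extra trailing offset is present.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: rebuilds var-sized string cells from a data buffer
// and its offsets buffer, using the two counts reported for the attribute.
std::vector<std::string> split_var_cells(
    const std::string& data,
    const std::vector<uint64_t>& offsets,
    uint64_t num_offsets,     // e.g. result_el["a2"].first
    uint64_t num_elements) {  // e.g. result_el["a2"].second
  assert(num_offsets <= offsets.size() && num_elements <= data.size());
  std::vector<std::string> cells;
  for (uint64_t i = 0; i < num_offsets; ++i) {
    const uint64_t begin = offsets[i];
    // A cell ends where the next one starts; the last cell ends at the
    // number of elements actually read, not at the buffer capacity.
    const uint64_t end = (i + 1 < num_offsets) ? offsets[i + 1] : num_elements;
    cells.push_back(data.substr(begin, end - begin));
  }
  return cells;
}
```

With a data buffer starting with "aabbbc", offsets {0, 2, 5}, three offsets and six elements, the cells are "aa", "bbb" and "c". Any unused capacity at the end of either buffer is ignored.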
-
inline std::unordered_map<std::string, std::tuple<uint64_t, uint64_t, uint64_t>> result_buffer_elements_nullable() const¶
Returns the number of elements in the result buffers from a read query. This is a map from the attribute name to a tuple of values.
The first is the number of elements (offsets) for var-sized attributes, and the second is the number of elements in the data buffer. For fixed-sized attributes (and coordinates), the first is always 0. The third element is the size of the validity bytemap buffer.
For variable-sized attributes: the first value is the number of cells read, i.e. the number of offsets read for the attribute. The second value is the total number of elements in the data buffer. For example, a read query on a variable-length float attribute that reads three cells would return 3 for the first number in the tuple. If the total number of floats read across the three cells was 10, then the second number in the tuple would be 10.
For fixed-length attributes, the first value is always 0. The second value is the total number of elements in the data buffer. For example, a read query on a single float attribute that reads three cells would return 3 for the second value. A read query on a float attribute with cell_val_num 2 that reads three cells would return 3 * 2 = 6 for the second value.
If the query has not been submitted, an empty map is returned.
Example:
// Submit a read query.
query.submit();
auto result_el = query.result_buffer_elements_nullable();
// For fixed-sized attributes, the second tuple element is the number of
// elements that were read for the attribute across all cells. Note: number
// of elements and not number of bytes.
auto num_a1_elements = std::get<1>(result_el["a1"]);
// In variable attributes, e.g. std::string type, need two buffers,
// one for offsets and one for cell data ("elements").
auto num_a2_offsets = std::get<0>(result_el["a2"]);
auto num_a2_elements = std::get<1>(result_el["a2"]);
// For both fixed-size and variable-sized attributes, the third tuple
// element is the number of elements in the validity bytemap.
auto num_a1_validity_values = std::get<2>(result_el["a1"]);
auto num_a2_validity_values = std::get<2>(result_el["a2"]);
-
inline uint64_t est_result_size(const std::string &attr_name) const¶
Retrieves the estimated result size for a fixed-size attribute. This is an estimate and may not be sufficient to read all results for the requested range, in particular for sparse arrays or arrays with var-length attributes. The query status must be checked, and the query resubmitted if it is not complete.
Example:
uint64_t est_size = query.est_result_size("attr1");
- Parameters:
attr_name – The attribute name.
- Returns:
The estimated size in bytes.
-
inline std::array<uint64_t, 2> est_result_size_var(const std::string &attr_name) const¶
Retrieves the estimated result size for a variable-size attribute. This is an estimate and may not be sufficient to read all results for the requested ranges, in particular for sparse arrays or any array with var-length attributes. The query status must be checked, and the query resubmitted if it is not complete.
Example:
std::array<uint64_t, 2> est_size = query.est_result_size_var("attr1");
- Parameters:
attr_name – The attribute name.
- Returns:
An array with first element containing the estimated size of the result offsets in bytes, and second element containing the estimated size of the result values in bytes.
-
inline std::array<uint64_t, 2> est_result_size_nullable(const std::string &attr_name) const¶
Retrieves the estimated result size for a fixed-size, nullable attribute. This is an estimate and may not be sufficient to read all results for the requested ranges, in particular for sparse arrays or any array with var-length attributes. The query status must be checked, and the query resubmitted if it is not complete.
Example:
std::array<uint64_t, 2> est_size = query.est_result_size_nullable("attr1");
- Parameters:
attr_name – The attribute name.
- Returns:
An array with first element containing the estimated size of the result values in bytes, and second element containing the estimated size of the result validity values in bytes.
-
inline std::array<uint64_t, 3> est_result_size_var_nullable(const std::string &attr_name) const¶
Retrieves the estimated result size for a variable-size, nullable attribute.
Example:
std::array<uint64_t, 3> est_size = query.est_result_size_var_nullable("attr1");
- Parameters:
attr_name – The attribute name.
- Returns:
An array with first element containing the estimated size of the offset values in bytes, second element containing the estimated size of the result values in bytes, and the third element containing the estimated size of the validity values in bytes.
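The estimates are reported in bytes, whereas the buffer setters take a number of elements. The sketch below shows one possible conversion in plain standard-library C++. `BufferLengths` and `lengths_from_estimate` are hypothetical names, and rounding up is an assumption chosen so that a partial element still gets a slot.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Buffer lengths, in elements, for a var-sized nullable attribute.
struct BufferLengths {
  uint64_t offsets;   // number of uint64_t offsets
  uint64_t data;      // number of attribute values
  uint64_t validity;  // number of uint8_t validity values
};

// Hypothetical helper: converts the {offsets, data, validity} byte estimates
// into element counts. `value_size` is the byte size of one attribute value.
BufferLengths lengths_from_estimate(
    const std::array<uint64_t, 3>& est_bytes, uint64_t value_size) {
  assert(value_size > 0);
  auto ceil_div = [](uint64_t bytes, uint64_t elem) {
    return (bytes + elem - 1) / elem;  // round up to a whole element
  };
  return {ceil_div(est_bytes[0], sizeof(uint64_t)),
          ceil_div(est_bytes[1], value_size),
          ceil_div(est_bytes[2], sizeof(uint8_t))};
}
```

For a float attribute with estimates of {16, 10, 2} bytes, this gives 2 offsets, 3 values and 2 validity values. Since these are estimates, the query status still has to be checked after submission.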
-
inline uint32_t fragment_num() const¶
Returns the number of written fragments. Applicable only to WRITE queries.
-
inline std::string fragment_uri(uint32_t idx) const¶
Returns the URI of the written fragment with the input index. Applicable only to WRITE queries.
-
inline std::pair<uint64_t, uint64_t> fragment_timestamp_range(uint32_t idx) const¶
Returns the timestamp range of the written fragment with the input index. Applicable only to WRITE queries.
-
inline Query &set_subarray(const Subarray &subarray)¶
Prepare a query with the contents of a subarray.
- Parameters:
subarray – The subarray to be used to prepare the query.
-
inline Query &set_config(const Config &config)¶
Set the query config.
Setting the query config will also set the subarray configuration in order to maintain existing behavior. If you wish the subarray to have a different configuration than the query, set it after calling Query::set_config.
Setting configuration with this function overrides the following Query-level parameters only:
sm.memory_budget
sm.memory_budget_var
sm.var_offsets.mode
sm.var_offsets.extra_element
sm.var_offsets.bitsize
sm.check_coord_dups
sm.check_coord_oob
sm.check_global_order
sm.dedup_coords
-
template<typename T>
inline Query &set_data_buffer(const std::string &name, T *buff, uint64_t nelements)¶
Sets the data for a fixed/var-sized attribute/dimension.
The caller owns the buffer provided and is responsible for freeing the memory associated with it. For writes, the buffer holds values to be written which can be freed at any time after query completion. For reads, the buffer is allocated by the caller and will contain data read by the query after completion. The freeing of this memory is up to the caller once they are done referencing the read data.
Example:
tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
int data_a1[] = {0, 1, 2, 3};
tiledb::Query query(ctx, array);
query.set_data_buffer("a1", data_a1, 4);
Note
set_data_buffer(std::string, std::vector) is preferred as it is safer.
- Template Parameters:
T – Attribute/Dimension value type
- Parameters:
name – Attribute/Dimension name
buff – Buffer array pointer with elements of the attribute/dimension type.
nelements – Number of array elements
-
template<typename T>
inline Query &set_data_buffer(const std::string &name, std::vector<T> &buf)¶
Sets the data for a fixed/var-sized attribute/dimension.
Example:
tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
std::vector<int> data_a1 = {0, 1, 2, 3};
tiledb::Query query(ctx, array);
query.set_data_buffer("a1", data_a1);
- Template Parameters:
T – Attribute/Dimension value type
- Parameters:
name – Attribute/Dimension name
buf – Buffer vector with elements of the attribute/dimension type.
-
inline Query &set_data_buffer(const std::string &name, void *buff, uint64_t nelements)¶
Sets the data for a fixed/var-sized attribute/dimension.
The caller owns the buffer provided and is responsible for freeing the memory associated with it. For writes, the buffer holds values to be written which can be freed at any time after query completion. For reads, the buffer is allocated by the caller and will contain data read by the query after completion. The freeing of this memory is up to the caller once they are done referencing the read data.
Note
This unsafe version does not perform type checking; the given buffer is assumed to be the correct type, and the size of an element in the given buffer is assumed to be the size of the datatype of the attribute.
- Parameters:
name – Attribute/Dimension name
buff – Buffer array pointer with elements of the attribute type.
nelements – Number of array elements in buffer
-
inline Query &set_data_buffer(const std::string &name, std::string &data)¶
Sets the data for a fixed/var-sized attribute/dimension.
- Parameters:
name – Attribute/Dimension name
data – Pre-allocated string buffer.
-
inline Query &set_offsets_buffer(const std::string &attr, uint64_t *offsets, uint64_t offset_nelements)¶
Sets the offset buffer for a var-sized attribute/dimension.
The caller owns the buffer provided and is responsible for freeing the memory associated with it. For writes, the buffer holds offsets to be written which can be freed at any time after query completion. For reads, the buffer is allocated by the caller and will contain offset data read by the query after completion. The freeing of this memory is up to the caller once they are done referencing the read data.
Example:
tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
uint64_t offsets_a1[] = {0, 8};
tiledb::Query query(ctx, array);
query.set_offsets_buffer("a1", offsets_a1, 2);
Note
set_offsets_buffer(std::string, std::vector) is preferred as it is safer.
- Parameters:
attr – Attribute/Dimension name
offsets – Offsets array pointer where a new element begins in the data buffer.
offset_nelements – Number of elements in offsets buffer.
-
inline Query &set_offsets_buffer(const std::string &name, std::vector<uint64_t> &offsets)¶
Sets the offset buffer for a var-sized attribute/dimension.
Example:
tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
std::vector<uint64_t> offsets_a1 = {0, 8};
tiledb::Query query(ctx, array);
query.set_offsets_buffer("a1", offsets_a1);
- Parameters:
name – Attribute/Dimension name
offsets – Offsets where a new element begins in the data buffer.
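Each offset marks where a cell begins in the data buffer, so for writes the two buffers are typically built together. The sketch below packs string cells in plain standard-library C++. `pack_var_cells` is a hypothetical helper, and it assumes offsets are expressed in bytes with one offset per cell.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: packs var-sized string cells into one contiguous
// data buffer plus a buffer holding the starting offset of each cell.
std::pair<std::string, std::vector<uint64_t>> pack_var_cells(
    const std::vector<std::string>& cells) {
  std::string data;
  std::vector<uint64_t> offsets;
  offsets.reserve(cells.size());
  for (const auto& cell : cells) {
    offsets.push_back(data.size());  // this cell starts at the current end
    data += cell;
  }
  assert(offsets.size() == cells.size());  // one offset per cell
  return {data, offsets};
}
```

Packing the cells "abcdefgh" and "xy" produces the data "abcdefghxy" and the offsets {0, 8}, the same offsets used in the example above.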
-
inline Query &set_validity_buffer(const std::string &attr, uint8_t *validity_bytemap, uint64_t validity_bytemap_nelements)¶
Sets the validity buffer for a nullable attribute/dimension.
The caller owns the buffer provided and is responsible for freeing the memory associated with it. For writes, the buffer holds validity values to be written which can be freed at any time after query completion. For reads, the buffer is allocated by the caller and will contain the validity map read by the query after completion. The freeing of this memory is up to the caller once they are done referencing the read data.
-
inline Query &set_validity_buffer(const std::string &name, std::vector<uint8_t> &validity_bytemap)¶
Sets the validity buffer for a nullable attribute/dimension.
Example:
tiledb::Context ctx;
tiledb::Array array(ctx, array_name, TILEDB_WRITE);
std::vector<uint8_t> validity_bytemap = {1, 1, 0, 1};
tiledb::Query query(ctx, array);
query.set_validity_buffer("a1", validity_bytemap);
- Parameters:
name – Attribute name
validity_bytemap – Buffer vector with elements of the attribute validity values.
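The validity bytemap runs parallel to the data buffer, with one byte per cell. The sketch below splits nullable values into the two buffers using plain standard-library C++. `NullableInt` and `split_nullable` are hypothetical names, and the sketch assumes that 1 marks a valid cell and 0 a null cell, as the {1, 1, 0, 1} example suggests. Null cells receive a placeholder value of 0 in the data buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical representation of one nullable cell.
struct NullableInt {
  bool valid;
  int value;
};

// Hypothetical helper: splits nullable cells into a data buffer and a
// validity bytemap of equal length.
std::pair<std::vector<int>, std::vector<uint8_t>> split_nullable(
    const std::vector<NullableInt>& cells) {
  std::vector<int> data;
  std::vector<uint8_t> validity;
  for (const auto& cell : cells) {
    data.push_back(cell.valid ? cell.value : 0);  // placeholder for nulls
    validity.push_back(cell.valid ? 1 : 0);
  }
  assert(data.size() == validity.size());  // one validity byte per cell
  return {data, validity};
}
```

The two resulting vectors could then be passed to set_data_buffer() and set_validity_buffer() for the same attribute.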
-
inline Query &get_data_buffer(const std::string &name, void **data, uint64_t *data_nelements, uint64_t *element_size)¶
Retrieves the data buffer of a fixed/var-sized attribute/dimension.
- Parameters:
name – Attribute/dimension name
data – Buffer array pointer with elements of the attribute type.
data_nelements – Number of array elements.
element_size – Size of array elements (in bytes).
-
inline Query &get_offsets_buffer(const std::string &name, uint64_t **offsets, uint64_t *offsets_nelements)¶
Retrieves the offset buffer for a var-sized attribute/dimension.
- Parameters:
name – Attribute/dimension name
offsets – Offsets array pointer with elements of uint64_t type.
offsets_nelements – Number of array elements.
-
inline Query &get_validity_buffer(const std::string &name, uint8_t **validity_bytemap, uint64_t *validity_bytemap_nelements)¶
Retrieves the validity buffer for a nullable attribute/dimension.
- Parameters:
name – Attribute name
validity_bytemap – Buffer array pointer with elements of the attribute validity values.
validity_bytemap_nelements – Number of validity bytemap elements.
-
inline std::string stats()¶
Returns a JSON-formatted string of the stats.
Public Static Functions
-
static inline Status to_status(const tiledb_query_status_t &status)¶
Converts the TileDB C query status to a C++ query status.
-
static inline std::string to_str(tiledb_query_type_t type)¶
Converts the TileDB C query type to a string representation.
QueryCondition¶
-
class QueryCondition¶
Public Functions
-
inline QueryCondition(const Context &ctx)¶
Creates a TileDB query condition object.
- Parameters:
ctx – TileDB context.
-
QueryCondition(const QueryCondition&) = default¶
Copy constructor.
-
QueryCondition(QueryCondition&&) = default¶
Move constructor.
-
~QueryCondition() = default¶
Destructor.
-
inline QueryCondition(const Context &ctx, tiledb_query_condition_t *const qc)¶
Constructs an instance directly from a C-API query condition object.
- Parameters:
ctx – The TileDB context.
qc – The C-API query condition object.
-
QueryCondition &operator=(const QueryCondition&) = default¶
Copy-assignment operator.
-
QueryCondition &operator=(QueryCondition&&) = default¶
Move-assignment operator.
-
inline void init(const std::string &attribute_name, const void *condition_value, uint64_t condition_value_size, tiledb_query_condition_op_t op)¶
Initialize a TileDB query condition object.
Example:
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_READ);
tiledb::Query query(ctx, array, TILEDB_READ);
int cmp_value = 5;
tiledb::QueryCondition qc(ctx);
qc.init("a1", &cmp_value, sizeof(int), TILEDB_LT);
query.set_condition(qc);
- Parameters:
attribute_name – The name of the attribute to compare against.
condition_value – The fixed value to compare against.
condition_value_size – The byte size of condition_value.
op – The comparison operation between each cell value and condition_value.
-
inline void init(const std::string &attribute_name, const std::string &condition_value, tiledb_query_condition_op_t op)¶
Initializes a TileDB query condition object.
Example:
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_READ);
tiledb::Query query(ctx, array, TILEDB_READ);
std::string cmp_value = "abc";
tiledb::QueryCondition qc(ctx);
qc.init("a1", cmp_value, TILEDB_LT);
query.set_condition(qc);
- Parameters:
attribute_name – The name of the attribute to compare against.
condition_value – The fixed value to compare against.
op – The comparison operation between each cell value and condition_value.
-
inline std::shared_ptr<tiledb_query_condition_t> ptr() const¶
Returns a shared pointer to the C TileDB query condition object.
-
inline QueryCondition combine(const QueryCondition &rhs, tiledb_query_condition_combination_op_t combination_op) const¶
Combines this instance with another instance to form a multi-clause condition object.
Example:
int qc1_cmp_value = 10;
tiledb::QueryCondition qc1(ctx);
qc1.init("a1", &qc1_cmp_value, sizeof(int), TILEDB_LT);
int qc2_cmp_value = 3;
tiledb::QueryCondition qc2(ctx);
qc2.init("a1", &qc2_cmp_value, sizeof(int), TILEDB_GE);
tiledb::QueryCondition qc3 = qc1.combine(qc2, TILEDB_AND);
query.set_condition(qc3);
- Parameters:
rhs – The right-hand-side query condition object.
combination_op – The logical combination operator that combines this instance with rhs.
-
inline QueryCondition negate() const¶
Return a query condition representing a negation of this query condition. Currently this is performed by applying De Morgan’s theorem recursively to the query condition’s internal representation.
Example:
int qc1_cmp_value = 10;
tiledb::QueryCondition qc1(ctx);
qc1.init("a1", &qc1_cmp_value, sizeof(int), TILEDB_LT);
tiledb::QueryCondition qc2 = qc1.negate();
query.set_condition(qc2);
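The recursive use of De Morgan's theorem can be illustrated on a toy condition tree. The sketch below is plain standard-library C++ and is not TileDB's internal representation: `Cond`, `leaf`, `node`, `eval_cond` and `negate_cond` are hypothetical names, and only the operators LT, GE, AND and OR are modelled.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy condition tree over a single int value: leaves compare against a
// constant, inner nodes combine two sub-conditions with AND or OR.
struct Cond {
  enum Kind { LT, GE, AND, OR } kind;
  int value;                       // used by LT / GE leaves
  std::shared_ptr<Cond> lhs, rhs;  // used by AND / OR nodes
};
using CondPtr = std::shared_ptr<Cond>;

CondPtr leaf(Cond::Kind k, int v) {
  return std::make_shared<Cond>(Cond{k, v, nullptr, nullptr});
}

CondPtr node(Cond::Kind k, CondPtr l, CondPtr r) {
  return std::make_shared<Cond>(Cond{k, 0, std::move(l), std::move(r)});
}

bool eval_cond(const CondPtr& c, int x) {
  assert(c != nullptr);
  switch (c->kind) {
    case Cond::LT:  return x < c->value;
    case Cond::GE:  return x >= c->value;
    case Cond::AND: return eval_cond(c->lhs, x) && eval_cond(c->rhs, x);
    case Cond::OR:  return eval_cond(c->lhs, x) || eval_cond(c->rhs, x);
  }
  return false;  // not reached for valid kinds
}

// De Morgan: !(a && b) == (!a || !b) and !(a || b) == (!a && !b).
// Leaves are negated by swapping the comparison operator.
CondPtr negate_cond(const CondPtr& c) {
  assert(c != nullptr);
  switch (c->kind) {
    case Cond::LT:  return leaf(Cond::GE, c->value);
    case Cond::GE:  return leaf(Cond::LT, c->value);
    case Cond::AND: return node(Cond::OR, negate_cond(c->lhs), negate_cond(c->rhs));
    case Cond::OR:  return node(Cond::AND, negate_cond(c->lhs), negate_cond(c->rhs));
  }
  return nullptr;  // not reached for valid kinds
}
```

Negating (x < 10) AND (x >= 3) in this model yields (x >= 10) OR (x < 3), which evaluates to the opposite truth value for every x.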
Public Static Functions
-
static inline QueryCondition create(const Context &ctx, const std::string &attribute_name, const std::string &value, tiledb_query_condition_op_t op)¶
Factory function for creating a new query condition with a string datatype.
Example:
tiledb::Context ctx; auto a1 = tiledb::QueryCondition::create(ctx, "a1", "foo", TILEDB_LE);
- Parameters:
ctx – The TileDB context.
attribute_name – The attribute name.
value – The value to compare against.
op – The comparison operator.
- Returns:
A new QueryCondition object.
-
template<typename T>
static inline QueryCondition create(const Context &ctx, const std::string &attribute_name, T value, tiledb_query_condition_op_t op)¶
Factory function for creating a new query condition with datatype T.
Example:
tiledb::Context ctx; auto a1 = tiledb::QueryCondition::create<int>(ctx, "a1", 5, TILEDB_LE); auto a2 = tiledb::QueryCondition::create<float>(ctx, "a3", 3.5, TILEDB_GT); auto a3 = tiledb::QueryCondition::create<double>(ctx, "a4", 10.0, TILEDB_LT);
- Template Parameters:
T – Datatype of the attribute. Can either be arithmetic type or string.
- Parameters:
ctx – The TileDB context.
attribute_name – The attribute name.
value – The value to compare against.
op – The comparison operator.
- Returns:
A new QueryCondition object.
Subarray¶
-
class Subarray¶
Constructs and manipulates a subarray, possibly composed of multiple ranges, for optional use with Query operations.
See examples for more usage details.
Example:
// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_dense_array", TILEDB_WRITE);
tiledb::Query query(ctx, array);
std::vector<int> a1_data = {1, 2, 3};
query.set_data_buffer("a1", a1_data);
// The layout is set on the query, not on the subarray
query.set_layout(TILEDB_GLOBAL_ORDER);
tiledb::Subarray subarray(ctx, array);
std::vector<int32_t> subarray_indices = {1, 2};
subarray.add_range(0, subarray_indices[0], subarray_indices[1]);
query.set_subarray(subarray);
query.submit();
query.finalize();
array.close();
Public Functions
-
inline Subarray(const tiledb::Context &ctx, const tiledb::Array &array, bool coalesce_ranges = true)¶
Creates a TileDB Subarray object.
Example:
// Open the array for writing
tiledb::Context ctx;
tiledb::Array array(ctx, "my_array", TILEDB_WRITE);
tiledb::Subarray subarray(ctx, array);
- Parameters:
ctx – TileDB context
array – Open Array object
coalesce_ranges – When enabled, ranges will attempt to coalesce with existing ranges as they are added.
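As a rough illustration of what coalescing can mean for integer ranges, the sketch below merges a newly added range into the previously added one when the two are contiguous. The adjacency rule and the `add_range_sketch` helper are assumptions made for illustration; the reference above does not spell out the exact rule TileDB applies.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Range = std::pair<int64_t, int64_t>;  // inclusive [start, end]

// Hypothetical helper: append a range. When `coalesce` is true and the new
// range starts exactly one past the end of the last range, extend that range
// instead of storing a separate one.
void add_range_sketch(std::vector<Range>& ranges, Range r, bool coalesce) {
  if (coalesce && !ranges.empty() && ranges.back().second + 1 == r.first) {
    ranges.back().second = r.second;
    return;
  }
  ranges.push_back(r);
}
```

Under this assumption, adding [1, 2] and then [3, 5] with coalescing enabled leaves a single range [1, 5], whereas with coalescing disabled both ranges are kept, which is what range_num would then report.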
-
inline Subarray &set_coalesce_ranges(bool coalesce_ranges)¶
Set the coalesce_ranges flag for the subarray.
-
inline Subarray &replace_subarray_data(tiledb_subarray_t *capi_subarray)¶
Replaces this Subarray’s underlying shared_ptr so that it references the passed C API subarray.
- Parameters:
capi_subarray – The C API subarray to be referenced by this C++ API Subarray.
-
template<class T>
inline Subarray &add_range(uint32_t dim_idx, T start, T end, T stride = 0)¶
Adds a 1D range along a subarray dimension index, in the form (start, end, stride). The datatype of the range must be the same as the dimension datatype.
Example:
// Set a 1D range on dimension 0, assuming the domain type is int64.
int64_t start = 10;
int64_t end = 20;
// Stride is optional
subarray.add_range(0, start, end);
- Template Parameters:
T – The dimension datatype.
- Parameters:
dim_idx – The index of the dimension to add the range to.
start – The range start to add.
end – The range end to add.
stride – The range stride to add.
- Returns:
Reference to this Subarray.
-
template<class T>
inline Subarray &add_range(const std::string &dim_name, T start, T end, T stride = 0)¶
Adds a 1D range along a subarray dimension, specified by its name, in the form (start, end, stride). The datatype of the range must be the same as the dimension datatype.
Example:
// Set a 1D range on dimension "rows", assuming the domain type is int64.
int64_t start = 10;
int64_t end = 20;
const std::string dim_name = "rows";
// Stride is optional
subarray.add_range(dim_name, start, end);
- Template Parameters:
T – The dimension datatype.
- Parameters:
dim_name – The name of the dimension to add the range to.
start – The range start to add.
end – The range end to add.
stride – The range stride to add.
- Returns:
Reference to this Subarray.
-
inline Subarray &add_range(uint32_t dim_idx, const std::string &start, const std::string &end)¶
Adds a 1D string range along a subarray dimension index, in the form (start, end). Applicable only to variable-sized dimensions.
Example:
// Set a 1D range on dimension 0, assuming it is a variable-sized string dimension.
std::string start = "a";
std::string end = "c";
subarray.add_range(0, start, end);
- Parameters:
dim_idx – The index of the dimension to add the range to.
start – The range start to add.
end – The range end to add.
- Returns:
Reference to this Subarray.
-
inline Subarray &add_range(const std::string &dim_name, const std::string &start, const std::string &end)¶
Adds a 1D string range along a subarray dimension name, in the form (start, end). Applicable only to variable-sized dimensions.
Example:
// Set a 1D range on dimension "rows", assuming it is a variable-sized string dimension.
std::string start = "a";
std::string end = "c";
const std::string dim_name = "rows";
subarray.add_range(dim_name, start, end);
- Parameters:
dim_name – The name of the dimension to add the range to.
start – The range start to add.
end – The range end to add.
- Returns:
Reference to this Subarray.
-
template<typename T = uint64_t>
inline Subarray &set_subarray(const T *pairs, uint64_t size)¶
Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive. For the case of writes, this is meaningful only for dense arrays, and specifically dense writes.
Example:
tiledb::Context ctx; tiledb::Array array(ctx, array_name, TILEDB_READ); int subarray_vals[] = {0, 3, 0, 3}; Subarray subarray(ctx, array); subarray.set_subarray(subarray_vals, 4);
Note
set_subarray(std::vector<T>) is preferred as it is safer.
Note
The number of pairs passed should equal the number of dimensions of the array associated with the subarray; that is, the number of elements in subarray_vals should equal twice the number of dimensions.
- Template Parameters:
T – Type of array domain.
- Parameters:
pairs – Subarray pointer defined as an array of [start, stop] values per dimension.
size – The number of subarray elements.
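The flat layout expected by the pointer overload (two values per dimension, in dimension order) can be made concrete with a small standalone helper. `split_pairs` is a hypothetical function written for this illustration and is not part of the TileDB API; it only demonstrates the size rule stated in the note above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: split a flat {start0, stop0, start1, stop1, ...}
// buffer into one inclusive [start, stop] pair per dimension.
template <typename T>
std::vector<std::array<T, 2>> split_pairs(const T* pairs, std::size_t size,
                                          std::size_t ndim) {
  // The buffer must hold exactly two values for every dimension.
  if (size != 2 * ndim)
    throw std::invalid_argument("expected two values per dimension");
  std::vector<std::array<T, 2>> out;
  out.reserve(ndim);
  for (std::size_t d = 0; d < ndim; ++d) {
    std::array<T, 2> p = {{pairs[2 * d], pairs[2 * d + 1]}};
    out.push_back(p);
  }
  return out;
}
```

For a 2D array, `int vals[] = {0, 3, 1, 4}` yields the pairs [0, 3] for the first dimension and [1, 4] for the second; passing a size of 3 is rejected.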
-
inline Subarray &set_config(const Config &config)¶
Set the subarray config.
Setting configuration with this function overrides the following Subarray-level parameters only:
sm.read_range_oob
-
template<typename Vec>
inline Subarray &set_subarray(const Vec &pairs)¶
Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive. For the case of writes, this is meaningful only for dense arrays, and specifically dense writes.
Example:
tiledb::Context ctx; tiledb::Array array(ctx, array_name, TILEDB_READ); std::vector<int> subarray_vals = {0, 3, 0, 3}; Subarray subarray(ctx, array); subarray.set_subarray(subarray_vals);
- Template Parameters:
Vec – Vector datatype. Should always be a vector of the domain type.
- Parameters:
pairs – The subarray defined as a vector of [start, stop] coordinates per dimension.
-
template<typename T = uint64_t>
inline Subarray &set_subarray(const std::initializer_list<T> &l)¶
Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive. For the case of writes, this is meaningful only for dense arrays, and specifically dense writes.
Example:
tiledb::Context ctx; tiledb::Array array(ctx, array_name, TILEDB_READ); Subarray subarray(ctx, array); subarray.set_subarray({0, 3, 0, 3});
- Template Parameters:
T – Type of array domain.
- Parameters:
l – List of [start, stop] coordinates per dimension.
-
template<typename T = uint64_t>
inline Subarray &set_subarray(const std::vector<std::array<T, 2>> &pairs)¶
Sets a subarray, defined in the order dimensions were added. Coordinates are inclusive.
Note
set_subarray(std::vector) is preferred and avoids an extra copy.
- Template Parameters:
T – Type of array domain.
- Parameters:
pairs – The subarray defined as pairs of [start, stop] per dimension.
-
inline uint64_t range_num(unsigned dim_idx) const¶
Retrieves the number of ranges for a given dimension index.
Example:
unsigned dim_idx = 0; uint64_t range_num = subarray.range_num(dim_idx);
- Parameters:
dim_idx – The dimension index.
- Returns:
The number of ranges.
-
inline uint64_t range_num(const std::string &dim_name) const¶
Retrieves the number of ranges for a given dimension name.
Example:
std::string dim_name = "rows";
uint64_t range_num = subarray.range_num(dim_name);
- Parameters:
dim_name – The dimension name.
- Returns:
The number of ranges.
-
template<class T>
inline std::array<T, 3> range(unsigned dim_idx, uint64_t range_idx)¶
Retrieves a range for a given dimension index and range id. The template datatype must be the same as that of the underlying array.
Example:
unsigned dim_idx = 0; unsigned range_idx = 0; auto range = subarray.range<int32_t>(dim_idx, range_idx);
- Template Parameters:
T – The dimension datatype.
- Parameters:
dim_idx – The dimension index.
range_idx – The range index.
- Returns:
A triplet of the form (start, end, stride).
-
template<class T>
inline std::array<T, 3> range(const std::string &dim_name, uint64_t range_idx)¶
Retrieves a range for a given dimension name and range id. The template datatype must be the same as that of the underlying array.
Example:
std::string dim_name = "rows";
unsigned range_idx = 0;
auto range = subarray.range<int32_t>(dim_name, range_idx);
- Template Parameters:
T – The dimension datatype.
- Parameters:
dim_name – The dimension name.
range_idx – The range index.
- Returns:
A triplet of the form (start, end, stride).
-
inline std::array<std::string, 2> range(unsigned dim_idx, uint64_t range_idx)¶
Retrieves a range for a given variable length string dimension index and range id.
Example:
unsigned dim_idx = 0; unsigned range_idx = 0; std::array<std::string, 2> range = subarray.range(dim_idx, range_idx);
- Parameters:
dim_idx – The dimension index.
range_idx – The range index.
- Returns:
A pair of the form (start, end).
-
inline std::array<std::string, 2> range(const std::string &dim_name, uint64_t range_idx)¶
Retrieves a range for a given variable length string dimension name and range id.
Example:
std::string dim_name = "rows";
unsigned range_idx = 0;
std::array<std::string, 2> range = subarray.range(dim_name, range_idx);
- Parameters:
dim_name – The dimension name.
range_idx – The range index.
- Returns:
A pair of the form (start, end).
-
inline std::shared_ptr<tiledb_subarray_t> ptr() const¶
Returns the C TileDB subarray object.
Filter¶
-
class Filter¶
Represents a filter. A filter is used to transform attribute data e.g. with compression, delta encoding, etc.
Example:
tiledb::Context ctx; tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD); int level = 5; f.set_option(TILEDB_COMPRESSION_LEVEL, &level);
Public Functions
-
inline Filter(const Context &ctx, tiledb_filter_type_t filter_type)¶
Creates a Filter of the given type.
Example:
tiledb::Context ctx; tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
- Parameters:
ctx – TileDB context
filter_type – Enumerated type of filter
-
inline Filter(const Context &ctx, tiledb_filter_t *filter)¶
Creates a Filter with the input C object.
- Parameters:
ctx – TileDB context
filter – C API filter object
-
inline std::shared_ptr<tiledb_filter_t> ptr() const¶
Returns a shared pointer to the C TileDB filter object.
-
template<typename T, typename std::enable_if_t<!std::is_pointer_v<T>, int> = 0>
inline Filter &set_option(tiledb_filter_option_t option, T value)¶
Sets an option on the filter. Options are filter dependent; this function throws an error if the given option is not valid for the given filter.
Example:
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD); f.set_option(TILEDB_COMPRESSION_LEVEL, 5);
- Template Parameters:
T – Type of value of option to set.
- Parameters:
option – Enumerated option to set.
value – Value of option to set.
- Throws:
TileDBError – if the option cannot be set on the filter.
std::invalid_argument – if the option value is the wrong type.
- Returns:
Reference to this Filter
-
inline Filter &set_option(tiledb_filter_option_t option, const void *value)¶
Sets an option on the filter. Options are filter dependent; this function throws an error if the given option is not valid for the given filter.
This version of set_option performs no type checks.
Example:
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD); int level = 5; f.set_option(TILEDB_COMPRESSION_LEVEL, &level);
Note
set_option<T>(option, T value) is preferred as it is safer.
- Parameters:
option – Enumerated option to set.
value – Value of option to set.
- Throws:
TileDBError – if the option cannot be set on the filter.
- Returns:
Reference to this Filter
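The two set_option overloads coexist because the typed template is disabled for pointer arguments, so a pointer falls through to the untyped `const void*` version. The sketch below reproduces only that overload-selection mechanism on a hypothetical `OptionSink` type; it uses the C++11 spelling `std::enable_if<...>::type`, which is equivalent to the `std::enable_if_t` / `std::is_pointer_v` form shown in the signature above, and it performs none of TileDB's actual type checks.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a filter; it only records which overload ran.
struct OptionSink {
  std::string last;

  // Typed overload: participates in overload resolution only when T is not
  // a pointer type.
  template <typename T,
            typename std::enable_if<!std::is_pointer<T>::value, int>::type = 0>
  OptionSink& set_option(int /*option*/, T /*value*/) {
    last = "typed";
    return *this;
  }

  // Untyped overload: accepts any object pointer and performs no checks.
  OptionSink& set_option(int /*option*/, const void* /*value*/) {
    last = "untyped";
    return *this;
  }
};
```

Calling `set_option(opt, level)` with an `int` selects the typed overload, while `set_option(opt, &level)` selects the untyped one, mirroring the two usage examples above.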
-
template<typename T>
inline T get_option(tiledb_filter_option_t option)¶
Gets an option value from the filter.
Example:
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD);
int32_t level = f.get_option<int32_t>(TILEDB_COMPRESSION_LEVEL);
// level == -1 (the default compression level)
- Template Parameters:
T – Type of option value to get.
- Parameters:
option – Enumerated option to get.
- Throws:
TileDBError – if the option cannot be retrieved from the filter.
std::invalid_argument – if the option value is the wrong type.
- Returns:
The option value.
-
template<typename T, typename std::enable_if<std::is_arithmetic_v<T>>::type* = nullptr>
inline void get_option(tiledb_filter_option_t option, T *value)¶
Gets an option value from the filter.
Example:
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD); int32_t level; f.get_option(TILEDB_COMPRESSION_LEVEL, &level); // level == -1 (the default compression level)
- Template Parameters:
T – Type of option value to get.
- Parameters:
option – Enumerated option to get.
value – Buffer that option value will be written to.
- Throws:
TileDBError – if the option cannot be retrieved from the filter.
std::invalid_argument – if the option value is the wrong type.
-
inline void get_option(tiledb_filter_option_t option, void *value)¶
Gets an option value from the filter.
This version of get_option performs no type checks.
Example:
tiledb::Filter f(ctx, TILEDB_FILTER_ZSTD); int32_t level; f.get_option(TILEDB_COMPRESSION_LEVEL, &level); // level == -1 (the default compression level)
Note
The buffer pointed to by value must be large enough to hold the option value.
Note
T value = get_option<T>(option) is preferred as it is safer.
- Parameters:
option – Enumerated option to get.
value – Buffer that option value will be written to.
- Throws:
TileDBError – if the option cannot be retrieved from the filter.
-
inline tiledb_filter_type_t filter_type() const¶
Gets the filter type of this filter.
Public Static Functions
-
static inline std::string to_str(tiledb_filter_type_t type)¶
Returns the input type in string format.
Filter List¶
-
class FilterList¶
Represents an ordered list of Filters used to transform attribute data.
Example:
tiledb::Context ctx; tiledb::FilterList filter_list(ctx); filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE}) .add_filter({ctx, TILEDB_FILTER_BZIP2});
Public Functions
-
inline FilterList(const Context &ctx)¶
Construct a FilterList.
Example:
tiledb::Context ctx; tiledb::FilterList filter_list(ctx);
- Parameters:
ctx – TileDB context
-
inline FilterList(const Context &ctx, tiledb_filter_list_t *filter_list)¶
Creates a FilterList with the input C object.
- Parameters:
ctx – TileDB context
filter_list – C API filter list object
-
inline std::shared_ptr<tiledb_filter_list_t> ptr() const¶
Returns a shared pointer to the C TileDB filter list object.
-
inline FilterList &add_filter(const Filter &filter)¶
Appends a filter to a filter list. Data is processed through each filter in the order the filters were added.
Example:
tiledb::FilterList filter_list(ctx); filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE}) .add_filter({ctx, TILEDB_FILTER_BZIP2});
- Parameters:
filter – The filter to add
- Returns:
Reference to this FilterList
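Because data passes through the filters in insertion order, the order in which filters are added matters. The following standalone sketch models a filter list as a vector of reversible stages; the `Stage` type, the toy delta and XOR transforms, and the choice to undo the stages in reverse order when decoding are all assumptions for illustration, not TileDB's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical reversible stage standing in for a filter.
struct Stage {
  std::function<Bytes(const Bytes&)> forward;
  std::function<Bytes(const Bytes&)> inverse;
};

// Delta-encode: store each byte as the difference from its predecessor.
Bytes delta(const Bytes& in) {
  Bytes out(in.size());
  uint8_t prev = 0;
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = static_cast<uint8_t>(in[i] - prev);
    prev = in[i];
  }
  return out;
}

// Undo delta encoding with a running sum.
Bytes undelta(const Bytes& in) {
  Bytes out(in.size());
  uint8_t prev = 0;
  for (std::size_t i = 0; i < in.size(); ++i) {
    prev = static_cast<uint8_t>(prev + in[i]);
    out[i] = prev;
  }
  return out;
}

// XOR mask: a trivially self-inverse second stage.
Bytes xor_mask(const Bytes& in) {
  Bytes out(in);
  for (auto& b : out) b ^= 0x5A;
  return out;
}

// Apply stages in the order they were added.
Bytes apply_forward(const std::vector<Stage>& stages, Bytes data) {
  for (const auto& s : stages) data = s.forward(data);
  return data;
}

// Undo stages in reverse order (assumed decode path).
Bytes apply_inverse(const std::vector<Stage>& stages, Bytes data) {
  for (auto it = stages.rbegin(); it != stages.rend(); ++it)
    data = it->inverse(data);
  return data;
}
```

Pushing `Stage{delta, undelta}` and then `Stage{xor_mask, xor_mask}` gives a pipeline whose forward pass followed by the inverse pass returns the original bytes.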
-
inline Filter filter(uint32_t filter_index) const¶
Returns a copy of the Filter in this list at the given index.
Example:
tiledb::FilterList filter_list(ctx); filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE}) .add_filter({ctx, TILEDB_FILTER_BZIP2}); auto f = filter_list.filter(1); // f.filter_type() == TILEDB_FILTER_BZIP2
- Parameters:
filter_index – Index of filter to get
- Throws:
TileDBError – if the index is out of range
- Returns:
A copy of the Filter at the given index.
-
inline uint32_t max_chunk_size() const¶
Gets the maximum tile chunk size for the filter list.
- Returns:
Maximum tile chunk size
-
inline uint32_t nfilters() const¶
Returns the number of filters in this filter list.
Example:
tiledb::FilterList filter_list(ctx); filter_list.add_filter({ctx, TILEDB_FILTER_BYTESHUFFLE}) .add_filter({ctx, TILEDB_FILTER_BZIP2}); uint32_t n = filter_list.nfilters(); // n == 2
- Returns:
The number of filters in this filter list.
-
inline FilterList &set_max_chunk_size(uint32_t max_chunk_size)¶
Sets the maximum tile chunk size for the filter list.
- Parameters:
max_chunk_size – Maximum tile chunk size to set
- Returns:
Reference to this FilterList
Group¶
Object Management¶
-
class Object¶
Represents a TileDB object: array, group, key-value (map), or none (invalid).
Public Types
Public Functions
-
inline std::string to_str() const¶
Returns a string representation of the object, including its type and URI.
-
inline std::string uri() const¶
Returns the object URI.
-
inline std::optional<std::string> name() const¶
Returns the object’s optional name.
Public Static Functions
-
static inline Object object(const Context &ctx, const std::string &uri)¶
Gets an Object object that encapsulates the object type of the given path.
- Parameters:
ctx – The TileDB context
uri – The path to the object.
- Returns:
An object that contains the type along with the URI.
-
class ObjectIter¶
Enables listing TileDB objects in a directory or recursively walking an entire directory tree.
Example:
// List the TileDB objects in an S3 bucket.
tiledb::Context ctx;
tiledb::ObjectIter obj_it(ctx, "s3://bucket-name");
for (auto it = obj_it.begin(), ite = obj_it.end(); it != ite; ++it) {
  const tiledb::Object &obj = *it;
  std::cout << obj << std::endl;
}
Public Functions
-
inline explicit ObjectIter(Context &ctx, const std::string &root = ".")¶
Creates an object iterator. Unless set_recursive is invoked, this iterator will iterate only over the children of root. It will also retrieve only TileDB-related objects.
Example:
// List the TileDB objects in an S3 bucket.
tiledb::Context ctx;
tiledb::ObjectIter obj_it(ctx, "s3://bucket-name");
for (auto it = obj_it.begin(), ite = obj_it.end(); it != ite; ++it) {
  const tiledb::Object &obj = *it;
  std::cout << obj << std::endl;
}
- Parameters:
ctx – The TileDB context.
root – The root directory where the iteration will begin.
-
inline void set_iter_policy(bool group, bool array)¶
Determines whether group and array objects will be iterated on during the walk. The default (if the function is not invoked) is true for both object types.
- Parameters:
group – If true, groups will be considered.
array – If true, arrays will be considered.
-
inline void set_recursive(tiledb_walk_order_t walk_order = TILEDB_PREORDER)¶
Specifies that the iteration will be over all the directories in the tree rooted at root_.
- Parameters:
walk_order – The walk order.
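The difference between walk orders can be shown on an in-memory tree. The `Dir` struct, the `WalkOrder` enum, and `walk` below are hypothetical stand-ins (TileDB's own enum values are TILEDB_PREORDER and TILEDB_POSTORDER); the sketch assumes the usual meaning of the terms, with a parent reported before its children in preorder and after them in postorder.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory directory tree standing in for a storage backend.
struct Dir {
  std::string name;
  std::vector<Dir> children;
};

enum class WalkOrder { Preorder, Postorder };

// Visit everything below `d`, recording names in the requested order.
void walk(const Dir& d, WalkOrder order, std::vector<std::string>& out) {
  for (const Dir& child : d.children) {
    if (order == WalkOrder::Preorder) out.push_back(child.name);
    walk(child, order, out);
    if (order == WalkOrder::Postorder) out.push_back(child.name);
  }
}
```

For a root containing group `g1` (with arrays `a1` and `a2`) followed by array `a3`, preorder yields g1, a1, a2, a3 and postorder yields a1, a2, g1, a3.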
-
inline void set_non_recursive()¶
Disables recursive traversal.
Public Static Functions
-
static inline int obj_getter(const char *path, tiledb_object_t type, void *data)¶
Callback function to be used when invoking the C TileDB functions for walking through the TileDB objects in the root_ directory. The function retrieves the visited object and stores it in the object vector obj_vec.
- Parameters:
path – The path of a visited TileDB object.
type – The type of the visited TileDB object.
data – To be cast to the vector where the visited object will be stored.
- Returns:
If 1, the walk should continue to the next object.
-
class iterator¶
The actual iterator implementation in this class.
-
struct ObjGetterData¶
Carries data to be passed to obj_getter.
VFS¶
-
class VFS¶
Implements a virtual filesystem that enables performing directory/file operations with a unified API on different filesystems, such as local POSIX/Windows, HDFS, AWS S3, etc.
Public Types
-
using filebuf = impl::VFSFilebuf¶
Stream buffer for TileDB VFS.
This is unbuffered; each read/write is directly dispatched to TileDB. As such, it is recommended to issue fewer, larger operations.
Example (write to file):
// Create the file buffer.
tiledb::Context ctx;
tiledb::VFS vfs(ctx);
tiledb::VFS::filebuf buff(vfs);
// Create new file, truncating it if it exists.
buff.open("file.txt", std::ios::out);
std::ostream os(&buff);
if (!os.good()) throw std::runtime_error("Error opening file");
std::string str = "This will be written to the file.";
os.write(str.data(), str.size());
// Alternatively:
// os << str;
os.flush();
buff.close();
Example (read from file):
// Create the file buffer.
tiledb::Context ctx;
tiledb::VFS vfs(ctx);
tiledb::VFS::filebuf buff(vfs);
std::string file_uri = "s3://bucket-name/file.txt";
buff.open(file_uri, std::ios::in);
std::istream is(&buff);
if (!is.good()) throw std::runtime_error("Error opening file");
// Read all contents from the file
std::string contents;
auto nbytes = vfs.file_size(file_uri);
contents.resize(nbytes);
is.read((char*)contents.data(), nbytes);
buff.close();
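To see why fewer, larger operations are recommended with an unbuffered stream buffer, consider the following standalone sketch. `CountingBuf` is a hypothetical `std::streambuf` subclass (not `impl::VFSFilebuf`) that has no internal put area and counts how many times the stream reaches the backend.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical unbuffered sink: every call that reaches the buffer counts as
// one "dispatch" to the backend.
class CountingBuf : public std::streambuf {
 public:
  std::size_t dispatches = 0;
  std::string sink;

 protected:
  // Bulk writes (e.g. std::ostream::write) arrive here as a single call.
  std::streamsize xsputn(const char* s, std::streamsize n) override {
    ++dispatches;
    sink.append(s, static_cast<std::size_t>(n));
    return n;
  }

  // With no put area, each single-character write lands here.
  int_type overflow(int_type c) override {
    if (traits_type::eq_int_type(c, traits_type::eof()))
      return traits_type::not_eof(c);
    ++dispatches;
    sink.push_back(traits_type::to_char_type(c));
    return c;
  }
};
```

Writing a string with a single `os.write(...)` call produces one dispatch, whereas writing the same string one character at a time with `os.put(c)` produces one dispatch per character. With a remote backend, each dispatch could translate into a separate request.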
Public Functions
-
inline VFS(const Context &ctx, const Config &config)¶
Constructor.
- Parameters:
ctx – TileDB context.
config – TileDB config.
-
inline void create_bucket(const std::string &uri) const¶
Creates an object store bucket with the input URI.
-
inline void remove_bucket(const std::string &uri) const¶
Deletes an object store bucket with the input URI.
-
inline bool is_bucket(const std::string &uri) const¶
Checks if an object store bucket with the input URI exists.
-
inline void empty_bucket(const std::string &bucket) const¶
Empties an object store bucket.
-
inline bool is_empty_bucket(const std::string &bucket) const¶
Checks if an object store bucket is empty.
-
inline void create_dir(const std::string &uri) const¶
Creates a directory with the input URI.
-
inline bool is_dir(const std::string &uri) const¶
Checks if a directory with the input URI exists.
-
inline void remove_dir(const std::string &uri) const¶
Removes a directory (recursively) with the input URI.
-
inline bool is_file(const std::string &uri) const¶
Checks if a file with the input URI exists.
-
inline void remove_file(const std::string &uri) const¶
Deletes a file with the input URI.
-
inline uint64_t dir_size(const std::string &uri) const¶
Retrieves the size of a directory with the input URI.
-
inline std::vector<std::string> ls(const std::string &uri) const¶
Retrieves the children in directory uri. This function is non-recursive, i.e., it lists only one level below uri.
-
inline uint64_t file_size(const std::string &uri) const¶
Retrieves the size of a file with the input URI.
-
inline void move_file(const std::string &old_uri, const std::string &new_uri) const¶
Renames a TileDB file from an old URI to a new URI.
-
inline void move_dir(const std::string &old_uri, const std::string &new_uri) const¶
Renames a TileDB directory from an old URI to a new URI.
-
inline void copy_file(const std::string &old_uri, const std::string &new_uri) const¶
Copies a TileDB file from an old URI to a new URI.
-
inline void copy_dir(const std::string &old_uri, const std::string &new_uri) const¶
Copies a TileDB directory from an old URI to a new URI.
-
inline void touch(const std::string &uri) const¶
Touches a file with the input URI, i.e., creates a new empty file.
-
inline std::shared_ptr<tiledb_vfs_t> ptr() const¶
Returns the underlying C TileDB VFS object.
Public Static Functions
-
static inline int ls_getter(const char *path, void *data)¶
Callback function to be used when invoking the C TileDB function for getting the children of a URI. It simply adds path to vec (which is cast from data).
- Parameters:
path – The path of a visited TileDB object.
data – This will be cast to the vector that will store path.
- Returns:
If 1, the walk should continue to the next object.
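The callback pattern described for ls_getter (an opaque `void*` that is cast back to a vector, and a return value of 1 to continue) can be sketched without the C API. `fake_ls` and `collect_path` are hypothetical names; in this sketch any return value other than 1 stops the walk, which is an assumption rather than documented behavior.

```cpp
#include <cassert>
#include <string>
#include <vector>

// C-style callback signature matching the shape of ls_getter.
using ls_callback = int (*)(const char* path, void* data);

// Hypothetical stand-in for a C listing function: invokes the callback once
// per entry and stops when the callback returns anything other than 1.
void fake_ls(const std::vector<std::string>& entries, ls_callback cb,
             void* data) {
  for (const auto& e : entries)
    if (cb(e.c_str(), data) != 1) break;
}

// Recover the vector from the opaque pointer and append the visited path.
int collect_path(const char* path, void* data) {
  auto* vec = static_cast<std::vector<std::string>*>(data);
  vec->emplace_back(path);
  return 1;  // continue to the next object
}
```

Passing the address of a `std::vector<std::string>` as `data` lets a capture-less C callback accumulate results in a C++ container.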
Utils¶
LICENSE¶
The MIT License
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
- Copyright
Copyright (c) 2017-2021 TileDB, Inc.
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
DESCRIPTION¶
Utils for C++ API.
-
namespace tiledb¶
Functions
-
template<typename T, typename E = typename std::vector<T>>
std::vector<E> group_by_cell(const std::vector<uint64_t> &offsets, const std::vector<T> &data, uint64_t num_offsets, uint64_t num_data)¶
Convert an (offset, data) vector pair into a single vector of vectors. Useful for “unpacking” variable-length attribute data from a read query result in offsets + data form to a vector of per-cell data.
The offsets must be given in units of bytes.
Example:
std::vector<uint64_t> offsets;
std::vector<char> data;
...
query.set_data_buffer("attr_name", data);
query.set_offsets_buffer("attr_name", offsets);
query.submit();
...
auto attr_results = query.result_buffer_elements()["attr_name"];
// cell_vals length will be equal to the number of cells read by the query.
// Each element is a std::vector<char> with each cell's data for "attr_name"
auto cell_vals = group_by_cell(offsets, data, attr_results.first, attr_results.second);
// Reconstruct a std::string value for the first cell:
std::string cell_val(cell_vals[0].data(), cell_vals[0].size());
Note
This function, and the other utility functions, copy all of the input data when constructing their return values. Thus, these may be expensive for large amounts of data.
- Template Parameters:
T – Underlying attribute datatype
E – Cell type. Usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}.
- Parameters:
offsets – Offsets vector. This specifies the start offset in bytes of each cell in the data vector.
data – Data vector. Flat data buffer with cell contents.
num_offsets – Number of offset elements populated by query. If the entire buffer is to be grouped, pass offsets.size().
num_data – Number of data elements populated by query. If the entire buffer is to be grouped, pass data.size().
- Returns:
std::vector<E>
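Since the offsets are expressed in bytes while the data vector is indexed by element, each offset has to be divided by `sizeof(T)` before slicing. The following is an illustrative re-implementation under that reading, not TileDB's source; the name `group_by_offsets` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Illustrative sketch: group a flat data buffer into per-cell containers
// using byte offsets.
template <typename T, typename E = std::vector<T>>
std::vector<E> group_by_offsets(const std::vector<uint64_t>& offsets,
                                const std::vector<T>& data,
                                uint64_t num_offsets, uint64_t num_data) {
  std::vector<E> cells;
  cells.reserve(num_offsets);
  for (uint64_t i = 0; i < num_offsets; ++i) {
    // Convert the byte offset to an element index.
    uint64_t begin = offsets[i] / sizeof(T);
    // A cell ends where the next one starts; the last cell ends at num_data.
    uint64_t end =
        (i + 1 < num_offsets) ? offsets[i + 1] / sizeof(T) : num_data;
    cells.emplace_back(data.begin() + static_cast<std::ptrdiff_t>(begin),
                       data.begin() + static_cast<std::ptrdiff_t>(end));
  }
  return cells;
}
```

With `int32_t` data `{1, 2, 3, 4}` and byte offsets `{0, 12}`, the first cell holds three elements and the second holds one, because 12 bytes correspond to three 4-byte elements.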
-
template<typename T, typename E = typename std::vector<T>>
std::vector<E> group_by_cell(const std::pair<std::vector<uint64_t>, std::vector<T>> &buff, uint64_t num_offsets, uint64_t num_data)¶
Convert an (offset, data) vector pair into a single vector of vectors. Useful for “unpacking” variable-length attribute data from a read query result in offsets + data form to a vector of per-cell data.
The offsets must be given in units of bytes.
Example:
std::vector<uint64_t> offsets;
std::vector<char> data;
...
query.set_data_buffer("attr_name", data);
query.set_offsets_buffer("attr_name", offsets);
query.submit();
...
auto attr_results = query.result_buffer_elements()["attr_name"];
// cell_vals length will be equal to the number of cells read by the query.
// Each element is a std::vector<char> with each cell's data for "attr_name"
auto cell_vals = group_by_cell(std::make_pair(offsets, data), attr_results.first, attr_results.second);
// Reconstruct a std::string value for the first cell:
std::string cell_val(cell_vals[0].data(), cell_vals[0].size());
- Template Parameters:
T – Underlying attribute datatype
E – Cell type. Usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}.
- Parameters:
buff – Pair of (offset_vec, data_vec) to be grouped.
num_offsets – Number of offset elements populated by query.
num_data – Number of data elements populated by query.
- Returns:
std::vector<E>
-
template<typename T, typename E = typename std::vector<T>>
std::vector<E> group_by_cell(const std::vector<uint64_t> &offsets, const std::vector<T> &data)¶
Convert a generic (offset, data) vector pair into a single vector of vectors. The offsets must be given in units of bytes.
Example:
std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'}; std::vector<uint64_t> offsets = {0, 5}; auto grouped = group_by_cell<char, std::string>(offsets, buf); // grouped.size() == 2 // grouped[0] == "abcde" // grouped[1] == "fghi"
- Template Parameters:
T – Underlying attribute datatype
E – Cell type. Usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}.
- Parameters:
offsets – Offsets vector
data – Data vector
- Returns:
std::vector<E>
-
template<typename T, typename E = typename std::vector<T>>
std::vector<E> group_by_cell(const std::vector<T> &buff, uint64_t el_per_cell, uint64_t num_buff)¶
Convert a vector of elements into a vector of fixed-length vectors.
Example:
std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell(buf, 3, buf.size());
std::string grp1(grouped[0].begin(), grouped[0].end()); // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end()); // "def"
std::string grp3(grouped[2].begin(), grouped[2].end()); // "ghi"
// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell(buf, 2, buf.size());
- Template Parameters:
T – Underlying attribute datatype
E – Cell type. Usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}.
- Parameters:
buff – Data buffer to group
el_per_cell – Number of elements per cell to group together
num_buff – Number of elements populated by query. To group whole buffer, pass buff.size().
- Returns:
std::vector<E>
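The fixed-length variant slices the buffer into consecutive runs of `el_per_cell` elements and rejects sizes that do not divide evenly. The sketch below is an illustrative re-implementation with the hypothetical name `group_fixed`; the use of `std::invalid_argument` is an assumption, as the reference above only states that an exception is thrown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Illustrative sketch: group the first num_buff elements into cells of
// el_per_cell elements each.
template <typename T, typename E = std::vector<T>>
std::vector<E> group_fixed(const std::vector<T>& buff, uint64_t el_per_cell,
                           uint64_t num_buff) {
  // Exception type is an assumption of this sketch.
  if (el_per_cell == 0 || num_buff % el_per_cell != 0)
    throw std::invalid_argument("buffer is not evenly divisible into cells");
  std::vector<E> cells;
  cells.reserve(num_buff / el_per_cell);
  for (uint64_t i = 0; i < num_buff; i += el_per_cell) {
    auto first = buff.begin() + static_cast<std::ptrdiff_t>(i);
    cells.emplace_back(first, first + static_cast<std::ptrdiff_t>(el_per_cell));
  }
  return cells;
}
```

Grouping the nine characters `a` through `i` by three yields "abc", "def", and "ghi"; grouping them by two throws.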
-
template<typename T, typename E = typename std::vector<T>>
std::vector<E> group_by_cell(const std::vector<T> &buff, uint64_t el_per_cell)¶
Convert a vector of elements into a vector of fixed-length vectors.
Example:
std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell(buf, 3);
std::string grp1(grouped[0].begin(), grouped[0].end()); // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end()); // "def"
std::string grp3(grouped[2].begin(), grouped[2].end()); // "ghi"
// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell(buf, 2);
- Template Parameters:
T – Element type
E – Cell type. Usually std::vector<T> or std::string. Must be constructible from {std::vector<T>::iterator, std::vector<T>::iterator}.
- Parameters:
buff – Data buffer to group
el_per_cell – Number of elements per cell to group together
- Returns:
std::vector<E>
-
template<uint64_t N, typename T>
std::vector<std::array<T, N>> group_by_cell(const std::vector<T> &buff, uint64_t num_buff)¶
Convert a vector of elements into a vector of fixed-length arrays.
Example:
std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
auto grouped = group_by_cell<3>(buf, buf.size());
std::string grp1(grouped[0].begin(), grouped[0].end()); // "abc"
std::string grp2(grouped[1].begin(), grouped[1].end()); // "def"
std::string grp3(grouped[2].begin(), grouped[2].end()); // "ghi"
// Throws an exception because buf.size() is not divisible by 2:
// group_by_cell<2>(buf, buf.size());
- Template Parameters:
N – Elements per cell
T – Array element type
- Parameters:
buff – Data buffer to group
num_buff – Number of elements in buff that were populated by the query.
- Returns:
std::vector<std::array<T,N>>
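When the cell length is known at compile time, each cell can be a std::array rather than a heap-allocated vector. The sketch below illustrates the idea with the standard library only; group_into_arrays and example_group_arrays are hypothetical names, and the code is an assumption about the behavior rather than the library's source.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in, not the TileDB implementation: groups the first
// num_buff elements of buff into fixed-length std::array cells.
template <std::size_t N, typename T>
std::vector<std::array<T, N>> group_into_arrays(
    const std::vector<T>& buff, std::size_t num_buff) {
  static_assert(N > 0, "a cell must hold at least one element");
  if (num_buff > buff.size() || num_buff % N != 0) {
    throw std::invalid_argument(
        "num_buff must fit in buff and be divisible by N");
  }
  std::vector<std::array<T, N>> cells(num_buff / N);
  for (std::size_t c = 0; c < cells.size(); ++c) {
    auto first = buff.begin() + static_cast<std::ptrdiff_t>(c * N);
    std::copy_n(first, N, cells[c].begin());  // copy one cell's elements
  }
  return cells;
}

// Two RGB pixels were populated; the trailing three bytes are leftover space.
inline void example_group_arrays() {
  std::vector<std::uint8_t> rgb = {255, 0, 0, 0, 255, 0, 9, 9, 9};
  auto pixels = group_into_arrays<3>(rgb, 6);
  assert(pixels.size() == 2);
  assert(pixels[0][0] == 255);
  assert(pixels[1][1] == 255);
}
```

Only N is given explicitly at the call site; T is deduced from the buffer.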
-
template<uint64_t N, typename T>
std::vector<std::array<T, N>> group_by_cell(const std::vector<T> &buff)¶ Convert a vector of elements into a vector of fixed-length arrays.
Example:
std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'}; auto grouped = group_by_cell<3>(buf); std::string grp1(grouped[0].begin(), grouped[0].end()); // "abc" std::string grp2(grouped[1].begin(), grouped[1].end()); // "def" std::string grp3(grouped[2].begin(), grouped[2].end()); // "ghi" // Throws an exception because buf.size() is not divisible by 2: // group_by_cell<2>(buf);
- Template Parameters:
N – Elements per cell
T – Array element type
- Parameters:
buff – Data buffer to group
- Returns:
std::vector<std::array<T,N>>
-
template<typename T, typename R = typename T::value_type>
std::pair<std::vector<uint64_t>, std::vector<R>> ungroup_var_buffer(const std::vector<T> &data)¶ Unpack a vector of variable sized attributes into a data and offset buffer. The offset buffer result is in units of bytes.
Example:
std::vector<char> buf = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'}; // For the sake of example, group buf into groups of 3 elements: auto grouped = group_by_cell(buf, 3); // Ungroup into offsets, data pair. auto p = ungroup_var_buffer(grouped); auto offsets = p.first; // {0, 3, 6} auto data = p.second; // {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'}
- Template Parameters:
T – Vector type. T::value_type is considered the underlying data element type. Should be a vector or string.
R – T::value_type, deduced
- Parameters:
data – Data buffer to unpack
- Returns:
Pair where .first is the offset buffer and .second is the data buffer
-
template<typename V, typename T = typename V::value_type::value_type>
std::vector<T> flatten(const V &vec)¶ Flatten a vector of vectors into a single vector.
Example:
std::vector<std::string> v = {"a", "bb", "ccc"}; auto flat_v = flatten(v); std::string s(flat_v.begin(), flat_v.end()); // "abbccc" std::vector<std::vector<double>> d = {{1.2, 2.1}, {2.3, 3.2}, {3.4, 4.3}}; auto flat_d = flatten(d); // {1.2, 2.1, 2.3, 3.2, 3.4, 4.3};
- Template Parameters:
V – Container type
T – Return element type
- Parameters:
vec – Vector to flatten
- Returns:
std::vector<T>
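Flattening is the inverse of grouping: concatenating the cells in order recovers the original element sequence. The following is a minimal standard-library sketch of that idea; flatten_cells and example_flatten are hypothetical names, and the code is an assumption about the behavior, not the library's source.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in, not the TileDB implementation: concatenates the
// elements of every inner container, in order, into one vector.
template <typename V, typename T = typename V::value_type::value_type>
std::vector<T> flatten_cells(const V& cells) {
  std::size_t total = 0;
  for (const auto& cell : cells) total += cell.size();

  std::vector<T> flat;
  flat.reserve(total);  // one allocation for the whole result
  for (const auto& cell : cells) {
    flat.insert(flat.end(), cell.begin(), cell.end());
  }
  return flat;
}

// Works for any outer container whose elements expose value_type,
// size(), begin() and end(), including std::array and std::string.
inline void example_flatten() {
  std::vector<std::array<int, 2>> cells = {std::array<int, 2>{1, 2},
                                           std::array<int, 2>{3, 4}};
  auto flat = flatten_cells(cells);
  assert((flat == std::vector<int>{1, 2, 3, 4}));
}
```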
-
namespace impl¶
Functions
-
inline void check_config_error(tiledb_error_t *err)¶
Check an error, free, and throw if there is one.
Version¶
Stats¶
-
class Stats¶
Encapsulates functionality related to internal TileDB statistics.
Example:
// Enable stats, submit a query, then dump to stdout. tiledb::Stats::enable(); query.submit(); tiledb::Stats::dump(); // Dump to a string instead. std::string str; tiledb::Stats::dump(&str);
Public Static Functions
-
static inline void enable()¶
Enables internal TileDB statistics gathering.
-
static inline void disable()¶
Disables internal TileDB statistics gathering.
-
static inline void reset()¶
Reset all internal statistics counters to 0.
-
static inline void dump(FILE *out = nullptr)¶
Dump all statistics counters to some output (e.g., file or stdout).
- Parameters:
out – The output.
-
static inline void dump(std::string *out)¶
Dump all statistics counters to a string.
- Parameters:
out – The output.
-
static inline void raw_dump(FILE *out = nullptr)¶
Dump all raw statistics counters to some output (e.g., file or stdout) as JSON.
- Parameters:
out – The output.
-
static inline void raw_dump(std::string *out)¶
Dump all raw statistics counters to a string.
- Parameters:
out – The output.
FragmentInfo¶
-
class FragmentInfo¶
Describes fragment info objects.
Public Functions
-
inline void load() const¶
Loads the fragment info.
-
inline std::string fragment_uri(uint32_t fid) const¶
Returns the URI of the fragment with the given index.
-
inline std::string fragment_name(uint32_t fid) const¶
Returns the name of the fragment with the given index.
-
inline void get_non_empty_domain(uint32_t fid, uint32_t did, void *domain) const¶
Retrieves the non-empty domain of the fragment with the given index on the given dimension index.
-
inline void get_non_empty_domain(uint32_t fid, const std::string &dim_name, void *domain) const¶
Retrieves the non-empty domain of the fragment with the given index on the given dimension name.
-
inline std::pair<std::string, std::string> non_empty_domain_var(uint32_t fid, uint32_t did) const¶
Returns the non-empty domain of the fragment with the given index on the given dimension index. Applicable to string dimensions.
-
inline std::pair<std::string, std::string> non_empty_domain_var(uint32_t fid, const std::string &dim_name) const¶
Returns the non-empty domain of the fragment with the given index on the given dimension name. Applicable to string dimensions.
-
inline uint64_t mbr_num(uint32_t fid) const¶
Returns the number of MBRs in the fragment with the given index.
-
inline void get_mbr(uint32_t fid, uint32_t mid, uint32_t did, void *mbr) const¶
Retrieves the MBR of the fragment with the given index on the given dimension index.
-
inline void get_mbr(uint32_t fid, uint32_t mid, const std::string &dim_name, void *mbr) const¶
Retrieves the MBR of the fragment with the given index on the given dimension name.
-
inline std::pair<std::string, std::string> mbr_var(uint32_t fid, uint32_t mid, uint32_t did) const¶
Returns the MBR of the fragment with the given index on the given dimension index. Applicable to string dimensions.
-
inline std::pair<std::string, std::string> mbr_var(uint32_t fid, uint32_t mid, const std::string &dim_name) const¶
Returns the MBR of the fragment with the given index on the given dimension name. Applicable to string dimensions.
-
inline uint32_t fragment_num() const¶
Returns the number of fragments.
-
inline uint64_t fragment_size(uint32_t fid) const¶
Returns the size of the fragment with the given index.
-
inline bool dense(uint32_t fid) const¶
Returns true if the fragment with the given index is dense.
-
inline bool sparse(uint32_t fid) const¶
Returns true if the fragment with the given index is sparse.
-
inline std::pair<uint64_t, uint64_t> timestamp_range(uint32_t fid) const¶
Returns the timestamp range of the fragment with the given index.
-
inline uint64_t cell_num(uint32_t fid) const¶
Returns the number of cells of the fragment with the given index.
-
inline uint64_t total_cell_num() const¶
Returns the total number of cells written in the loaded fragments.
-
inline uint32_t version(uint32_t fid) const¶
Returns the version of the fragment with the given index.
-
inline ArraySchema array_schema(uint32_t fid) const¶
Returns the array schema of the fragment with the given index.
-
inline std::string array_schema_name(uint32_t fid) const¶
Returns the array schema name of the fragment with the given index.
-
inline bool has_consolidated_metadata(uint32_t fid) const¶
Returns true if the fragment with the given index has consolidated metadata.
-
inline uint32_t unconsolidated_metadata_num() const¶
Returns the number of fragments with unconsolidated metadata.
-
inline uint32_t to_vacuum_num() const¶
Returns the number of fragments to vacuum.
-
inline std::string to_vacuum_uri(uint32_t fid) const¶
Returns the URI of the fragment to vacuum with the given index.
-
inline void dump(FILE *out = nullptr) const¶
Dumps the fragment info in an ASCII representation to an output.
- Parameters:
out – (Optional) File to dump output to. Defaults to nullptr, which selects stdout.
-
inline std::shared_ptr<tiledb_fragment_info_t> ptr() const¶
Returns a shared pointer to the C TileDB fragment info object.
Experimental¶
-
class ArraySchemaEvolution¶
Evolve the schema on a tiledb::Array.
See examples for more usage details.
Example:
// Drop attribute "a1" from the schema of an existing array tiledb::Context ctx; tiledb::ArraySchemaEvolution evolution(ctx); evolution.drop_attribute("a1"); evolution.array_evolve("my_test_array");
Public Functions
-
inline ArraySchemaEvolution(const Context &context, tiledb_array_schema_evolution_t *evolution)¶
Constructs the array schema evolution from the given C API array schema evolution object.
- Parameters:
context – TileDB context
evolution – C API array schema evolution object
-
inline ArraySchemaEvolution(const Context &context)¶
Constructs an array schema evolution object.
- Parameters:
context – TileDB context
-
inline ArraySchemaEvolution &add_attribute(const Attribute &attr)¶
Adds an Attribute to the array schema evolution.
Example:
tiledb::Context ctx; tiledb::ArraySchemaEvolution schema_evolution(ctx); schema_evolution.add_attribute(Attribute::create<int32_t>(ctx, "attr_name"));
- Parameters:
attr – The Attribute to add
- Returns:
Reference to this ArraySchemaEvolution instance.
-
inline ArraySchemaEvolution &drop_attribute(const std::string &attribute_name)¶
Drops an attribute.
Example:
tiledb::Context ctx; tiledb::ArraySchemaEvolution schema_evolution(ctx); schema_evolution.drop_attribute("attr_name");
- Parameters:
attribute_name – The name of the attribute to be dropped
- Returns:
Reference to this ArraySchemaEvolution instance.
-
inline ArraySchemaEvolution &add_enumeration(const Enumeration &enmr)¶
Adds an Enumeration to the array schema evolution.
Example:
tiledb::Context ctx; tiledb::ArraySchemaEvolution schema_evolution(ctx); std::vector<std::string> values = {"red", "green", "blue"}; schema_evolution.add_enumeration(Enumeration::create(ctx, "an_enumeration", values));
- Parameters:
enmr – The Enumeration to add.
- Returns:
Reference to this ArraySchemaEvolution instance.
-
inline ArraySchemaEvolution &extend_enumeration(const Enumeration &enmr)¶
Extends an Enumeration during array schema evolution.
Example:
tiledb::Context ctx; tiledb::Enumeration old_enmr = array->get_enumeration("some_enumeration"); std::vector<std::string> new_values = {"cyan", "magenta", "mauve"}; tiledb::Enumeration new_enmr = old_enmr.extend(new_values); tiledb::ArraySchemaEvolution schema_evolution(ctx); schema_evolution.extend_enumeration(new_enmr);
- Parameters:
enmr – The Enumeration to extend.
- Returns:
Reference to this ArraySchemaEvolution instance.
-
inline ArraySchemaEvolution &drop_enumeration(const std::string &enumeration_name)¶
Drops an enumeration.
Example:
tiledb::Context ctx; tiledb::ArraySchemaEvolution schema_evolution(ctx); schema_evolution.drop_enumeration("enumeration_name");
- Parameters:
enumeration_name – The enumeration to be dropped
- Returns:
Reference to this ArraySchemaEvolution instance.
-
inline ArraySchemaEvolution &expand_current_domain(const CurrentDomain &expanded_domain)¶
Expands the current domain during array schema evolution. During tiledb_array_evolve, TileDB enforces that the new current domain expands the existing one and does not contract it.
- Parameters:
expanded_domain – The current domain we want to expand the schema to.
-
inline void set_timestamp_range(const std::pair<uint64_t, uint64_t> &timestamp_range)¶
Sets timestamp range.
Example:
tiledb::Context ctx; tiledb::ArraySchemaEvolution schema_evolution(ctx); uint64_t now = tiledb_timestamp_now_ms(); schema_evolution.set_timestamp_range({now, now});
- Parameters:
timestamp_range – The timestamp range to be set
-
inline ArraySchemaEvolution &array_evolve(const std::string &array_uri)¶
Evolves the schema of an array.
Example:
tiledb::Context ctx; tiledb::ArraySchemaEvolution schema_evolution(ctx); schema_evolution.drop_attribute("attr_name"); schema_evolution.array_evolve("test_array_uri");
- Parameters:
array_uri – The uri of an array
- Returns:
Reference to this ArraySchemaEvolution instance.
-
inline std::shared_ptr<tiledb_array_schema_evolution_t> ptr() const¶
Returns a shared pointer to the C TileDB array schema evolution object.
-
class Group¶
Public Functions
-
inline Group(const Context &ctx, const std::string &group_uri, tiledb_query_type_t query_type)¶
Constructor. Opens the group for the given query type. The destructor calls the close() method.
Example:
// Open the group for reading tiledb::Context ctx; tiledb::Group group(ctx, "s3://bucket-name/group-name", TILEDB_READ);
- Parameters:
ctx – TileDB context.
group_uri – The group URI.
query_type – Query type to open the group for.
-
inline Group(const Context &ctx, const std::string &group_uri, tiledb_query_type_t query_type, const Config &config)¶
Constructor. Sets a config on the group and opens it for the given query type. The destructor calls the close() method.
Example:
// Open the group for reading tiledb::Context ctx; tiledb::Config cfg; cfg["rest.username"] = "user"; cfg["rest.password"] = "pass"; tiledb::Group group(ctx, "s3://bucket-name/group-name", TILEDB_READ, cfg);
- Parameters:
ctx – TileDB context.
group_uri – The group URI.
query_type – Query type to open the group for.
config – Configuration parameters
-
inline void open(tiledb_query_type_t query_type)¶
Opens the group using a query type as input.
This indicates that queries created for this Group object will inherit the query type. In other words, Group objects are opened to receive only one type of query. They can always be closed and reopened with another query type. There may also be many different Group objects created and opened with different query types. For instance, one may create and open a group object group_read for reads and another one group_write for writes, and interleave creation and submission of queries for both these group objects.
Example:
// Open the group for writing tiledb::Group group(ctx, "s3://bucket-name/group-name", TILEDB_WRITE); // Close and open again for reading. group.close(); group.open(TILEDB_READ);
- Parameters:
query_type – The type of queries the group object will be receiving.
- Throws:
TileDBError – if the group is already open or another error occurred.
-
inline void set_config(const Config &config) const¶
Sets the group config.
- Pre:
The group must be closed.
-
inline void close(bool should_throw = true)¶
Closes the group. This must be called directly if you wish to check that any changes to the group were committed. This is automatically called by the destructor but any errors encountered are logged instead of throwing an exception from a destructor.
Example:
tiledb::Group group(ctx, "s3://bucket-name/group-name", TILEDB_READ); group.close();
-
inline bool is_open() const¶
Checks if the group is open.
-
inline std::string uri() const¶
Returns the group URI.
-
inline tiledb_query_type_t query_type() const¶
Returns the query type the group was opened with.
-
inline void put_metadata(const std::string &key, tiledb_datatype_t value_type, uint32_t value_num, const void *value)¶
Puts a metadata key-value item to an open group. The group must be opened in WRITE mode, otherwise the function will error out.
Note
The writes will take effect only upon closing the group.
- Parameters:
key – The key of the metadata item to be added. UTF-8 encodings are acceptable.
value_type – The datatype of the value.
value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata.
value – The metadata value in binary form.
-
inline void delete_group(const std::string &uri, bool recursive = false)¶
Deletes all written data from an open group. The group must be opened in MODIFY_EXCLUSIVE mode, otherwise the function will error out.
Note
If recursive == false, data added to the group will be left as-is.
- Parameters:
uri – The address of the group item to be deleted.
recursive – True if all data inside the group is to be deleted.
- Post:
This is destructive; the group may not be reopened after delete.
-
inline void delete_metadata(const std::string &key)¶
Deletes a metadata key-value item from an open group. The group must be opened in WRITE mode, otherwise the function will error out.
Note
The writes will take effect only upon closing the group.
Note
If the key does not exist, this has no effect (i.e., the function will not error out).
- Parameters:
key – The key of the metadata item to be deleted.
-
inline void get_metadata(const std::string &key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)¶
Gets a metadata key-value item from an open group. The group must be opened in READ mode, otherwise the function will error out.
Note
If the key does not exist, then value will be NULL.
- Parameters:
key – The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.
value_type – The datatype of the value.
value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.
value – The metadata value in binary form.
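The value comes back as an untyped pointer plus an item count, so the caller has to interpret it. The sketch below copies such a value into a typed vector and treats a NULL pointer as "no value". It uses only the standard library; copy_metadata_value and example_copy_metadata are hypothetical helpers, not part of TileDB, and the caller is assumed to have already checked that the returned value_type (for example TILEDB_FLOAT64) matches the requested element type T. Copying is a cautious choice under the assumption that the pointer is only borrowed from the library.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <optional>
#include <type_traits>
#include <vector>

// Hypothetical helper, not part of TileDB: copies a raw metadata value of
// value_num items into a typed vector. Returns std::nullopt when value is
// null (a missing key, or a key stored with an empty value).
template <typename T>
std::optional<std::vector<T>> copy_metadata_value(
    const void* value, std::uint32_t value_num) {
  static_assert(std::is_trivially_copyable<T>::value,
                "metadata items are copied byte-wise");
  if (value == nullptr) {
    return std::nullopt;
  }
  std::vector<T> out(value_num);
  if (value_num > 0) {
    std::memcpy(out.data(), value, value_num * sizeof(T));
  }
  return out;
}

// Simulates the (value, value_num) pair a metadata getter would fill in.
inline void example_copy_metadata() {
  const double raw[] = {1.5, 2.5};
  auto values = copy_metadata_value<double>(raw, 2);
  assert(values.has_value());
  assert(values->size() == 2);
  assert((*values)[1] == 2.5);

  // value == NULL with value_num == 1 marks a key with an empty value.
  assert(!copy_metadata_value<std::int32_t>(nullptr, 1).has_value());
}
```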
-
inline bool has_metadata(const std::string &key, tiledb_datatype_t *value_type)¶
Checks if key exists in metadata from an open group. The group must be opened in READ mode, otherwise the function will error out.
Note
If the key does not exist, then value_type will not be modified.
- Parameters:
key – The key of the metadata item to be retrieved. UTF-8 encodings are acceptable.
value_type – The datatype of the value associated with the key (if any).
- Returns:
true if the key exists, else false.
-
inline uint64_t metadata_num() const¶
Returns the number of metadata items in an open group. The group must be opened in READ mode, otherwise the function will error out.
-
inline void get_metadata_from_index(uint64_t index, std::string *key, tiledb_datatype_t *value_type, uint32_t *value_num, const void **value)¶
Gets a metadata item from an open group using an index. The group must be opened in READ mode, otherwise the function will error out.
- Parameters:
index – The index used to get the metadata.
key – The metadata key.
value_type – The datatype of the value.
value_num – The value may consist of more than one item of the same datatype. This argument indicates the number of items in the value component of the metadata. Keys with empty values are indicated by value_num == 1 and value == NULL.
value – The metadata value in binary form.
-
inline void add_member(const std::string &uri, const bool &relative, std::optional<std::string> name = std::nullopt)¶
Adds a member to a group.
- Parameters:
uri – URI of the member to add
relative – Whether the URI is relative to the group location
name – (Optional) Name of the member
-
inline void remove_member(const std::string &name_or_uri)¶
Remove a member from a group
- Parameters:
name_or_uri – Name or URI of member to remove. If the URI is registered multiple times in the group, the name needs to be specified so that the correct one can be removed. Note that if a URI is registered as both a named and unnamed member, the unnamed member will be removed successfully using the URI.
-
inline bool is_relative(std::string name) const¶
Retrieves the relative indicator for a named member.
- Parameters:
name – Name of the member whose relative indicator is retrieved.
Public Static Functions
-
static inline void create(const tiledb::Context &ctx, const std::string &uri)¶
Create a TileDB Group
Example:
tiledb::Group::create(ctx, "s3://bucket-name/group-name");
- Parameters:
ctx – TileDB context
uri – URI where group will be created.
-
static inline void consolidate_metadata(const Context &ctx, const std::string &uri, Config *const config = nullptr)¶
Consolidates the group metadata into a single group metadata file.
Example:
tiledb::Group::consolidate_metadata(ctx, "s3://bucket-name/group-name");
- Parameters:
ctx – TileDB context
uri – The URI of the TileDB group to be consolidated.
config – Configuration parameters for the consolidation.
-
static inline void vacuum_metadata(const Context &ctx, const std::string &uri, Config *const config = nullptr)¶
Cleans up the group metadata.
Example:
tiledb::Group::vacuum_metadata(ctx, "s3://bucket-name/group-name");
- Parameters:
ctx – TileDB context
uri – The URI of the TileDB group to vacuum.
config – Configuration parameters for the vacuuming.