Using TileDB Statistics

A lot of performance optimization for TileDB programs involves minimizing wasted work. TileDB comes with an internal statistics reporting system that can help identify potential areas of performance improvement for your TileDB programs, including reducing wasted work.

Full programs

Program               Links
using_tiledb_stats    statscpp (C++), statspy (Python)

The TileDB statistics can be enabled and disabled at runtime, and a report can be dumped at any point. A typical situation is to enable the statistics immediately before submitting a query, submit the query, and then immediately dump the report. This can be done like so:

C++

tiledb::Stats::enable();
query.submit();
tiledb::Stats::dump(stdout);
tiledb::Stats::disable();

Python

tiledb.stats_enable()
data = A[:]
tiledb.stats_dump()
tiledb.stats_disable()

The dump call prints a report containing the gathered statistics. The report lists the values of many individual counters, followed by a high-level summary at the end. The summary typically contains enough information to make high-level performance tuning decisions. An example summary is shown below:

Summary:
--------
Hardware concurrency: 4
Reads:
  Read query submits: 1
  Tile cache hit ratio: 0 / 1  (0.0%)
  Fixed-length tile data copy-to-read ratio: 4 / 1000 bytes (0.4%)
  Var-length tile data copy-to-read ratio: 0 / 0 bytes
  Total tile data copy-to-read ratio: 4 / 1000 bytes (0.4%)
  Read compression ratio: 1245 / 1274 bytes (1.0x)
Writes:
  Write query submits: 0
  Tiles written: 0
  Write compression ratio: 0 / 0 bytes

Each item is explained separately below.

Hardware concurrency

The amount of available hardware-level concurrency, i.e. the number of hardware threads (logical cores) available on the machine.

Read query submits

The number of times a read query submit call was made.

Tile cache hit ratio

Ratio expressing utilization of the tile cache. The denominator indicates how many tiles in total were needed to satisfy read queries. After a tile is fetched from disk once, it is placed in the internal tile cache; the numerator indicates how many of the requested tiles were served from the tile cache (hits). In the example summary above, a single tile was fetched, and it missed the cache because it was the first access to that tile. If the same tile were accessed again by a subsequent query, it could hit in the cache, increasing the ratio to 1/2. Higher ratios are better.

Fixed-length tile data copy-to-read ratio

Ratio expressing a measurement of wasted work for reads of fixed-length attributes. The denominator is the total number of (uncompressed) bytes of fixed-length attribute data fetched from disk. The numerator is the number of those bytes that were actually copied into the query buffers to satisfy a read query. In the above example, 1000 bytes of fixed-length data were read from disk and only 4 of those bytes were used to satisfy a read query, indicating a large amount of wasted work. Higher ratios are better.

Var-length tile data copy-to-read ratio

This ratio is the same as the previous, but for variable-length attribute data. This is tracked separately because variable-length attributes can often be read performance bottlenecks, as their size is by nature unpredictable. Higher ratios are better.

Total tile data copy-to-read ratio

The overall copy-to-read ratio, where the numerator is the sum of the fixed- and variable-length numerators, and the denominator is the sum of the fixed- and variable-length denominators.

Read compression ratio

Ratio expressing the effective compression ratio of data read from disk. The numerator is the total number of bytes produced by disk reads after filtering (i.e., after decompression); the denominator is the total number of bytes read from disk, whether filtered or not. This differs from the tile copy-to-read ratios because it also accounts for extra read operations on array and fragment metadata. For simplicity, this ratio currently counts all filters as “compressors”, so it may not be exactly the compression ratio when filters other than compressors are involved.
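To make the reported figure concrete, the “1.0x” in the example summary above is simply the numerator divided by the denominator (a quick sketch in plain Python; the variable names are purely illustrative):

# Read compression ratio from the example summary above: 1245 / 1274 bytes (1.0x)
bytes_after_unfiltering = 1245   # numerator: bytes produced after reversing the filter pipeline
bytes_read_from_disk = 1274      # denominator: bytes physically read from disk
print(f"{bytes_after_unfiltering / bytes_read_from_disk:.1f}x")  # -> "1.0x"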

Write query submits

The number of times a write query submit call was made.

Tiles written

The total number of tiles written, over all write queries.

Write compression ratio

Ratio expressing the effective compression ratio of data written to disk. The numerator is the total number of unfiltered bytes requested to be written. The denominator is the total number of bytes actually written to disk, after filtering. As with the read compression ratio, this value counts all filters as compressors.

Note

The TileDB library is built by default with statistics enabled. You can disable statistics gathering with the -DTILEDB_STATS=OFF CMake variable.

Read example

As a simple example, we will examine the effect of different dense array tiling configurations on read queries. Intuitively, the more closely the tile shape aligns with the shape of the read queries issued to the array, the higher the performance we expect to see. This is because TileDB fetches from storage only the tiles that overlap the query subarray: the larger the overlap between the query subarray and the fetched tiles, the less the wasted work. We will use the statistics report to quantify this wasted work exactly.

For each of the following experiments, we will read from a 2D dense array containing about 0.5GB of data. The array will be 12,000 rows by 12,000 columns with a single uncompressed int32 attribute. The array is created as follows:

C++

Context ctx;
ArraySchema schema(ctx, TILEDB_DENSE);
Domain dom(ctx);
dom.add_dimension(
       Dimension::create<uint32_t>(ctx, "row", {{1, 12000}}, row_tile_extent))
    .add_dimension(
        Dimension::create<uint32_t>(ctx, "col", {{1, 12000}}, col_tile_extent));
schema.set_domain(dom);
schema.add_attribute(
    Attribute::create<int32_t>(ctx, "a", {TILEDB_NO_COMPRESSION, -1}));
Array::create(array_name, schema);

Python

import numpy as np
import tiledb

ctx = tiledb.Ctx()
dom = tiledb.Domain(ctx,
                   tiledb.Dim(ctx, name="rows", domain=(1, 12000), tile=row_tile_extent, dtype=np.int32),
                   tiledb.Dim(ctx, name="cols", domain=(1, 12000), tile=col_tile_extent, dtype=np.int32))

schema = tiledb.ArraySchema(ctx, domain=dom, sparse=False,
                           attrs=[tiledb.Attr(ctx, name="a", dtype=np.int32)])

# Create the (empty) array on disk.
tiledb.DenseArray.create(array_name, schema)

The total array size on disk is then 12000 * 12000 * 4 bytes = 576,000,000 bytes, about 550 MB.

As a first example, suppose we configured the schema such that the array is composed of a single tile, i.e.:

C++

dom.add_dimension(
      Dimension::create<uint32_t>(ctx, "row", {{1, 12000}}, 12000))
   .add_dimension(
       Dimension::create<uint32_t>(ctx, "col", {{1, 12000}}, 12000));

Python

dom = tiledb.Domain(ctx,
                   tiledb.Dim(ctx, name="rows", domain=(1, 12000), tile=12000, dtype=np.int32),
                   tiledb.Dim(ctx, name="cols", domain=(1, 12000), tile=12000, dtype=np.int32))

With this array schema, the entire array is composed of a single tile. Thus, any read query (regardless of the subarray) will fetch the entire array from disk. We will issue a read query of the first 3,000 rows (subarray [1:3000, 1:12000]) which is 25% of the cells in the array:

C++

std::vector<uint32_t> subarray = {1, 3000, 1, 12000};
Context ctx;
Array array(ctx, array_name, TILEDB_READ);
Query query(ctx, array);
std::vector<int32_t> values(
    array.max_buffer_elements(subarray)["a"].second);
query.set_subarray(subarray).set_buffer("a", values);

Stats::enable();
query.submit();
Stats::dump(stdout);
Stats::disable();

Python

ctx = tiledb.Ctx()
with tiledb.DenseArray(ctx, array_name, mode='r') as A:
    # Read a slice of 3,000 rows.
    tiledb.stats_enable()
    data = A[1:3001, 1:12001]
    tiledb.stats_dump()
    tiledb.stats_disable()

The report printed for this experiment is:

Summary:
--------
Hardware concurrency: 4
Reads:
  Read query submits: 1
  Tile cache hit ratio: 0 / 1  (0.0%)
  Fixed-length tile data copy-to-read ratio: 144000000 / 576105488 bytes (25.0%)
  Var-length tile data copy-to-read ratio: 0 / 0 bytes
  Total tile data copy-to-read ratio: 144000000 / 576105488 bytes (25.0%)
  Read compression ratio: 576000000 / 576105488 bytes (1.0x)
Writes:
  Write query submits: 0
  Tiles written: 0
  Write compression ratio: 0 / 0 bytes

We can see that during the time the statistics were being gathered, there was a single read query submitted (our read query). The denominator of the tile cache hit ratio tells us that the single read query accessed a single tile, as expected (since the entire array is a single tile).

The “fixed-length tile data copy-to-read ratio” metric expresses the “wasted work” measurement, namely the number of bytes copied into our query buffers to fulfill the read query, divided by the number of bytes read from disk. In this experiment, 144,000,000 bytes (the int32 fixed-length attribute values for the subarray [1:3000, 1:12000]) were copied into the query buffers, but 576,000,000 bytes of tile data were read from disk (576,000,000 = 12000 * 12000 * 4 bytes). This copy-to-read ratio tells us that only 25% of the work done by TileDB to satisfy the read query was useful.
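As a quick sanity check, these numbers can be reproduced with simple arithmetic (plain Python, independent of the TileDB API):

# Expected copy and read volumes for the single-tile schema and subarray [1:3000, 1:12000].
bytes_per_cell = 4                       # int32 attribute "a"
copied = 3000 * 12000 * bytes_per_cell   # cells actually requested: 144,000,000 bytes
read = 12000 * 12000 * bytes_per_cell    # the whole array is one tile: 576,000,000 bytes
print(f"{100 * copied / read:.1f}%")     # -> "25.0%", matching the summary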

Now let’s modify the array such that each row corresponds to a single tile, i.e.:

C++

dom.add_dimension(
      Dimension::create<uint32_t>(ctx, "row", {{1, 12000}}, 1))
   .add_dimension(
       Dimension::create<uint32_t>(ctx, "col", {{1, 12000}}, 12000));

Python

dom = tiledb.Domain(ctx,
                   tiledb.Dim(ctx, name="rows", domain=(1, 12000), tile=1, dtype=np.int32),
                   tiledb.Dim(ctx, name="cols", domain=(1, 12000), tile=12000, dtype=np.int32))

When reading the subarray [1:3000, 1:12000] as in the previous experiment, we see the following statistics:

Reads:
  Read query submits: 1
  Tile cache hit ratio: 0 / 3000  (0.0%)
  Fixed-length tile data copy-to-read ratio: 144000000 / 144060000 bytes (100.0%)
  Var-length tile data copy-to-read ratio: 0 / 0 bytes
  Total tile data copy-to-read ratio: 144000000 / 144060000 bytes (100.0%)
  Read compression ratio: 144000000 / 144060000 bytes (1.0x)

Now the denominator of the tile cache hit ratio tells us that 3,000 tiles were accessed, which is as expected because we requested 3,000 rows. Note also the difference in the copy-to-read ratio. We still copy 144,000,000 bytes (since the query is the same), but the amount of data read from disk is reduced from the entire array to only the tiles (rows) required, which is 12000 * 3000 * 4 = 144,000,000 bytes. This yields 100% useful work (no wasted work).

You may notice the “read compression ratio” metric reports more bytes read and used than just the tile data. The difference is accounted for by the array and fragment metadata, which TileDB must also read in order to determine which tiles should be read and decompressed, and the serialization overhead introduced by the TileDB filter pipeline.

Finally, we will issue two overlapping queries back-to-back, first the same [1:3000, 1:12000] subarray followed by subarray [2000:4000, 1:12000], i.e.:

C++

Context ctx;
std::vector<uint32_t> subarray1 = {1, 3000, 1, 12000},
                      subarray2 = {2000, 4000, 1, 12000};
Array array(ctx, array_name, TILEDB_READ);
std::vector<int32_t> values1(array.max_buffer_elements(subarray1)["a"].second),
                     values2(array.max_buffer_elements(subarray2)["a"].second);
Query query1(ctx, array), query2(ctx, array);
query1.set_subarray(subarray1).set_buffer("a", values1);
query2.set_subarray(subarray2).set_buffer("a", values2);

Stats::enable();
query1.submit();
query2.submit();
Stats::dump(stdout);
Stats::disable();

Python

ctx = tiledb.Ctx()
with tiledb.DenseArray(ctx, array_name, mode='r') as A:
    tiledb.stats_enable()
    data1 = A[1:3001, 1:12001]
    data2 = A[2000:4001, 1:12001]
    tiledb.stats_dump()
    tiledb.stats_disable()

This yields the following report:

Reads:
  Read query submits: 2
  Tile cache hit ratio: 198 / 5001  (4.0%)
  Fixed-length tile data copy-to-read ratio: 240048000 / 230640060 bytes (104.1%)
  Var-length tile data copy-to-read ratio: 0 / 0 bytes
  Total tile data copy-to-read ratio: 240048000 / 230640060 bytes (104.1%)
  Read compression ratio: 230544000 / 230640060 bytes (1.0x)

Several things have changed, most notably now there were several hits in the tile cache out of the 5,001 tiles accessed. However, we may have expected that 1,001 tiles would hit in the cache, since the two queries overlapped on rows 2000–3000 (inclusive). The reason we do not see this in the statistics is that the default tile cache configuration does not allow many tiles to be cached. Let’s increase the tile cache size to 100MB and repeat the experiment:

C++

Config config;
config["sm.tile_cache_size"] = 100 * 1024 * 1024;  // 100 MB
Context ctx(config);
std::vector<uint32_t> subarray1 = {1, 3000, 1, 12000},
                      subarray2 = {2000, 4000, 1, 12000};
Array array(ctx, array_name, TILEDB_READ);
std::vector<int32_t> values1(array.max_buffer_elements(subarray1)["a"].second),
                     values2(array.max_buffer_elements(subarray2)["a"].second);
Query query1(ctx, array), query2(ctx, array);
query1.set_subarray(subarray1).set_buffer("a", values1);
query2.set_subarray(subarray2).set_buffer("a", values2);

Stats::enable();
query1.submit();
query2.submit();
Stats::dump(stdout);
Stats::disable();

Python

ctx = tiledb.Ctx({'sm.tile_cache_size': 100 * 1024 * 1024})
with tiledb.DenseArray(ctx, array_name, mode='r') as A:
    tiledb.stats_enable()
    data1 = A[1:3001, 1:12001]
    data2 = A[2000:4001, 1:12001]
    tiledb.stats_dump()
    tiledb.stats_disable()

The stats summary now reads:

Reads:
  Read query submits: 2
  Tile cache hit ratio: 697 / 5001  (13.9%)
  Fixed-length tile data copy-to-read ratio: 240048000 / 206678080 bytes (116.1%)
  Var-length tile data copy-to-read ratio: 0 / 0 bytes
  Total tile data copy-to-read ratio: 240048000 / 206678080 bytes (116.1%)
  Read compression ratio: 206592000 / 206678080 bytes (1.0x)

We now have more hits in the cache. Also notice that the copy-to-read ratio now exceeds 100%, because although the same number of bytes were copied into the query buffers, many of those bytes did not have to be read from disk twice (as they were hits in the cache).
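These figures are consistent with that explanation. As a rough check (plain Python arithmetic; each row-tile holds 12,000 int32 values, i.e. 48,000 bytes):

tile_bytes = 12000 * 4          # one row-tile of int32 values: 48,000 bytes
tiles_requested = 3000 + 2001   # query 1 touches 3,000 row-tiles, query 2 touches 2,001
cache_hits = 697                # from the report above
copied = tiles_requested * tile_bytes               # 240,048,000 bytes copied to the buffers
read = (tiles_requested - cache_hits) * tile_bytes  # 206,592,000 bytes of tile data read
print(f"{100 * copied / read:.1f}%")  # -> "116.2%"; the summary shows 116.1% because the bytes
                                      # read also include a small per-tile filter pipeline overhead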

Write example

The write statistics summary is less detailed than the read summary. We will look at writing the 12,000 by 12,000 array above with synthetic attribute data, using the schema where each row is a single tile:

C++

Context ctx;
Array array(ctx, array_name, TILEDB_WRITE);
Query query(ctx, array);
std::vector<int32_t> values(12000 * 12000);
for (unsigned i = 0; i < values.size(); i++) {
  values[i] = i;
}
query.set_layout(TILEDB_ROW_MAJOR).set_buffer("a", values);
Stats::enable();
query.submit();
Stats::dump(stdout);
Stats::disable();

Python

ctx = tiledb.Ctx()
# Open the array and write to it.
with tiledb.DenseArray(ctx, array_name, mode='w') as A:
    data = np.arange(12000 * 12000, dtype=np.int32).reshape(12000, 12000)
    tiledb.stats_enable()
    A[:] = data
    tiledb.stats_dump()
    tiledb.stats_disable()

With attribute a uncompressed as before, this gives the following report in the summary:

Writes:
  Write query submits: 1
  Tiles written: 12000
  Write compression ratio: 576384160 / 576284005 bytes (1.0x)

As expected, because each row was a single tile, writing 12,000 rows causes 12,000 tiles to be written. Because a is uncompressed, the compression ratio is almost exactly 1.0x (the small difference is due to the array and fragment metadata, which are compressed independently of the attribute data).

If we enable compression on the a attribute when creating the array schema, e.g. bzip2 at its default compression level, we see the change in the report:

Writes:
  Write query submits: 1
  Tiles written: 12000
  Write compression ratio: 576384160 / 53136720 bytes (10.8x)

Because our synthetic array data is very predictable, bzip2 does a good job compressing it, and this is reflected in the reported compression ratio.
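For reference, enabling bzip2 on a only requires changing the attribute definition at array creation time. A minimal sketch in Python, reusing ctx, dom, and array_name from the schema code earlier and assuming the (name, level) compressor tuple syntax of the older tiledb-py API used throughout this page (-1 selects the default level):

# NOTE: the compressor=(name, level) keyword is an assumption matching the older tiledb-py
# API used in these examples; newer versions configure compression via tiledb.FilterList.
attr = tiledb.Attr(ctx, name="a", dtype=np.int32, compressor=('bzip2', -1))
schema = tiledb.ArraySchema(ctx, domain=dom, sparse=False, attrs=[attr])
tiledb.DenseArray.create(array_name, schema)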

Full statistics report

In general, the summary report is often enough to reveal the major sources of performance problems. In addition, every stats dump includes a list of all of the individual internal performance counters that TileDB tracks. Each counter name is prefixed with the subsystem it measures, e.g. vfs_* counters measure the TileDB VFS (virtual filesystem) layer, compressor_* counters measure the various compressors, and so on. Some of these counters are self-explanatory, while others are intended primarily for TileDB developers to diagnose internal performance issues.

Example full statistics report

===================================== TileDB Statistics Report =======================================

Individual function statistics:
  Function name                                                          # calls       Total time (ns)
  ----------------------------------------------------------------------------------------------------
  compressor_bzip_compress,                                                12000,         63560889145
  compressor_bzip_decompress,                                                  0,                   0
  compressor_dd_compress,                                                      0,                   0
  compressor_dd_decompress,                                                    0,                   0
  compressor_gzip_compress,                                                    6,             2988746
  compressor_gzip_decompress,                                                  0,                   0
  compressor_lz4_compress,                                                     0,                   0
  compressor_lz4_decompress,                                                   0,                   0
  compressor_rle_compress,                                                     0,                   0
  compressor_rle_decompress,                                                   0,                   0
  compressor_zstd_compress,                                                    0,                   0
  compressor_zstd_decompress,                                                  0,                   0
  encryption_encrypt_aes256gcm,                                                0,                   0
  encryption_decrypt_aes256gcm,                                                0,                   0
  filter_pipeline_run_forward,                                             12001,         63850960757
  filter_pipeline_run_reverse,                                                 0,                   0
  cache_lru_evict,                                                             0,                   0
  cache_lru_insert,                                                            0,                   0
  cache_lru_invalidate,                                                        0,                   0
  cache_lru_read,                                                              0,                   0
  cache_lru_read_partial,                                                      0,                   0
  reader_compute_cell_ranges,                                                  0,                   0
  reader_compute_dense_cell_ranges,                                            0,                   0
  reader_compute_dense_overlapping_tiles_and_cell_ranges,                      0,                   0
  reader_compute_overlapping_coords,                                           0,                   0
  reader_compute_overlapping_tiles,                                            0,                   0
  reader_compute_tile_coords,                                                  0,                   0
  reader_copy_fixed_cells,                                                     0,                   0
  reader_copy_var_cells,                                                       0,                   0
  reader_dedup_coords,                                                         0,                   0
  reader_dense_read,                                                           0,                   0
  reader_fill_coords,                                                          0,                   0
  reader_filter_tiles,                                                         0,                   0
  reader_init_tile_fragment_dense_cell_range_iters,                            0,                   0
  reader_next_subarray_partition,                                              0,                   0
  reader_read,                                                                 0,                   0
  reader_read_all_tiles,                                                       0,                   0
  reader_sort_coords,                                                          0,                   0
  reader_sparse_read,                                                          0,                   0
  writer_check_coord_dups,                                                     0,                   0
  writer_check_coord_dups_global,                                              0,                   0
  writer_check_global_order,                                                   0,                   0
  writer_compute_coord_dups,                                                   0,                   0
  writer_compute_coord_dups_global,                                            0,                   0
  writer_compute_coords_metadata,                                              0,                   0
  writer_compute_write_cell_ranges,                                        12000,            44097834
  writer_create_fragment,                                                      1,              621921
  writer_filter_tiles,                                                         1,         63885761123
  writer_global_write,                                                         0,                   0
  writer_init_global_write_state,                                              0,                   0
  writer_init_tile_dense_cell_range_iters,                                     1,            14082371
  writer_ordered_write,                                                        1,         66025258154
  writer_prepare_full_tiles_fixed,                                             0,                   0
  writer_prepare_full_tiles_var,                                               0,                   0
  writer_prepare_tiles_fixed,                                                  0,                   0
  writer_prepare_tiles_ordered,                                                1,           403377491
  writer_prepare_tiles_var,                                                    0,                   0
  writer_sort_coords,                                                          0,                   0
  writer_unordered_write,                                                      0,                   0
  writer_write,                                                                1,         66025267985
  writer_write_all_tiles,                                                      1,          1565860616
  sm_array_close_for_reads,                                                    0,                   0
  sm_array_close_for_writes,                                                   0,                   0
  sm_array_open_for_reads,                                                     0,                   0
  sm_array_open_for_writes,                                                    0,                   0
  sm_array_reopen,                                                             0,                   0
  sm_read_from_cache,                                                          0,                   0
  sm_write_to_cache,                                                           0,                   0
  sm_query_submit,                                                             1,         66025270927
  tileio_is_generic_tile,                                                      0,                   0
  tileio_read_generic,                                                         0,                   0
  tileio_write_generic,                                                        1,             1671328
  vfs_abs_path,                                                                4,              201980
  vfs_close_file,                                                              2,              104927
  vfs_constructor,                                                             0,                   0
  vfs_create_bucket,                                                           0,                   0
  vfs_create_dir,                                                              1,               94723
  vfs_create_file,                                                             0,                   0
  vfs_dir_size,                                                                0,                   0
  vfs_empty_bucket,                                                            0,                   0
  vfs_file_size,                                                               0,                   0
  vfs_filelock_lock,                                                           0,                   0
  vfs_filelock_unlock,                                                         0,                   0
  vfs_init,                                                                    0,                   0
  vfs_is_bucket,                                                               0,                   0
  vfs_is_dir,                                                                  2,               48002
  vfs_is_empty_bucket,                                                         0,                   0
  vfs_is_file,                                                                 0,                   0
  vfs_ls,                                                                      0,                   0
  vfs_move_file,                                                               0,                   0
  vfs_move_dir,                                                                0,                   0
  vfs_open_file,                                                               0,                   0
  vfs_read,                                                                    0,                   0
  vfs_read_all,                                                                0,                   0
  vfs_remove_bucket,                                                           0,                   0
  vfs_remove_file,                                                             0,                   0
  vfs_remove_dir,                                                              0,                   0
  vfs_supports_fs,                                                             0,                   0
  vfs_sync,                                                                    0,                   0
  vfs_write,                                                               12002,          1553079894
  vfs_s3_fill_file_buffer,                                                     0,                   0
  vfs_s3_write_multipart,                                                      0,                   0

Individual counter statistics:
  Counter name                                                             Value
  ------------------------------------------------------------------------------
  cache_lru_inserts,                                                           0
  cache_lru_read_hits,                                                         0
  cache_lru_read_misses,                                                       0
  reader_attr_tile_cache_hits,                                                 0
  reader_num_attr_tiles_touched,                                               0
  reader_num_bytes_after_filtering,                                            0
  reader_num_fixed_cell_bytes_copied,                                          0
  reader_num_fixed_cell_bytes_read,                                            0
  reader_num_tile_bytes_read,                                                  0
  reader_num_var_cell_bytes_copied,                                            0
  reader_num_var_cell_bytes_read,                                              0
  writer_num_attr_tiles_written,                                           12000
  writer_num_bytes_before_filtering,                                   576000000
  writer_num_bytes_written,                                             53101395
  sm_contexts_created,                                                         0
  sm_query_submit_layout_col_major,                                            0
  sm_query_submit_layout_row_major,                                            1
  sm_query_submit_layout_global_order,                                         0
  sm_query_submit_layout_unordered,                                            0
  sm_query_submit_read,                                                        0
  sm_query_submit_write,                                                       1
  tileio_read_num_bytes_read,                                                  0
  tileio_read_num_resulting_bytes,                                             0
  tileio_write_num_bytes_written,                                          35325
  tileio_write_num_input_bytes,                                           384160
  vfs_read_total_bytes,                                                        0
  vfs_write_total_bytes,                                                53136720
  vfs_read_num_parallelized,                                                   0
  vfs_read_all_total_regions,                                                  0
  vfs_posix_write_num_parallelized,                                            0
  vfs_win32_write_num_parallelized,                                            0
  vfs_s3_num_parts_written,                                                    0
  vfs_s3_write_num_parallelized,                                               0

The “function statistics” report the number of calls and the amount of time in nanoseconds spent in each instrumented function. It is important to note that the time reported for these counters is aggregated across all threads. For example, if 10 threads invoke vfs_write and each thread’s call takes 100 ns, then the reported time for vfs_write will be 1000 ns, even though the elapsed wall-clock time may have been only about 100 ns.

The “counter statistics” report the values of individual counters. The summary statistics are directly derived from these counter statistics.
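For example, the write compression ratio reported in the summary of the bzip2 write above can be reproduced from the counters in this report (plain Python arithmetic; the exact combination of counters is inferred from the numbers shown, so treat it as illustrative):

# Counters from the report above (a bzip2-compressed write of the example array).
writer_num_bytes_before_filtering = 576000000   # attribute tile data before filtering
tileio_write_num_input_bytes = 384160           # generic (metadata) tile data before filtering
vfs_write_total_bytes = 53136720                # everything actually written to disk

numerator = writer_num_bytes_before_filtering + tileio_write_num_input_bytes   # 576,384,160
print(f"{numerator / vfs_write_total_bytes:.1f}x")  # -> "10.8x", matching "Write compression ratio"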