Object Management
In this tutorial you will learn how to organize your TileDB arrays and key-value stores hierarchically into TileDB groups. We refer to a TileDB array, key-value store, or group as a TileDB object. We also discuss some auxiliary TileDB object management functions.
Warning
This TileDB feature is experimental. The functionality covered here works as described, but the APIs might change in future versions.
TileDB groups
TileDB allows you to hierarchically organize your arrays and key-value stores
in groups. A group is practically a directory with a special (empty) TileDB
file __tiledb_group.tdb. This offers an intuitive and familiar way
to store your various TileDB objects in persistent storage. You can create
a group simply as follows:
C++
tiledb::Context ctx;
tiledb::create_group(ctx, "my_group");
Python
tiledb.group_create("my_group")
Listing the my_group directory, you get the following:
$ ls -l my_group
total 0
-rwx------ 1 stavros staff 0 Jul 3 10:08 __tiledb_group.tdb
Note that you can hierarchically organize TileDB groups similar to your filesystem directories, i.e., groups can be arbitrarily nested in other groups.
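To make the on-disk layout concrete, the sketch below reproduces it using only the Python standard library: each group is a directory containing an empty __tiledb_group.tdb file, and groups nest like ordinary directories. The helper make_group_layout is hypothetical and exists only to illustrate the layout described above; it is not part of TileDB, and real groups should be created through the TileDB API.

```python
import os
import tempfile

# Marker file name as shown in the directory listing above.
GROUP_MARKER = "__tiledb_group.tdb"


def make_group_layout(path):
    """Hypothetical helper: mimic a group's layout (directory + empty marker file)."""
    os.makedirs(path, exist_ok=True)
    # Open in append mode and close immediately, leaving an empty file.
    open(os.path.join(path, GROUP_MARKER), "a").close()


def demo():
    """Build a nested layout in a temporary directory and inspect the top level."""
    with tempfile.TemporaryDirectory() as root:
        top = os.path.join(root, "my_group")
        make_group_layout(top)
        # Nested groups are simply subdirectories with their own marker file.
        make_group_layout(os.path.join(top, "dense_arrays"))
        make_group_layout(os.path.join(top, "sparse_arrays"))
        entries = sorted(os.listdir(top))
        marker_size = os.path.getsize(os.path.join(top, GROUP_MARKER))
        return entries, marker_size


print(demo())
```

The temporary directory is deleted when demo returns, so nothing is left on disk.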
Getting the object type
TileDB also allows you to check the object type as follows. If path
does not exist or is not a TileDB array, key-value store or group, its
type is reported as “invalid”.
C++
tiledb::Context ctx;
auto obj_type = tiledb::Object::object(ctx, path).type();
Python
obj_type = tiledb.object_type(path)
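For intuition only, a similar classification can be approximated on a local filesystem by looking for the marker files that appear in this tutorial's directory listings (__tiledb_group.tdb for groups, __array_schema.tdb for arrays). The sketch below is an assumption-laden illustration using the standard library, not TileDB's actual implementation: guess_object_type is a hypothetical name, it ignores key-value stores (their marker file is not shown in this tutorial), and it returns None where TileDB would report “invalid”.

```python
import os
import tempfile

# Marker file names taken from the directory listings in this tutorial.
GROUP_MARKER = "__tiledb_group.tdb"
ARRAY_MARKER = "__array_schema.tdb"


def guess_object_type(path):
    """Hypothetical helper: return "group", "array", or None for a local path.

    Only marker files are inspected; key-value stores are not recognized.
    """
    if not os.path.isdir(path):
        return None  # missing path or plain file
    if os.path.isfile(os.path.join(path, GROUP_MARKER)):
        return "group"
    if os.path.isfile(os.path.join(path, ARRAY_MARKER)):
        return "array"
    return None  # ordinary directory


def demo():
    """Classify a group-like directory, a plain directory, and a missing path."""
    with tempfile.TemporaryDirectory() as root:
        group = os.path.join(root, "my_group")
        plain = os.path.join(root, "plain_dir")
        os.makedirs(group)
        os.makedirs(plain)
        open(os.path.join(group, GROUP_MARKER), "w").close()
        return (
            guess_object_type(group),
            guess_object_type(plain),
            guess_object_type(os.path.join(root, "missing")),
        )


print(demo())
```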
Listing the object hierarchy
TileDB offers various ways to list the contents of a group, even recursively in pre-order or post-order traversal, optionally passing a special callback function that will be invoked for every visited object. This is demonstrated in the code snippet below:
C++
tiledb::Context ctx;

// List children
std::cout << "\nListing hierarchy: \n";
tiledb::ObjectIter obj_iter(ctx, path);
for (const auto& object : obj_iter)
  print_path(object.uri(), object.type());

// Walk in a path with a pre- and post-order traversal
std::cout << "\nPreorder traversal: \n";
obj_iter.set_recursive(); // Default order is preorder
for (const auto& object : obj_iter)
  print_path(object.uri(), object.type());

std::cout << "\nPostorder traversal: \n";
obj_iter.set_recursive(TILEDB_POSTORDER);
for (const auto& object : obj_iter)
  print_path(object.uri(), object.type());
where the print_path callback takes as input a string path and an object
type argument. This is how we defined it in our code example:
void print_path(const std::string& path, tiledb::Object::Type type) {
  // Simply print the path and type
  std::cout << path << " ";
  switch (type) {
    case tiledb::Object::Type::Array:
      std::cout << "ARRAY";
      break;
    case tiledb::Object::Type::KeyValue:
      std::cout << "KEY_VALUE";
      break;
    case tiledb::Object::Type::Group:
      std::cout << "GROUP";
      break;
    default:
      std::cout << "INVALID";
  }
  std::cout << "\n";
}
Python
# List children
print("\nListing hierarchy:")
tiledb.ls(path, lambda obj_path, obj_type: print(obj_path, obj_type))
# Walk in a path with a pre- and post-order traversal
print("\nPreorder traversal:")
tiledb.walk(path, lambda obj_path, obj_type: print(obj_path, obj_type)) # Default order is preorder
print("\nPostorder traversal:")
tiledb.walk(path, lambda obj_path, obj_type: print(obj_path, obj_type), "postorder")
In the object code example, we initially create the following hierarchy:
my_group/
├── dense_arrays
│   ├── array_A
│   ├── array_B
│   └── kv
└── sparse_arrays
    ├── array_C
    └── array_D
The code snippet we provided above would print out the following for this
hierarchy (where <cwd> is the full path of your current working
directory):
C++
Listing hierarchy:
file://<cwd>/my_group/dense_arrays GROUP
file://<cwd>/my_group/sparse_arrays GROUP
Preorder traversal:
file://<cwd>/my_group/dense_arrays GROUP
file://<cwd>/my_group/dense_arrays/array_A ARRAY
file://<cwd>/my_group/dense_arrays/array_B ARRAY
file://<cwd>/my_group/dense_arrays/kv KEY_VALUE
file://<cwd>/my_group/sparse_arrays GROUP
file://<cwd>/my_group/sparse_arrays/array_C ARRAY
file://<cwd>/my_group/sparse_arrays/array_D ARRAY
Postorder traversal:
file://<cwd>/my_group/dense_arrays/array_A ARRAY
file://<cwd>/my_group/dense_arrays/array_B ARRAY
file://<cwd>/my_group/dense_arrays/kv KEY_VALUE
file://<cwd>/my_group/dense_arrays GROUP
file://<cwd>/my_group/sparse_arrays/array_C ARRAY
file://<cwd>/my_group/sparse_arrays/array_D ARRAY
file://<cwd>/my_group/sparse_arrays GROUP
Python
Listing hierarchy:
file://<cwd>/my_group/dense_arrays group
file://<cwd>/my_group/sparse_arrays group
Preorder traversal:
file://<cwd>/my_group/dense_arrays group
file://<cwd>/my_group/dense_arrays/array_A array
file://<cwd>/my_group/dense_arrays/array_B array
file://<cwd>/my_group/dense_arrays/kv kv
file://<cwd>/my_group/sparse_arrays group
file://<cwd>/my_group/sparse_arrays/array_C array
file://<cwd>/my_group/sparse_arrays/array_D array
Postorder traversal:
file://<cwd>/my_group/dense_arrays/array_A array
file://<cwd>/my_group/dense_arrays/array_B array
file://<cwd>/my_group/dense_arrays/kv kv
file://<cwd>/my_group/dense_arrays group
file://<cwd>/my_group/sparse_arrays/array_C array
file://<cwd>/my_group/sparse_arrays/array_D array
file://<cwd>/my_group/sparse_arrays group
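The difference between the two recursive orders in the output above is only when a parent is reported relative to its children: before them in pre-order, after them in post-order. The following standard-library sketch reproduces both orderings on an in-memory copy of the example hierarchy. The HIERARCHY dictionary and the walk generator are hypothetical stand-ins written for this illustration (they are unrelated to tiledb.walk), and visiting siblings in sorted name order is an assumption made to match the listings.

```python
# Hypothetical in-memory copy of the example hierarchy: name -> (type, children).
HIERARCHY = {
    "dense_arrays": ("group", {
        "array_A": ("array", {}),
        "array_B": ("array", {}),
        "kv": ("kv", {}),
    }),
    "sparse_arrays": ("group", {
        "array_C": ("array", {}),
        "array_D": ("array", {}),
    }),
}


def walk(children, prefix, order="preorder"):
    """Yield (path, type) for every object below prefix in the requested order."""
    if order not in ("preorder", "postorder"):
        raise ValueError("order must be 'preorder' or 'postorder'")
    for name in sorted(children):  # assumption: siblings visited in name order
        obj_type, sub = children[name]
        path = prefix + "/" + name
        if order == "preorder":
            yield path, obj_type  # parent reported before its children
        yield from walk(sub, path, order)
        if order == "postorder":
            yield path, obj_type  # parent reported after its children


preorder = [path for path, _ in walk(HIERARCHY, "my_group")]
postorder = [path for path, _ in walk(HIERARCHY, "my_group", "postorder")]

print("Preorder traversal:")
for path, obj_type in walk(HIERARCHY, "my_group"):
    print(path, obj_type)
print("Postorder traversal:")
for path, obj_type in walk(HIERARCHY, "my_group", "postorder"):
    print(path, obj_type)
```

Paths are printed relative to my_group rather than as file:// URIs, since the sketch never touches the filesystem.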
Move/Remove objects
TileDB offers functions for renaming and removing TileDB objects. Note that these functions are “safe”, in the sense that they will not have any effect on “invalid” (i.e., non-TileDB) objects.
You can rename TileDB objects as follows:
C++
tiledb::Object::move(ctx, "my_group", "my_group_2");
Python
tiledb.move("my_group", "my_group_2")
Note
Moving TileDB objects across different storage backends (e.g., from S3 to local storage, or vice-versa) is currently not supported. However, it will be added in a future version.
You can remove TileDB objects as follows:
C++
tiledb::Object::remove(ctx, "my_group_2/dense_arrays");
Python
tiledb.remove("my_group_2/dense_arrays")
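The “safe” behavior described above can be pictured as a guard that runs before any destructive filesystem operation. The sketch below illustrates that idea with the standard library only. It is not TileDB's implementation: safe_move, safe_remove and looks_like_tiledb_object are hypothetical names, the check relies on the marker files visible in this tutorial's listings (so key-value stores are not covered), returning False for a rejected path is a choice made for this sketch, and refusing to overwrite an existing target is an extra precaution that the text above does not mention.

```python
import os
import shutil
import tempfile

# Marker files seen in this tutorial's listings (groups and arrays only).
MARKERS = ("__tiledb_group.tdb", "__array_schema.tdb")


def looks_like_tiledb_object(path):
    """Hypothetical check: a directory containing one of the known marker files."""
    return os.path.isdir(path) and any(
        os.path.isfile(os.path.join(path, marker)) for marker in MARKERS
    )


def safe_remove(path):
    """Delete path only if it looks like a TileDB object; return True if deleted."""
    if not looks_like_tiledb_object(path):
        return False  # leave non-TileDB paths untouched
    shutil.rmtree(path)
    return True


def safe_move(old, new):
    """Rename old to new only if old looks like a TileDB object."""
    # Added precaution (not from the tutorial): never overwrite an existing target.
    if not looks_like_tiledb_object(old) or os.path.exists(new):
        return False
    os.rename(old, new)
    return True


def demo():
    """Move a group-like directory, then try removing a plain and a group directory."""
    with tempfile.TemporaryDirectory() as root:
        group = os.path.join(root, "my_group")
        notes = os.path.join(root, "notes")  # ordinary directory, not a TileDB object
        os.makedirs(group)
        os.makedirs(notes)
        open(os.path.join(group, "__tiledb_group.tdb"), "w").close()

        moved = safe_move(group, os.path.join(root, "my_group_2"))
        removed_plain = safe_remove(notes)
        plain_still_there = os.path.isdir(notes)
        removed_group = safe_remove(os.path.join(root, "my_group_2"))
        return moved, removed_plain, plain_still_there, removed_group


print(demo())
```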
Running the object code example, we get the output shown below. Observe the
listings after my_group was renamed to my_group_2 and after
my_group_2/dense_arrays and my_group_2/sparse_arrays/array_C were removed.
C++
$ g++ -std=c++11 object.cc -o object_cpp -ltiledb
$ ./object_cpp
Listing hierarchy:
file://<cwd>/my_group/dense_arrays GROUP
file://<cwd>/my_group/sparse_arrays GROUP
Preorder traversal:
file://<cwd>/my_group/dense_arrays GROUP
file://<cwd>/my_group/dense_arrays/array_A ARRAY
file://<cwd>/my_group/dense_arrays/array_B ARRAY
file://<cwd>/my_group/dense_arrays/kv KEY_VALUE
file://<cwd>/my_group/sparse_arrays GROUP
file://<cwd>/my_group/sparse_arrays/array_C ARRAY
file://<cwd>/my_group/sparse_arrays/array_D ARRAY
Postorder traversal:
file://<cwd>/my_group/dense_arrays/array_A ARRAY
file://<cwd>/my_group/dense_arrays/array_B ARRAY
file://<cwd>/my_group/dense_arrays/kv KEY_VALUE
file://<cwd>/my_group/dense_arrays GROUP
file://<cwd>/my_group/sparse_arrays/array_C ARRAY
file://<cwd>/my_group/sparse_arrays/array_D ARRAY
file://<cwd>/my_group/sparse_arrays GROUP
Listing hierarchy:
file://<cwd>/my_group_2/sparse_arrays GROUP
Preorder traversal:
file://<cwd>/my_group_2/sparse_arrays GROUP
file://<cwd>/my_group_2/sparse_arrays/array_D ARRAY
Postorder traversal:
file://<cwd>/my_group_2/sparse_arrays/array_D ARRAY
file://<cwd>/my_group_2/sparse_arrays GROUP
Python
$ python object.py
Listing hierarchy:
file://<cwd>/my_group/dense_arrays group
file://<cwd>/my_group/sparse_arrays group
Preorder traversal:
file://<cwd>/my_group/dense_arrays group
file://<cwd>/my_group/dense_arrays/array_A array
file://<cwd>/my_group/dense_arrays/array_B array
file://<cwd>/my_group/dense_arrays/kv kv
file://<cwd>/my_group/sparse_arrays group
file://<cwd>/my_group/sparse_arrays/array_C array
file://<cwd>/my_group/sparse_arrays/array_D array
Postorder traversal:
file://<cwd>/my_group/dense_arrays/array_A array
file://<cwd>/my_group/dense_arrays/array_B array
file://<cwd>/my_group/dense_arrays/kv kv
file://<cwd>/my_group/dense_arrays group
file://<cwd>/my_group/sparse_arrays/array_C array
file://<cwd>/my_group/sparse_arrays/array_D array
file://<cwd>/my_group/sparse_arrays group
Listing hierarchy:
file://<cwd>/my_group_2/sparse_arrays group
Preorder traversal:
file://<cwd>/my_group_2/sparse_arrays group
file://<cwd>/my_group_2/sparse_arrays/array_D array
Postorder traversal:
file://<cwd>/my_group_2/sparse_arrays/array_D array
file://<cwd>/my_group_2/sparse_arrays group
$ ls -l my_group_2/
total 0
-rwx------ 1 stavros staff 0 Jul 3 11:18 __tiledb_group.tdb
drwx------ 4 stavros staff 136 Jul 3 11:18 sparse_arrays
$ ls -l my_group_2/sparse_arrays/
total 0
-rwx------ 1 stavros staff 0 Jul 3 11:18 __tiledb_group.tdb
drwx------ 4 stavros staff 136 Jul 3 11:18 array_D
$ ls -l my_group_2/sparse_arrays/array_D/
total 8
-rwx------ 1 stavros staff 115 Jul 3 11:18 __array_schema.tdb
-rwx------ 1 stavros staff 0 Jul 3 11:18 __lock.tdb