Cloud array storage solutions

It’s been a while since the last reply, but I think this is a very interesting topic.

Things have changed on the TileDB front. The company has published TileDB-CF-Py, which adds support for xarray and NetCDF. I’ve been playing around with it for a few days, and it works well enough even though it’s still in early development.

I’m trying to build a cloud-based geodata store that interfaces with the xarray data model. The current implementation uses Open Data Cube, and I’m not very happy with its performance, architecture, support, or cost (it requires a database, and maintaining one can get very expensive, especially for large, PB-scale datasets). However, its API is good and supports several querying methods, including passing geometries. I’m trying to replace it with either a TileDB instance plus a custom Python API, or something built on Zarr. I’m leaning towards TileDB because Zarr relies on fsspec for S3 access, and I’ve found fsspec’s performance on S3 to be abysmal. You can read my other thread here.
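To make the "passing geometries" part concrete, here is a minimal sketch of the first step a custom API on top of either TileDB or Zarr would likely need: turning a geometry's bounding box into index slices on the array. It assumes a regular, north-up grid, and every name in it (`bbox_to_slices`, the parameters) is hypothetical. It is not taken from Open Data Cube, TileDB, or Zarr.

```python
import math


def bbox_to_slices(bounds, origin_x, origin_y, res_x, res_y, n_rows, n_cols):
    """Map a bounding box to (row_slice, col_slice) on a north-up grid.

    bounds: (min_x, min_y, max_x, max_y) in the grid's coordinate system.
    origin_x, origin_y: outer top-left corner of the grid (assumption).
    res_x, res_y: positive pixel sizes along x and y.
    Returns None when the box does not overlap the grid.
    """
    min_x, min_y, max_x, max_y = bounds

    # Columns grow with x; rows grow as y decreases (north-up).
    col_start = math.floor((min_x - origin_x) / res_x)
    col_stop = math.ceil((max_x - origin_x) / res_x)
    row_start = math.floor((origin_y - max_y) / res_y)
    row_stop = math.ceil((origin_y - min_y) / res_y)

    # Clamp to the array extent so the slices are always valid indices.
    col_start, col_stop = max(col_start, 0), min(col_stop, n_cols)
    row_start, row_stop = max(row_start, 0), min(row_stop, n_rows)

    if col_start >= col_stop or row_start >= row_stop:
        return None
    return slice(row_start, row_stop), slice(col_start, col_stop)


if __name__ == "__main__":
    # 10 x 10 grid of 10-unit pixels with its top-left corner at (0, 100).
    print(bbox_to_slices((15, 25, 45, 55), 0, 100, 10, 10, 10, 10))
```

With an xarray dataset, the resulting slices could then be passed to something like `ds.isel(y=rows, x=cols)` (the dimension names `y` and `x` are an assumption). Only the chunks or tiles overlapping that window would need to be fetched from object storage, and masking to the exact geometry would be a separate step.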

I was wondering if anybody has any feedback.