Accessing nested HDF5 file from http via kerchunk

Thanks @rabernat, your tweet was exactly what inspired me to try this out :smiley: The dataset I’m working with is in HDF5 format though, and I’m trying to tackle two things: 1) the nested HDF5 structure, which requires xarray-datatree to read, and 2) getting kerchunk to work so that we can read the nested HDF5 file using engine="zarr".
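For anyone following along, here is a rough sketch of the workflow I have in mind, not a tested recipe. It assumes kerchunk's `SingleHdf5ToZarr` can scan the file over HTTP and that xarray can then open a single nested group through the `reference://` filesystem. The helper names (`build_references`, `zarr_open_kwargs`, `open_nested_group`) are made up for illustration, and the exact keyword arguments may differ between kerchunk/zarr versions.

```python
def build_references(url: str, inline_threshold: int = 300) -> dict:
    """Scan a remote HDF5 file once and return kerchunk references for it."""
    # Imported lazily so zarr_open_kwargs() works without kerchunk installed.
    import fsspec
    from kerchunk.hdf import SingleHdf5ToZarr

    with fsspec.open(url, mode="rb") as f:
        # Chunks smaller than inline_threshold bytes are stored inside the references.
        return SingleHdf5ToZarr(f, url, inline_threshold=inline_threshold).translate()


def zarr_open_kwargs(refs, group=None, remote_protocol="https") -> dict:
    """Build keyword arguments for xarray.open_dataset from kerchunk references."""
    kwargs = {
        "filename_or_obj": "reference://",
        "engine": "zarr",
        "backend_kwargs": {
            # kerchunk references carry no consolidated metadata
            "consolidated": False,
            "storage_options": {"fo": refs, "remote_protocol": remote_protocol},
        },
    }
    if group is not None:
        # One HDF5 group at a time; reading the whole tree needs xarray-datatree.
        kwargs["group"] = group
    return kwargs


def open_nested_group(url: str, group: str):
    """Open one nested HDF5 group lazily via kerchunk + engine="zarr"."""
    import xarray as xr

    refs = build_references(url)
    return xr.open_dataset(**zarr_open_kwargs(refs, group=group))
```

Usage would be something like `open_nested_group("https://example.com/nested.h5", group="/outer/inner")`, where both the URL and the group path are placeholders rather than my actual dataset.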

Going beyond that, I’m hoping there’s a way to use kvikIO to read those HDF5 files (via kerchunk/Zarr) directly into GPU memory (via NVIDIA GPUDirect Storage), as mentioned in Favorite way to go from netCDF (&xarray) to torch/TF/Jax et al - #8 by weiji14, but that’s getting ahead of myself a little bit. This would make a nice demo for [use case demonstration] Kvikio Direct-to-gpu -> xarray -> xbatcher -> ml model · Issue #87 · xarray-contrib/xbatcher · GitHub though!
