Puzzling S3 xarray.open_zarr latency

Xarray loads dimension coordinates eagerly when it opens a dataset, because it needs their values to build the indexes.
If you do the following, you will see the calls that s3fs is actually making:

import s3fs
s3fs.core.setup_logging("DEBUG")

You will see that the first call downloaded only 9 files, while the second needed hundreds of keys under “swe_run_a-geo.zarr/time”.
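To see why a chunked coordinate turns into that many requests, here is a rough sketch of the arithmetic. Zarr stores each chunk of an array as its own key, so the number of keys is the product of ceil(length / chunk size) over the dimensions. The helper name `n_chunk_keys` and the sizes below are made up for illustration; the real length and chunking of "time" are not stated in this thread.

```python
import math


def n_chunk_keys(shape, chunks):
    """Count the chunk objects (keys) a Zarr array with this shape and chunking needs."""
    return math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))


# Hypothetical sizes, chosen only to illustrate the effect.
n_time = 730

# One chunk per time step: one S3 key (and one GET) per element.
chunked = n_chunk_keys((n_time,), (1,))

# Whole coordinate in a single chunk: one key, one GET.
unchunked = n_chunk_keys((n_time,), (n_time,))

print(chunked, unchunked)
```

With numbers like these, eagerly loading the coordinate costs hundreds of round trips to S3 instead of one, which would match the pattern in the debug log.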

How did you create this data? Ideally, coordinates should not be chunked at all!
