Extremely slow rechunking of Zarr store with xarray

@rabernat Would it be helpful to re-chunk the whole dataset after I open the Zarr store, to make access to the time series at each (x, y) pair of coordinates more efficient? Or do I have to write the newly re-chunked data to disk before accessing it to take advantage of the re-chunking? (A sketch of what I mean follows below.)
This is my first encounter with Zarr chunking, so I apologize if these are trivial questions.
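For concreteness, here is roughly what I mean by re-chunking after opening (a minimal sketch; the store path and the variable name `precip` are just placeholders):

```python
import xarray as xr

# Open the store lazily; "store.zarr" is a placeholder path
ds = xr.open_zarr("store.zarr")

# Re-chunk the Dask graph in memory (-1 means one chunk
# spanning the whole time dimension)
ds_ts = ds.chunk({"t": -1, "x": 10, "y": 10})

# Pull out one time series; Dask still has to read the original
# on-disk chunks, so does this alone actually make it faster?
series = ds_ts["precip"].isel(x=0, y=0).compute()
```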

A number of other questions that I have:

  • I am not clear on how to determine an optimal chunking for data processing when the original dataset is chunked along the t, x, and y dimensions. Should we just try different chunkings (that is what we have been doing) and see how each performs for the kind of processing we do?
  • Would it help at all to re-chunk the time dimension to its full size? In other words, if our dataset has dimensions t: 11000, x: 800, y: 800, should I re-chunk it with t: 11000, x: 10, y: 10? Or what if I re-chunk x and y to their full dimension sizes, since those are fixed for the whole dataset? (See the sketch after this list.)
  • It would also seem that increasing the chunk size in x and y would improve access times for all x and y values that belong to the same chunk.
  • When re-chunking, why would I want to keep the dataset's original chunk size? If the previous chunks were 128 MB, should the re-chunked chunks also be 128 MB? This relates to the note you made in the epic post, and I don't really understand the reason (to guarantee proximity of the new chunks, perhaps?).
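To make that second question concrete, this is the kind of re-chunk-and-write I have in mind (a minimal sketch; the store names are placeholders, and I am clearing each variable's `chunks` encoding because xarray otherwise tries to reuse the original on-disk chunking when writing):

```python
import xarray as xr

ds = xr.open_zarr("store.zarr")  # placeholder path

# One chunk across all of time, small spatial tiles.
# For float64 data a t: 11000, x: 10, y: 10 chunk is only
# 11000 * 10 * 10 * 8 bytes ≈ 8.8 MB; roughly x: 38, y: 38
# would be needed to get back near 128 MB chunks.
rechunked = ds.chunk({"t": -1, "x": 10, "y": 10})

# Drop the stale per-variable chunk encoding carried over from
# the source store, so to_zarr writes with the new chunking
for name in rechunked.variables:
    rechunked[name].encoding.pop("chunks", None)

rechunked.to_zarr("store_rechunked.zarr", mode="w")
```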

Thank you so much for any clarifications and help!
