Ah no wonder, this is bad for a map-reduce groupby. Because there is one element per group (i.e. one data point in each hour) per chunk, the blockwise reduction does nothing (input = output). Then we stitch 4 chunks together (memory use is now at least 4x the chunksize) and reduce again (back to 1x chunksize), and we keep repeating these steps until the end of the tree.
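To make the failure mode concrete, here's a minimal sketch of that chunking pattern (synthetic data and illustrative names; assumes dask and flox are installed so the groupby takes the map-reduce path described above):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hourly data: a time chunk of 24 covers each hour-of-day exactly once,
# so the per-chunk (blockwise) reduction emits 24 group results per chunk,
# i.e. exactly as many elements as it received (input == output).
time = pd.date_range("2000-01-01", periods=365 * 24, freq="h")
ds = xr.Dataset(
    {"temp": ("time", np.random.rand(time.size))},
    coords={"time": time},
).chunk({"time": 24})

# With the default "map-reduce" strategy, the tree reduction then concatenates
# 4 of these unreduced chunks at a time before combining, so peak memory is
# roughly 4x the chunk size at every level of the tree.
hourly_mean = ds.groupby("time.hour").mean()
```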
You could try “split-reduce”, the standard xarray approach, which would split each chunk into 24 new chunks and run the reduction forward from there. That is probably too large an increase in the number of tasks to work well.
I would call .chunk({"time": 6}) (a 4x reduction in chunksize) and then use method="cohorts", so we get some effective reductions early on in the graph. Obviously it would be better if the Zarr dataset were chunked that way to begin with.
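Something like this (a sketch that calls flox's xarray_reduce directly so the method can be passed explicitly; names carried over from the sketch above):

```python
import flox.xarray

# Rechunk so each chunk covers only 6 of the 24 hourly groups, then let
# "cohorts" reduce each subset of groups on the chunks where it actually lives.
rechunked = ds.chunk({"time": 6})
hourly_mean = flox.xarray.xarray_reduce(
    rechunked, rechunked.time.dt.hour, func="mean", method="cohorts"
)
```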
Basically, for time grouping where the groups are periodic with period T, you want chunksize C > T and “map-reduce”, or C < T and “cohorts”. If C ~ T then it’s just bad memory-wise (can we call C/T the flocking number?).
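As a purely illustrative rule of thumb (choose_method and the 2x / 0.5x cutoffs are made up here, not a flox API):

```python
def choose_method(chunksize: float, period: float) -> str:
    """Pick a groupby strategy from the chunksize/period ratio (the "flocking number")."""
    ratio = chunksize / period
    if ratio >= 2:       # chunks span multiple full periods: the blockwise step reduces well
        return "map-reduce"
    if ratio <= 0.5:     # chunks see only a few groups each: cohorts keeps them separate
        return "cohorts"
    return "rechunk first"  # C ~ T: neither strategy saves memory
```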