Using xhistogram to bin measurements at particular stations

OK, that is much clearer. Thanks for taking the time to write it up.

You’re correct that this is not possible with xhistogram today. The reason is that, as currently implemented, xhistogram relies heavily on the fact that the bin counts from each block of data can simply be summed to get the total count for each bin. Sum is commutative and associative, so it is trivial to parallelize (and most of the code in xhistogram is about making things play well with dask).
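To make the "sum the per-block counts" point concrete, here is a minimal sketch in plain NumPy. It is not xhistogram's actual code; the helper name `blockwise_histogram` and the use of `np.array_split` to stand in for dask chunks are my own assumptions for illustration.

```python
import numpy as np


def blockwise_histogram(data, bins, n_blocks):
    """Histogram `data` block by block, then sum the per-block counts.

    Hypothetical helper for illustration only; the blocks here play the
    role that dask chunks would play in a real parallel computation.
    """
    blocks = np.array_split(data, n_blocks)
    # Each block is binned independently, so this step could run in parallel.
    per_block = [np.histogram(block, bins=bins)[0] for block in blocks]
    # Addition is commutative and associative, so the order in which the
    # per-block counts are combined does not matter.
    return np.sum(per_block, axis=0)


rng = np.random.default_rng(0)
data = rng.normal(size=1000)
bins = np.linspace(-4, 4, 17)

whole, _ = np.histogram(data, bins=bins)
combined = blockwise_histogram(data, bins, n_blocks=4)

# The blockwise result matches the histogram of the full array.
assert np.array_equal(whole, combined)
```

A reduction like a per-bin median does not decompose this way, since the median of the block medians is in general not the median of the whole, which is why it needs a different strategy.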

FWIW, the algorithm itself is here and is not that complicated to read.

This is a very long-standing open issue in xarray:

I think that the Flox package by @dcherian supports it. Let’s see what Deepak has to say about this.
