Xarray MemoryError: with groupby workloads

@Michael_Sumner can you post a short example with comments on what you want please? I’m having trouble understanding the question.

Calling .chunk inserts a rechunking task in the task graph. In some sense, it is materialized immediately in that those chunk sizes apply to any downstream operation.
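
To make that concrete, here is a minimal sketch using a small synthetic array (the variable names and sizes are made up for illustration, and it assumes dask is installed alongside xarray). The chunk sizes reported by `.chunks` change as soon as `.chunk` is called, even though nothing has been computed yet:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for real data: 10 values along a hypothetical "time" dimension.
da = xr.DataArray(np.arange(10.0), dims="time", name="x")

# .chunk() is lazy: it only adds a (re)chunking step to the dask task graph.
chunked = da.chunk(time=5)
rechunked = chunked.chunk(time=2)

# The new chunk sizes are visible right away, before anything is computed.
print(chunked.chunks)
print(rechunked.chunks)

# Downstream operations see the new chunking and stay lazy until .compute().
result = (rechunked * 2).sum()
print(result.compute().item())
```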

One complication you might run into here is that open_mfdataset’s chunks argument only applies on a per-file basis. So if you try open_mfdataset(..., chunks=TimeResampler(freq="AS")) while reading in a dataset stored as one file per day, it won’t do anything, because no single file spans more than one year. You’ll simply have to apply the chunking after open_mfdataset.
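
A rough sketch of chunking after opening, assuming a recent xarray that ships `xarray.groupers.TimeResampler` plus dask. The dataset here is synthetic (the `tas` variable is made up) and stands in for whatever open_mfdataset returns. It uses "YS", which newer pandas versions prefer over the older "AS" alias:

```python
import numpy as np
import pandas as pd
import xarray as xr
from xarray.groupers import TimeResampler

# Synthetic daily dataset standing in for the result of open_mfdataset(...).
time = pd.date_range("2000-01-01", "2002-12-31", freq="D")
ds = xr.Dataset({"tas": ("time", np.zeros(time.size))}, coords={"time": time})

# Rechunk along time so that each chunk covers one calendar year.
yearly = ds.chunk(time=TimeResampler(freq="YS"))

# One chunk per year; chunk lengths follow the number of days in each year.
print(yearly.chunks["time"])
```

Because the chunk boundaries come from the time coordinate of the combined dataset, this works no matter how the data was split across files.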

In the OP’s example, each file contains many decades of data, so you can specify this directly in open_dataset now.
