Chunk size for reading/writing netCDFs


Is there any best practice for choosing chunk sizes when:

  1. Using xr.open_mfdataset to open 20 years of daily netCDF files (7305 files in total, yielding a dataset with dimensions time: 7305, x: 1224, y: 1090). The dataset has two variables.
  2. Doing some resampling over the time dimension, e.g. monthly means
  3. Writing the entire dataset from step 1 back to disk as one netCDF file (instead of 7305 individual files)
  4. Writing the result from step 2 to disk as one netCDF file
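For reference, here is a minimal sketch of steps 1–4 with xarray. It uses a small synthetic in-memory dataset in place of the real files, and the chunk numbers are illustrative assumptions, not a recommendation:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Small synthetic stand-in for the real 20-year daily dataset
# (real dims per the question: time=7305, y=1090, x=1224; scaled down here).
times = pd.date_range("2000-01-01", periods=60, freq="D")
data = np.random.rand(60, 10, 12).astype("float32")
ds = xr.Dataset(
    {
        "var1": (("time", "y", "x"), data),
        "var2": (("time", "y", "x"), data.copy()),
    },
    coords={"time": times},
)

# Step 1 (with real files this would be something like):
#   ds = xr.open_mfdataset("*.nc", chunks={"time": 20}, parallel=True)
# Here we just chunk the synthetic dataset along time.
ds = ds.chunk({"time": 30})

# Step 2: monthly means over time
monthly = ds.resample(time="MS").mean()

# Steps 3 and 4: write each result as a single netCDF file
ds.to_netcdf("daily_combined.nc")
monthly.to_netcdf("monthly_means.nc")
```

Chunking only along time (keeping x and y whole) tends to suit time resampling, since each monthly bin then touches few chunks.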

The data is of dtype float32. All of this is run on a simple standalone PC with 16 GB of RAM. If I don't specify the chunk size, it takes about 23 minutes to write step 3 to disk, which I find rather time-consuming. The netCDF file size of step 3 is about 36 GB.
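As a back-of-envelope check, the sizes involved can be computed from the dimensions above. The ~100 MiB chunk target used below is a common dask rule of thumb, not something from my measurements:

```python
# Dims from above: time=7305, y=1090, x=1224, dtype float32 (4 bytes).
bytes_per_step = 1224 * 1090 * 4          # one daily slice of one variable
var_gib = 7305 * bytes_per_step / 2**30   # one full variable, uncompressed

target = 100 * 2**20                      # aim for roughly 100 MiB per chunk
days_per_chunk = target // bytes_per_step

print(f"one daily slice:          {bytes_per_step / 2**20:.1f} MiB")
print(f"one variable on disk:     {var_gib:.1f} GiB (uncompressed)")
print(f"time steps per ~100 MiB:  {days_per_chunk}")
```

So a single daily slice is only a few MiB, one variable is tens of GiB, and a time chunk of roughly 20 days lands near the 100 MiB range. This is just arithmetic to frame the question; I'd still like to know what chunk sizes work well in practice.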

Any advice on how this can best be done?