Processing xarray datasets that are too large for memory, and writing to netCDF

An alternative approach, if you know the total number of timesteps in advance, would be to initialize the whole Zarr store at the beginning with a lazy “template” dataset, i.e.

ds_template.to_zarr(tempdir, compute=False)
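
For concreteness, here is a minimal sketch of what such a template might look like. The variable name, dimensions, chunk sizes, and the store path are placeholders; adjust them to your data.

import pandas as pd
import xarray as xr
import dask.array as dsa

ntimesteps, ny, nx = 100, 180, 360   # placeholder sizes
tempdir = 'output.zarr'              # placeholder path to the Zarr store

# dask arrays define shape/dtype/chunks without holding any data,
# so the template costs essentially no memory; chunking time into
# single steps lines up with the one-timestep region writes below
ds_template = xr.Dataset(
    {'var': (('time', 'y', 'x'),
             dsa.zeros((ntimesteps, ny, nx), chunks=(1, ny, nx)))},
    coords={'time': pd.date_range('2000-01-01', periods=ntimesteps)},
)

# writes only the metadata and the eagerly loaded coordinates;
# none of the dask-backed data is computed here
ds_template.to_zarr(tempdir, compute=False)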

and then use the region argument to write, i.e. something like

for j in range(ntimesteps):
    dd.to_zarr(tempdir, region={'time': slice(j, j+1)})
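
Spelled out a bit more, the loop might look like the following. make_timestep is just a stand-in for however each timestep is actually produced in your workflow; the key detail is that variables and coordinates that don't span the region dimension have to be dropped before a region write.

for j in range(ntimesteps):
    # placeholder: produce one timestep as a Dataset with a length-1 time dim
    dd = make_timestep(j)

    # variables/coords without the 'time' dimension must be dropped for a
    # region write (to_zarr raises an error otherwise); they were already
    # written once by the template
    dd = dd.drop_vars([v for v in dd.variables if 'time' not in dd[v].dims])

    dd.to_zarr(tempdir, region={'time': slice(j, j + 1)})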

This might be faster because it avoids re-opening the Zarr store with Xarray on every iteration, which is what I think is happening with append.