Compute time series for 70,000 locations (Speed up the processing)

I’ve had good experiences with Dask’s P2P shuffling for reducing RAM usage during large rechunks. It might be worth trying by adjusting the config:

```python
import dask

dask.config.set({
    # use peer-to-peer (P2P) rechunking instead of the default task-based method
    "array.rechunk.method": "p2p",
    # disable task fusion, which has interfered with P2P rechunking in the past
    "optimization.fuse.active": False,
})
```
I’m not sure whether `optimization.fuse.active` still needs to be `False`. It fixed an issue for me a while ago and was recommended in a GitHub issue, but it may no longer be necessary.
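
In case it helps, here is a minimal sketch of how the setting could be applied to your use case. The array shape and chunk sizes are made up for illustration; the idea is to rechunk from time-sliced chunks (all locations per chunk) to location-sliced chunks (full time series per chunk), which is the access pattern you want for extracting per-location series. Note that P2P rechunking requires the distributed scheduler:

```python
import dask
import dask.array as da
from dask.distributed import Client

client = Client()  # P2P only works with the distributed scheduler

with dask.config.set({"array.rechunk.method": "p2p"}):
    # hypothetical (time, location) array: time-sliced chunks spanning all 70,000 locations
    arr = da.random.random((40_000, 70_000), chunks=(1_000, 70_000))
    # rechunk so each chunk holds the full time series for a subset of locations
    ts = arr.rechunk((40_000, 500))
    # e.g. pull the complete series for one location
    series = ts[:, 0].compute()
```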

Here is a blog post and here is a thread in the Pangeo Discourse about it.