Workflow suggestions with xarray/xclim and Dask cluster.adapt()

Hi all, I do some heavy processing with the xclim/xarray packages, but I've always been stumped on using Dask cluster.adapt() to scale up and down based on HPC availability. xclim lets me parallelize over 1000+ CPUs, but I very often need to process a list of climate models. If I use cluster.adapt(), it ramps up the number of workers in the cluster, but once the currently processing model completes, all the workers are released, and you end up spending most of your time going in and out of the HPC queue. At its core, I'm trying to do something highly parallel (xclim) inside a serial for loop (i.e., processing one climate model at a time). Trying to do everything fully in parallel would likely explode the task graph, so some of the processing needs to be kept serial.

I was wondering if anyone has tips on how to approach this kind of workflow while still taking advantage of cluster.adapt(). Perhaps there is a way to map over a Dask DataFrame with only one partition, so the outer loop stays serial but the workers aren't released between models. A Slurm job array is a great way to process one model at a time with highly parallel xarray tasks, but I'd like to know whether this can be done with cluster.adapt() so you can stay within a Jupyter notebook. I usually use cluster.scale() for these types of large jobs, but then you need a sense of how much wallclock time to request up front. cluster.adapt() would let you run indefinitely and cycle workers in and out based on HPC availability, which sounds appealing.
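For what it's worth, one pattern that might get at this: keep a single adaptive cluster alive for the whole session and set a non-zero `minimum` in `cluster.adapt()`, so a floor of workers survives between models while the outer loop stays a plain serial `for`. A minimal sketch below, using `LocalCluster` as a stand-in for a Slurm-backed cluster (the pattern is the same with `dask_jobqueue.SLURMCluster`); `process_model` and the model names are placeholders for the real xclim/xarray computation:

```python
from dask.distributed import Client, LocalCluster
import dask.array as da

# LocalCluster stands in for SLURMCluster here so the sketch is runnable anywhere;
# processes=False keeps everything in one process for the demo
cluster = LocalCluster(n_workers=1, threads_per_worker=1, processes=False)

# minimum > 0 keeps a floor of workers alive between models, so adaptive
# scaling doesn't send you back through the HPC queue after each one
cluster.adapt(minimum=1, maximum=4)
client = Client(cluster)

def process_model(name):
    # placeholder for the per-model xclim/xarray work
    x = da.ones((1000, 1000), chunks=(250, 250))
    return float(x.mean().compute())

results = {}
for model in ["model-a", "model-b", "model-c"]:  # serial outer loop
    results[model] = process_model(model)

client.close()
cluster.close()
```

Each `.compute()` blocks until that model is done, which enforces the serial ordering and keeps the task graph limited to one model at a time.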

Thanks

Hi @jalder,

I am not sure about your whole workflow, but from the last bit of your description it sounds like this simple wrapper around dask_jobqueue.SLURMCluster that I developed might be helpful:

GitHub - maawoo/draco: A custom wrapper around dask-jobqueue's SLURMCluster

My M.Sc.-level students and I use it on our university's HPC system, but hopefully it works as-is on your Slurm-managed HPC as well (feedback / PRs are welcome!). My motivation was to make it easier for students to work interactively with Jupyter notebooks on the HPC system. We use micromamba / Pixi to manage our Python environments and access the HPC system via VSCode / Positron and Remote-SSH. Slurm resources can then be requested in the first cell of a notebook, e.g.:

```python
from draco import start_slurm_cluster

dask_client, cluster = start_slurm_cluster()
```

By default it targets a 1.5 h walltime on the 'short' queue (3 h max walltime on our system) and uses cluster.adapt(…) to keep requesting resources.
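In case it helps to see what such a wrapper does underneath, the equivalent direct dask_jobqueue calls look roughly like the sketch below. The queue name and walltime follow the defaults described above; the core and memory numbers are illustrative placeholders, not draco's actual values, and running this requires a Slurm system:

```python
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(
    queue="short",        # partition with a 3 h max walltime in this example
    walltime="01:30:00",  # 1.5 h per worker job, well under the queue limit
    cores=8,              # illustrative; match your node/partition limits
    memory="32GB",
)

# adapt() keeps (re)submitting worker jobs as Slurm grants resources,
# so worker jobs that hit their walltime are replaced automatically
cluster.adapt(minimum=1, maximum=20)
client = Client(cluster)
```

Because each worker job has a short walltime, adaptive scaling continuously cycles workers through the queue, which is what lets a notebook session outlive any single Slurm allocation.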