Issues pip installing rechunker onto HPC conda environment

Hi Pangeo team,

I’ve recently been trying to use the rechunker package on my institution’s HPC cluster, but am running into some issues. I’ve pip installed the rechunker package onto my conda environment on the cluster, but when I try to import the package I get the following error:

In[1]: import rechunker

Out[1]: ImportError: cannot import name 'encode_zarr_attr_value' from 'xarray.backends.zarr' (/scratch/aeb783/penv/lib/python3.8/site-packages/xarray/backends/zarr.py)

Here my conda environment is located at /scratch/aeb783/penv, and I’ve omitted the full traceback for brevity (though I’m happy to share it if it would be useful).

On the other hand, when I try to pip install rechunker onto my local machine, I have no issues importing the rechunker package and it seems to be working as intended.
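When an import works on one machine and fails on another, comparing the installed versions side by side is often a useful first step. Below is a minimal sketch of such a report, meant to be run in both environments. The helper name `environment_report` and the list of packages it checks are illustrative assumptions, not part of rechunker.

```python
import sys
from importlib import metadata  # standard library in Python 3.8+


def environment_report(packages=("rechunker", "xarray", "zarr", "dask")):
    """Collect the interpreter path and the installed version of each package.

    The default package list is an assumption; adjust it as needed.
    """
    report = {
        "python": sys.version.split()[0],
        "executable": sys.executable,
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


# Print one "name: value" line per entry so the two outputs are easy to diff.
for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Running this on both the cluster and the local machine and comparing the two outputs should show any version mismatch directly.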

I don’t understand why this error is occurring on the HPC cluster. Any ideas for how I can resolve this and get rechunker going on the HPC cluster?

Thanks as always for your help!

Andrew

The problem seems related to xarray. Is xarray installed on your HPC environment? If not, can you try installing it?

Xarray should not be a hard dependency for rechunker, but perhaps something has gone wrong with rechunker packaging that needs to be fixed.

It might also be specific to the version of xarray installed. rechunker doesn’t specify a minimum required version, but perhaps it should if things like encode_zarr_attr_value aren’t present in older versions of xarray.
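To check this directly, something like the following sketch reports which xarray version is installed and whether the name from the error message can be found in `xarray.backends.zarr`. The helpers `installed_version` and `has_attr` are hypothetical names introduced here for illustration.

```python
import importlib
from importlib import metadata  # standard library in Python 3.8+


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def has_attr(module_name, attr):
    """Return True if the module imports cleanly and exposes the given name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)


print("xarray version:", installed_version("xarray"))
print(
    "encode_zarr_attr_value available:",
    has_attr("xarray.backends.zarr", "encode_zarr_attr_value"),
)
```

If the second line prints `False` while xarray is installed, the installed xarray most likely predates the function that rechunker is trying to import.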

Ah, great! Looks like the issue was the Xarray version: it was installed as v0.15.1 on my cluster, and updating it to v0.16.1 did the trick.
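For anyone who wants to guard against this in a script, here is a rough sketch of a version check. It treats 0.16.1 as the floor only because that is the version that worked here; the true minimum that rechunker needs may be different. The helpers `version_tuple` and `meets_floor` are hypothetical names, not part of xarray or rechunker.

```python
import re


def version_tuple(version):
    """Turn '0.16.1' into (0, 16, 1), stopping at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_floor(installed, floor="0.16.1"):
    """Compare versions numerically; the default floor is an assumption."""
    return version_tuple(installed) >= version_tuple(floor)


print(meets_floor("0.15.1"))
print(meets_floor("0.16.1"))
```

This simple parser ignores pre-release suffixes, so it is only meant as a quick sanity check rather than a full version comparison.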

Thanks @rabernat and @TomAugspurger! It’s exciting to be so close to doing the types of analyses I want with these tools that you all have developed! :slight_smile:

FYI, I have opened an issue in the rechunker package to keep track of this:

Thanks again for reporting.