Return a 3D object alongside 1D object in apply_ufunc

Currently, I have something similar to this, where input_lat is transformed into new_lat (here by adding 0.25, but in my real use case the new values aren’t known in advance).

Since xr.apply_ufunc doesn’t return a dataset with the actual coordinate values, I return new_lat as a second output so I can re-assign the values afterwards. However, this second output is broadcast across time, lon and lat, so I have to do ds["lat"] = new_lat.isel(lon=0, time=0).values, which I think is inefficient; I simply need it to have the single dimension lat.

Any ideas on how I can modify this to make it more efficient?

import xarray as xr
import numpy as np

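# np.interp expects ascending x-coordinates; the tutorial latitudes are stored descending
air = xr.tutorial.open_dataset("air_temperature")["air"].sortby("lat")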
input_lat = np.arange(20, 45)

def interp1d_np(data, base_lat, input_lat):
    new_lat = input_lat + 0.25
    return np.interp(input_lat, base_lat, data), new_lat

ds, new_lat = xr.apply_ufunc(
    interp1d_np,  # function applied to each 1D slice along lat
    air,  # data to interpolate
    air.lat,  # original latitudes
    input_lat,  # latitudes to interpolate onto
    input_core_dims=[["lat"], ["lat"], ["lat"]],  # one entry per argument
    output_core_dims=[["lat"], ["lat"]],  # one entry per output; each has one core dim
    exclude_dims={"lat"},  # dimensions allowed to change size; must be a set
    vectorize=True,  # loop over non-core dims
)
new_lat = new_lat.isel(lon=0, time=0).values
ds["lat"] = new_lat

Edit: updating xarray fixes the dask issue below.

~Also, I’m wondering why once I convert this to dask, it errors out with the following:~

File ~/miniconda3/envs/czi/lib/python3.10/site-packages/dask/array/gufunc.py:492, in <genexpr>(.0)
    490 for i, (ocd, oax, meta) in enumerate(zip(output_coredimss, output_axes, metas)):
    491     print(core_shapes)
--> 492     core_output_shape = tuple(core_shapes[d] for d in ocd)
    493     core_chunkinds = len(ocd) * (0,)
    494     output_shape = loop_output_shape + core_output_shape

KeyError: 'dim0'

Seems like a bug, but it might just be me using it wrong; I printed out the core_shapes:
{'dim0_0': 25, 'dim0_1': 25, 'dim0_2': 25}

import xarray as xr
import numpy as np

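# np.interp expects ascending x-coordinates; the tutorial latitudes are stored descending
air = xr.tutorial.open_dataset("air_temperature")["air"].sortby("lat").chunk(time=-1)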
input_lat = np.arange(20, 45)

def interp1d_np(data, base_lat, input_lat):
    new_lat = input_lat + 0.25
    return np.interp(input_lat, base_lat, data), new_lat

ds, new_lat = xr.apply_ufunc(
    interp1d_np,  # function applied to each 1D slice along lat
    air,  # data to interpolate (dask-backed)
    air.lat,  # original latitudes
    input_lat,  # latitudes to interpolate onto
    input_core_dims=[["lat"], ["lat"], ["lat"]],  # one entry per argument
    output_core_dims=[["lat"], ["lat"]],  # one entry per output; each has one core dim
    exclude_dims={"lat"},  # dimensions allowed to change size; must be a set
    vectorize=True,  # loop over non-core dims
    dask="parallelized",  # apply the function chunk by chunk
)
new_lat = new_lat.isel(lon=0, time=0).values
ds["lat"] = new_lat

That error seems suspicious, or at least unclear - do you want to raise this on the xarray issue tracker?

I was about to raise an issue, but while filling it in I was asked to post my versions, and I realized I was using an outdated one (likely a mismatch between dask and xarray; I tracked it down to _enumerate). So I updated, and the error went away.
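
For anyone hitting a similar error, a quick way to check for this kind of mismatch is to print the installed versions before filing anything. This is only a minimal sketch; it assumes nothing beyond xarray being importable:

```python
import xarray as xr

# Prints the installed versions of xarray and its optional
# dependencies (dask included), as requested by the issue template.
xr.show_versions()
```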

I’m still wondering about returning a 1D object, though (or, even better, getting new_lat assigned automatically).

Fairly sure this isn’t possible yet without explicitly specifying the broadcast dimensions as core dimensions on all variables. Can you open an issue please?
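
In case it helps, here is a rough sketch of what that workaround might look like; treat it as an illustration under assumptions rather than a confirmed recipe. It uses a small synthetic array in place of the tutorial dataset, a hypothetical helper interp_all, and a hypothetical output dimension name lat_out. Because time and lon are declared as core dimensions, the function receives the whole array at once and the second output comes back with only the new latitude dimension.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the tutorial data (assumption), with ascending lat.
rng = np.random.default_rng(0)
air = xr.DataArray(
    rng.normal(size=(4, 25, 3)),
    dims=("time", "lat", "lon"),
    coords={
        "time": np.arange(4),
        "lat": np.linspace(15.0, 75.0, 25),
        "lon": np.arange(3),
    },
)
input_lat = np.arange(20, 45)


def interp_all(data, base_lat, input_lat):
    # Core dims are moved to the end in the listed order: (time, lon, lat).
    new_lat = input_lat + 0.25
    out = np.apply_along_axis(
        lambda column: np.interp(input_lat, base_lat, column), -1, data
    )
    return out, new_lat  # shapes: (time, lon, lat_out) and (lat_out,)


interped, new_lat = xr.apply_ufunc(
    interp_all,
    air,
    air.lat,
    kwargs={"input_lat": input_lat},
    # Every dimension is a core dimension, so nothing is left to broadcast over.
    input_core_dims=[["time", "lon", "lat"], ["lat"]],
    # "lat_out" is a hypothetical name for the new dimension.
    output_core_dims=[["time", "lon", "lat_out"], ["lat_out"]],
)

# new_lat is already 1D, so it can be attached directly as the coordinate.
interped = interped.assign_coords(lat_out=new_lat.data).rename(lat_out="lat")
```

The trade-off is that vectorize=True no longer does the looping for you, so the function has to handle the extra dimensions itself (here via np.apply_along_axis), and I have not checked how this variant behaves with dask="parallelized".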

Thanks for the responsiveness; here’s the issue: "Return a 3D object alongside 1D object in apply_ufunc", Issue #8695 in pydata/xarray on GitHub.