Xarray needs special treatment for bounds variables?

Hi all, I think I have made an unfortunate, though perhaps not surprising, discovery about xarray.

Our use case is in regional climate projections, where we want to look at projected change from historical. So we make climatologies for historical and projection scenarios (using CMIP5 data in this case), and to find the absolute climatological change, take the difference.
The workflow is:

  • create the climatologies from CMIP5 input data
  • load each climatology as an xarray dataset
  • take the difference fut_clim - hist_clim
This produces an absolute change climatology as expected; however, other things fall over downstream. Eventually we worked out that this is because the lat_bnds and lon_bnds variables are all 0s. Why are they zeros? I think it is because taking the difference in xarray simply subtracts every data variable element-wise, and the bounds are identical in both datasets. For the data variables, this is what we want. However, _bnds variables are “special” in that they describe the coordinates along a dimension; they are not true data variables.
A workaround is simply to re-insert the original bounds values. However, a colleague suggested that maybe I should raise this as an xarray “bug”: the behaviour is expected, but it’s probably not what we want for this type of “meta variable”.
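
The behaviour can be reproduced on toy data. The following is only a minimal sketch: the `make_clim` helper, the `tas` variable and all values are made up for illustration and are not real CMIP5 output. It shows the bounds being zeroed by the subtraction, followed by the re-insert workaround.

```python
import numpy as np
import xarray as xr


def make_clim(offset):
    """Hypothetical stand-in for a climatology file with a bounds variable."""
    lat = np.array([-10.0, 0.0, 10.0])
    # One (lower, upper) pair per latitude cell, as in CF-style bounds.
    lat_bnds = np.stack([lat - 5.0, lat + 5.0], axis=1)
    return xr.Dataset(
        data_vars={
            "tas": (("lat",), 280.0 + offset + np.arange(3.0)),
            "lat_bnds": (("lat", "bnds"), lat_bnds),
        },
        coords={"lat": lat},
    )


hist_clim = make_clim(0.0)
fut_clim = make_clim(2.0)

# Dataset arithmetic is applied to every data variable, bounds included.
change = fut_clim - hist_clim
bounds_zeroed = bool((change["lat_bnds"] == 0).all())

# Workaround: copy the original bounds back into the result.
change["lat_bnds"] = hist_clim["lat_bnds"]
```

Here `bounds_zeroed` comes out true because both inputs carry identical bounds, so their difference is zero everywhere.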

Does anyone else have any thoughts on this?
Do you view it as a bug or a thing the user should deal with?
Is it expected behaviour to your mind?
Another option is to delete the _bnds variables, but they are required by CF convention so we don’t want to do that.

cheers
Claire


If you set the bounds variables as coordinates on the dataset (ds.set_coords(VARIABLE_NAMES)), then things should work.
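
A minimal sketch of that suggestion, assuming toy data (the `make_clim` helper, the `tas` variable and the values are hypothetical). Note that `set_coords` returns a new dataset rather than modifying in place. The final `reset_coords` step is an optional extra, not part of the suggestion above, for anyone who wants the bounds back as data variables before writing to file.

```python
import numpy as np
import xarray as xr


def make_clim(offset):
    """Hypothetical stand-in for a climatology file with a bounds variable."""
    lat = np.array([-10.0, 0.0, 10.0])
    lat_bnds = np.stack([lat - 5.0, lat + 5.0], axis=1)
    return xr.Dataset(
        data_vars={
            "tas": (("lat",), 280.0 + offset + np.arange(3.0)),
            "lat_bnds": (("lat", "bnds"), lat_bnds),
        },
        coords={"lat": lat},
    )


bounds = ["lat_bnds"]

# Promote the bounds to coordinates; arithmetic then leaves them alone.
hist_clim = make_clim(0.0).set_coords(bounds)
fut_clim = make_clim(2.0).set_coords(bounds)

change = fut_clim - hist_clim

# Optional: turn the bounds back into data variables afterwards.
change_as_vars = change.reset_coords(bounds)
```

With the bounds held as coordinates, only `tas` is subtracted, and `lat_bnds` is carried through to the result with its original values.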


Oh neat, thank you @dcherian!