Hi all, I just came across a curious case where xarray is raising this warning when plotting a compressed integer variable stored in a NetCDF4 file:
RuntimeWarning: overflow encountered in scalar absolute
vlim = max(abs(vmin - center), abs(vmax - center))
Here is the encoding as seen by xarray:
{'dtype': dtype('int16'),
'zlib': True,
'szip': False,
'zstd': False,
'bzip2': False,
'blosc': False,
'shuffle': False,
'complevel': 1,
'fletcher32': False,
'contiguous': False,
'chunksizes': (151, 79, 118),
'preferred_chunks': {'time': 151, 'lat': 79, 'lon': 118},
'original_shape': (151, 474, 944),
'missing_value': -999,
'_FillValue': -999}
xarray doesn't seem to be decoding and applying the missing value of -999 correctly; as a side note, Panoply does. Here is the ncdump output as shown in Panoply:
short CDD(time=151, lat=474, lon=944);
:missing_value = -999S; // short
:_FillValue = -999S; // short
:long_name = "Consecutive Dry Days";
:units = "days";
:_ChunkSizes = 151U, 79U, 118U; // uint
xarray is loading the variable as an int64, but since it isn't catching the missing_value correctly, the missing grid cells (i.e. ocean and lake bodies in these data) are loading as -9223372036854775808, which messes up spatial averaging because they aren't represented as NaN. My other compressed variables, which have a scale_factor and add_offset, are correctly decoded into floats with NaN missing values.
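For what it's worth, I think that fill value also explains the warning at the top: -9223372036854775808 is the minimum int64, which has no positive counterpart, so taking its absolute value (as the plotting code does for vmin/vmax) wraps around. A quick NumPy check:

```python
import warnings
import numpy as np

fill = np.int64(np.iinfo(np.int64).min)  # -9223372036854775808
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # suppress the overflow RuntimeWarning
    result = np.abs(fill)
print(result)  # still -9223372036854775808: abs() overflowed and wrapped
```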
Is the missing_value = -999S as a short forcing xarray to load the variable as an integer rather than a float with NaNs?
This is publicly released data, so I can't change the source NetCDF files, but maybe I can write an xarray preprocess function to correct the typing.
Thanks for any suggestions.