`save_cog_with_dask`: Cannot convert fill_value 999999 to dtype uint8

Hi there :wave:,

I am currently trying to write COGs via Dask with `odc.geo.cog.save_cog_with_dask`.
Most of the time it works fine, but with the same code a deserialisation issue sometimes happens:

  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\numpy\ma\core.py", line 489, in _check_fill_value
    fill_value = np.asarray(fill_value, dtype=ndtype)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OverflowError: Python integer 999999 out of bounds for uint8

My code is:

cog.save_cog_with_dask(
    xds.copy(data=xds.fillna(nodata).astype(dtype)).rio.set_nodata(nodata),
    str(path),
).compute()

I triple-checked: everything is fine regarding nodata, for both the rioxarray and odc accessors and in the encoding. A 0 stored as float shouldn't be an issue either, as it is safely cast into uint8.
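For context, this is roughly how I check it (a simplified sketch, using the same `xds`, `nodata` and `dtype` as in the snippet above; in this case `nodata` is 0 and `dtype` is "uint8"):

import rioxarray  # noqa: F401  (registers the .rio accessor)
import odc.geo.xr  # noqa: F401  (registers the .odc accessor)

data = xds.copy(data=xds.fillna(nodata).astype(dtype)).rio.set_nodata(nodata)

print(data.dtype)                        # uint8
print(data.rio.nodata)                   # 0
print(data.odc.nodata)                   # 0
print(data.encoding.get("_FillValue"))   # nothing unexpected in the encoding either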

Full stack trace

2024-11-22 09:17:34,307 - distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\numpy\ma\core.py", line 489, in _check_fill_value
    fill_value = np.asarray(fill_value, dtype=ndtype)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OverflowError: Python integer 999999 out of bounds for uint8

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\protocol\core.py", line 175, in loads
    return msgpack.loads(
           ^^^^^^^^^^^^^^
  File "msgpack\_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\protocol\core.py", line 159, in _decode_default
    return merge_and_deserialize(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\protocol\serialize.py", line 525, in merge_and_deserialize
    return deserialize(header, merged_frames, deserializers=deserializers)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\protocol\serialize.py", line 452, in deserialize
    return loads(header, frames)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\protocol\serialize.py", line 66, in dask_loads
    return loads(header["sub-header"], frames)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\protocol\numpy.py", line 219, in deserialize_numpy_maskedarray
    return np.ma.masked_array(data, mask=mask, fill_value=fill_value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\numpy\ma\core.py", line 2968, in __new__
    _data._fill_value = _check_fill_value(fill_value, _data.dtype)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\numpy\ma\core.py", line 495, in _check_fill_value
    raise TypeError(err_msg % (fill_value, ndtype)) from e
TypeError: Cannot convert fill_value 999999 to dtype uint8
2024-11-22 09:17:34,312 - distributed.worker - ERROR - Cannot convert fill_value 999999 to dtype uint8
Traceback (most recent call last):
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\numpy\ma\core.py", line 489, in _check_fill_value
    fill_value = np.asarray(fill_value, dtype=ndtype)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OverflowError: Python integer 999999 out of bounds for uint8

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\worker.py", line 2075, in gather_dep
    response = await get_data_from_worker(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\worker.py", line 2881, in get_data_from_worker
    response = await send_recv(
               ^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\core.py", line 1018, in send_recv
    response = await comm.read(deserializers=deserializers)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\comm\tcp.py", line 247, in read
    msg = await from_frames(
          ^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
    res = _from_frames()
          ^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
    return protocol.loads(
           ^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\protocol\core.py", line 175, in loads
    return msgpack.loads(
           ^^^^^^^^^^^^^^
  File "msgpack\_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\protocol\core.py", line 159, in _decode_default
    return merge_and_deserialize(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\contextlib.py", line 81, in inner
    return func(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\protocol\serialize.py", line 525, in merge_and_deserialize
    return deserialize(header, merged_frames, deserializers=deserializers)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\protocol\serialize.py", line 452, in deserialize
    return loads(header, frames)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\protocol\serialize.py", line 66, in dask_loads
    return loads(header["sub-header"], frames)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\protocol\numpy.py", line 219, in deserialize_numpy_maskedarray
    return np.ma.masked_array(data, mask=mask, fill_value=fill_value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\numpy\ma\core.py", line 2968, in __new__
    _data._fill_value = _check_fill_value(fill_value, _data.dtype)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\numpy\ma\core.py", line 495, in _check_fill_value
    raise TypeError(err_msg % (fill_value, ndtype)) from e
TypeError: Cannot convert fill_value 999999 to dtype uint8
2024-11-22 09:17:34,367 - distributed.core - ERROR - Exception while handling op get_data
Traceback (most recent call last):
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\core.py", line 834, in _handle_comm
    result = await result
             ^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\core.py", line 981, in wrapper
    return await func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\worker.py", line 1798, in get_data
    assert response == "OK", response
           ^^^^^^^^^^^^^^^^
AssertionError: {'op': 'get_data', 'keys': {('from_sequence-9bfcd20b0840146be3c99f402b3807dc', 83)}, 'who': 'tcp://127.0.0.1:57260', 'reply': True}
2024-11-22 09:17:34,372 - distributed.worker - ERROR - {'op': 'get_data', 'keys': {('from_sequence-9bfcd20b0840146be3c99f402b3807dc', 83)}, 'who': 'tcp://127.0.0.1:57260', 'reply': True}
Traceback (most recent call last):
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\worker.py", line 2075, in gather_dep
    response = await get_data_from_worker(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\worker.py", line 2881, in get_data_from_worker
    response = await send_recv(
               ^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\core.py", line 1043, in send_recv
    raise exc.with_traceback(tb)
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\core.py", line 834, in _handle_comm
    result = await result
             ^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\core.py", line 981, in wrapper
    return await func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\distributed\worker.py", line 1798, in get_data
    assert response == "OK", response
           ^^^^^^^^^^^^^^^^
AssertionError: {'op': 'get_data', 'keys': {('from_sequence-9bfcd20b0840146be3c99f402b3807dc', 83)}, 'who': 'tcp://127.0.0.1:57260', 'reply': True}

Has somebody ever encountered something like that?

Sorry, it's difficult to provide a minimal working example, as the rasters I'm using are very heavy (and commercial data).

Conda list

conda list

# packages in environment at C:\Users\rbraun\Anaconda3\envs\eoprocesses:
#
# Name                    Version          Build    Channel

affine 2.4.0 pypi_0 pypi
art 6.2 pypi_0 pypi
asciitree 0.3.3 pypi_0 pypi
astroid 3.2.4 pypi_0 pypi
attrs 23.2.0 pypi_0 pypi
azure-core 1.30.2 pypi_0 pypi
azure-storage-blob 12.21.0 pypi_0 pypi
backports-tarfile 1.2.0 pypi_0 pypi
beautifulsoup4 4.12.3 pypi_0 pypi
black 24.8.0 pypi_0 pypi
bokeh 3.5.1 pypi_0 pypi
boto3 1.34.149 pypi_0 pypi
botocore 1.34.149 pypi_0 pypi
bzip2 1.0.8 h2466b09_7 conda-forge
ca-certificates 2024.8.30 h56e8100_0 conda-forge
cachetools 5.4.0 pypi_0 pypi
certifi 2024.7.4 pypi_0 pypi
cffi 1.16.0 pypi_0 pypi
cfgv 3.4.0 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
click 8.1.7 pypi_0 pypi
click-plugins 1.1.1 pypi_0 pypi
cligj 0.7.2 pypi_0 pypi
cloudpathlib 0.18.1 pypi_0 pypi
cloudpickle 3.0.0 pypi_0 pypi
color-operations 0.1.6 pypi_0 pypi
colorama 0.4.6 pypi_0 pypi
colorlog 6.8.2 pypi_0 pypi
configobj 5.0.8 pypi_0 pypi
contourpy 1.2.1 pypi_0 pypi
coverage 7.6.1 pypi_0 pypi
cryptography 43.0.0 pypi_0 pypi
cycler 0.12.1 pypi_0 pypi
daal 2024.5.0 pypi_0 pypi
daal4py 2024.5.0 pypi_0 pypi
dask 2024.11.2 pypi_0 pypi
dask-expr 1.1.19 pypi_0 pypi
dicttoxml 1.7.16 pypi_0 pypi
dill 0.3.8 pypi_0 pypi
distlib 0.3.8 pypi_0 pypi
distributed 2024.11.2 pypi_0 pypi
docutils 0.21.2 pypi_0 pypi
donfig 0.8.1.post1 pypi_0 pypi
earthengine-api 0.1.413 pypi_0 pypi
ee-extra 0.0.15 pypi_0 pypi
eemont 0.3.6 pypi_0 pypi
eoreader 0.21.7 pypi_0 pypi
eosets 0.2.5 pypi_0 pypi
fasteners 0.19 pypi_0 pypi
filelock 3.15.4 pypi_0 pypi
flake8 7.1.1 pypi_0 pypi
fonttools 4.53.1 pypi_0 pypi
fsspec 2024.6.1 pypi_0 pypi
geographiclib 2.0 pypi_0 pypi
geopandas 1.0.1 pypi_0 pypi
geopy 2.4.1 pypi_0 pypi
google-api-core 2.19.1 pypi_0 pypi
google-api-python-client 2.138.0 pypi_0 pypi
google-auth 2.32.0 pypi_0 pypi
google-auth-httplib2 0.2.0 pypi_0 pypi
google-cloud-core 2.4.1 pypi_0 pypi
google-cloud-storage 2.18.0 pypi_0 pypi
google-crc32c 1.5.0 pypi_0 pypi
google-resumable-media 2.7.1 pypi_0 pypi
googleapis-common-protos 1.63.2 pypi_0 pypi
h5netcdf 1.3.0 pypi_0 pypi
h5py 3.11.0 pypi_0 pypi
httplib2 0.22.0 pypi_0 pypi
identify 2.6.0 pypi_0 pypi
idna 3.7 pypi_0 pypi
imageio 2.34.2 pypi_0 pypi
importlib-metadata 8.2.0 pypi_0 pypi
iniconfig 2.0.0 pypi_0 pypi
isodate 0.6.1 pypi_0 pypi
isort 5.13.2 pypi_0 pypi
jaraco-classes 3.4.0 pypi_0 pypi
jaraco-context 5.3.0 pypi_0 pypi
jaraco-functools 4.0.1 pypi_0 pypi
jinja2 3.1.4 pypi_0 pypi
jmespath 1.0.1 pypi_0 pypi
joblib 1.4.2 pypi_0 pypi
jsonschema 4.23.0 pypi_0 pypi
jsonschema-specifications 2023.12.1 pypi_0 pypi
keyring 25.2.1 pypi_0 pypi
kiwisolver 1.4.5 pypi_0 pypi
lazy-loader 0.4 pypi_0 pypi
libexpat 2.6.4 he0c23c2_0 conda-forge
libffi 3.4.2 h8ffe710_5 conda-forge
libsqlite 3.47.0 h2466b09_1 conda-forge
libzlib 1.3.1 h2466b09_1 conda-forge
locket 1.0.0 pypi_0 pypi
lxml 5.3.0 pypi_0 pypi
lz4 4.3.3 pypi_0 pypi
markdown-it-py 3.0.0 pypi_0 pypi
markupsafe 2.1.5 pypi_0 pypi
matplotlib 3.9.2 pypi_0 pypi
mccabe 0.7.0 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
methodtools 0.4.7 pypi_0 pypi
more-itertools 10.3.0 pypi_0 pypi
msgpack 1.0.8 pypi_0 pypi
mypy-extensions 1.0.0 pypi_0 pypi
networkx 3.3 pypi_0 pypi
nh3 0.2.18 pypi_0 pypi
nodeenv 1.9.1 pypi_0 pypi
numcodecs 0.13.0 pypi_0 pypi
numpy 2.0.1 pypi_0 pypi
odc-geo 0.4.8 pypi_0 pypi
openssl 3.4.0 h2466b09_0 conda-forge
packaging 24.1 pypi_0 pypi
pandas 2.2.2 pypi_0 pypi
partd 1.4.2 pypi_0 pypi
pathspec 0.12.1 pypi_0 pypi
pillow 10.4.0 pypi_0 pypi
pip 24.3.1 pyh8b19718_0 conda-forge
pkginfo 1.10.0 pypi_0 pypi
platformdirs 4.2.2 pypi_0 pypi
pluggy 1.5.0 pypi_0 pypi
pre-commit 3.8.0 pypi_0 pypi
proto-plus 1.24.0 pypi_0 pypi
protobuf 5.27.2 pypi_0 pypi
psutil 6.0.0 pypi_0 pypi
pyarrow 17.0.0 pypi_0 pypi
pyarrow-hotfix 0.6 pypi_0 pypi
pyasn1 0.6.0 pypi_0 pypi
pyasn1-modules 0.4.0 pypi_0 pypi
pycm 4.0 pypi_0 pypi
pycodestyle 2.12.0 pypi_0 pypi
pycparser 2.22 pypi_0 pypi
pyflakes 3.2.0 pypi_0 pypi
pygments 2.18.0 pypi_0 pypi
pykdtree 1.3.12 pypi_0 pypi
pylint 3.2.6 pypi_0 pypi
pyogrio 0.9.0 pypi_0 pypi
pyparsing 3.1.2 pypi_0 pypi
pyproj 3.6.1 pypi_0 pypi
pyresample 1.29.0 pypi_0 pypi
pystac 1.10.1 pypi_0 pypi
pytest 8.3.2 pypi_0 pypi
pytest-cov 5.0.0 pypi_0 pypi
python 3.11.10 hce54a09_3_cpython conda-forge
python-box 7.2.0 pypi_0 pypi
python-dateutil 2.9.0.post0 pypi_0 pypi
pytz 2024.1 pypi_0 pypi
pywin32-ctypes 0.2.2 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
rasterio 1.3.10 pypi_0 pypi
readme-renderer 44.0 pypi_0 pypi
referencing 0.35.1 pypi_0 pypi
requests 2.32.3 pypi_0 pypi
requests-toolbelt 1.0.0 pypi_0 pypi
rfc3986 2.0.0 pypi_0 pypi
rich 13.7.1 pypi_0 pypi
rioxarray 0.17.0 pypi_0 pypi
rpds-py 0.19.1 pypi_0 pypi
rsa 4.9 pypi_0 pypi
rtree 1.3.0 pypi_0 pypi
s3transfer 0.10.2 pypi_0 pypi
scikit-image 0.24.0 pypi_0 pypi
scikit-learn 1.5.1 pypi_0 pypi
scikit-learn-intelex 2024.5.0 pypi_0 pypi
scipy 1.14.0 pypi_0 pypi
seaborn 0.13.2 pypi_0 pypi
sertit 1.43.1.dev6 pypi_0 pypi
sertit-models 0.4.1 pypi_0 pypi
setuptools 71.0.4 pyhd8ed1ab_0 conda-forge
shapely 2.0.5 pypi_0 pypi
six 1.16.0 pypi_0 pypi
snuggs 1.4.7 pypi_0 pypi
sortedcontainers 2.4.0 pypi_0 pypi
soupsieve 2.5 pypi_0 pypi
spyndex 0.6.0 pypi_0 pypi
tbb 2021.13.0 pypi_0 pypi
tblib 3.0.0 pypi_0 pypi
tempenv 2.0.0 pypi_0 pypi
threadpoolctl 3.5.0 pypi_0 pypi
tifffile 2024.7.24 pypi_0 pypi
tk 8.6.13 h5226925_1 conda-forge
tomlkit 0.13.0 pypi_0 pypi
toolz 0.12.1 pypi_0 pypi
tornado 6.4.1 pypi_0 pypi
tqdm 4.66.4 pypi_0 pypi
twine 5.1.1 pypi_0 pypi
typing-extensions 4.12.2 pypi_0 pypi
tzdata 2024.1 pypi_0 pypi
ucrt 10.0.22621.0 h57928b3_0 conda-forge
uritemplate 4.1.1 pypi_0 pypi
urllib3 2.2.2 pypi_0 pypi
validators 0.33.0 pypi_0 pypi
vc 14.3 h8a93ad2_20 conda-forge
vc14_runtime 14.40.33810 ha82c5b3_20 conda-forge
virtualenv 20.26.3 pypi_0 pypi
vs2015_runtime 14.40.33810 h3bf8584_20 conda-forge
wheel 0.43.0 pyhd8ed1ab_1 conda-forge
wirerope 0.4.7 pypi_0 pypi
xarray 2024.7.0 pypi_0 pypi
xgboost 2.1.0 pypi_0 pypi
xyzservices 2024.6.0 pypi_0 pypi
xz 5.2.6 h8d14728_0 conda-forge
zarr 2.18.2 pypi_0 pypi
zict 3.0.0 pypi_0 pypi
zipp 3.19.2 pypi_0 pypi

NB: It’s a repost from `save_cog_with_dask`: Cannot convert fill_value 999999 to dtype uint8 - Dask Forum
I don't know whether it's a bug, so I'd rather start the discussion here before creating an issue there.

It appears that you are creating an array of type uint8, which can only hold integer values in the range [0, 255], but trying to insert the value 999999, which is far outside that range. You either need to choose a value within that range, or use a dtype whose range includes your large value.
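For instance (a tiny illustration of the range limit; the OverflowError shown is what recent NumPy 2.x raises, matching your traceback):

import numpy as np

print(np.iinfo(np.uint8).min, np.iinfo(np.uint8).max)  # 0 255
np.asarray(999999, dtype=np.uint8)  # OverflowError: Python integer 999999 out of bounds for uint8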

This is what I don't understand: where does this value 999999 come from?
You can see in the screenshot that all the nodata values are set to 0.

To help us help you, please share the output of

data = xds.copy(data=xds.fillna(nodata).astype(dtype)).rio.set_nodata(nodata)
print(data.info())

Hi @rabernat !

I have a DataArray, not a Dataset, so the info() function doesn’t seem to exist :thinking:

Here is the print of the DataArray:

<xarray.DataArray 'COG of EMSR773_AOI17_GRA_CONSOLIDATION_AERIAL_20241111_1300_ORTHO_cog' (
                                                                                           band: 3,
                                                                                           y: 41909,
                                                                                           x: 41362)> Size: 5GB
dask.array<astype, shape=(3, 41909, 41362), dtype=uint8, chunksize=(1, 2048, 2048), chunktype=numpy.ndarray>
Coordinates:
  * band         (band) int32 12B 1 2 3
  * x            (x) float64 331kB -5.481e+04 -5.481e+04 ... -4.24e+04 -4.24e+04
  * y            (y) float64 335kB 4.754e+06 4.754e+06 ... 4.741e+06 4.741e+06
    spatial_ref  int32 4B 0
    quantile     float64 8B 0.02
Attributes:
    long_name:  ['RED', 'GREEN', 'BLUE']

This is what I don't understand: where does this value 999999 come from?

This is most likely coming from NumPy's defaults for masked arrays; see numpy.ma.default_fill_value (NumPy v2.1 Manual).
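A quick way to see this (a small sketch; the masked_array call mirrors what distributed's deserialize_numpy_maskedarray does in your traceback):

import numpy as np

# NumPy's default fill value for integer dtypes (including uint8) is 999999
print(np.ma.default_fill_value(np.dtype("uint8")))  # 999999

# distributed rebuilds masked chunks on the receiving worker roughly like this,
# passing that default fill value back in explicitly -- which numpy >= 2 rejects
np.ma.masked_array(np.zeros((2, 2), dtype="uint8"), mask=False, fill_value=999999)
# TypeError: Cannot convert fill_value 999999 to dtype uint8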

Based on the traceback mentioning get_data and deserialize, I think the actual error probably happens while reading the data from the original file rather than during the writing process. What happens if you call .load() early in the workflow? Do you get the same error?

You mean do a load, recreate a Dask array from the loaded array, and re-save it with save_cog_with_dask?

My suggestion was a debugging approach rather than a final solution.

If there's no existing shared solution, no one on these forums has encountered the exact same issue (thank you for asking!), and the data cannot be shared, I think you'll probably need to do some digging to find out what is causing the problem. I was suggesting a mechanism for narrowing down where in the workflow it happens, by seeing whether you get similar errors with:

# Test just opening the dataset
xds.load()
# Test filling missing data
xds.fillna(nodata).load()
...

Of course, you may have already tried all this and only hit the issue when passing the output of all the piped operations to save_cog_with_dask, so apologies if my suggestion was unhelpful.

What I can say is that it works like a charm when I save this without Dask, using rioxarray (after a compute) with to_raster(..., driver="COG", windowed=True), roughly as sketched below.
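For reference, the working non-Dask path looks roughly like this (a simplified sketch of my call, using the same xds / nodata / dtype / path as above):

data = xds.copy(data=xds.fillna(nodata).astype(dtype)).rio.set_nodata(nodata)
data.compute().rio.to_raster(str(path), driver="COG", windowed=True)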
It also seems to work with numpy < 2.

However, it fails with numpy >= 2.
I also tested with numpy 2.1.0, but the error is slightly different (same topic though):
TypeError: Cannot cast scalar from dtype('int64') to dtype('uint8') according to the rule 'same_kind'

Traceback

  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\sertit\rasters.py", line 258, in wrapper
    raise ex
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\sertit\rasters.py", line 254, in wrapper
    out = function(any_raster_type, *args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\sertit\rasters.py", line 1181, in write
    ).compute()
      ^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\dask\base.py", line 372, in compute
    (result,) = compute(self, traverse=False, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\dask\base.py", line 660, in compute
    results = schedule(dsk, keys, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\dask\array\reductions.py", line 477, in chunk_max
    return np.max(x, axis=axis, keepdims=keepdims)
           ^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\numpy\_core\fromnumeric.py", line 3199, in max
    return _wrapreduction(a, np.maximum, 'max', axis, None, out,
           ^^^^^^^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\numpy\_core\fromnumeric.py", line 84, in _wrapreduction
    return reduction(axis=axis, out=out, **passkwargs)
           ^^^^^^^^^^^
  File "C:\Users\rbraun\Anaconda3\envs\eoprocesses\Lib\site-packages\numpy\ma\core.py", line 6091, in max
    np.copyto(result, result.fill_value, where=newmask)
    ^^^^^^^
TypeError: Cannot cast scalar from dtype('int64') to dtype('uint8') according to the rule 'same_kind'

My guess is that it doesn't fail with numpy < 2 because this now-illegal casting was still allowed before, but isn't anymore.
It therefore looks more like a bug, so I'll go over to odc-geo and create an issue.
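For illustration, here is the kind of behaviour change I suspect (a sketch; I'm assuming the culprit is NumPy 2's stricter handling of out-of-range Python integers, together with NEP 50 dropping value-based casting for scalars, which would explain the 'same_kind' error):

import numpy as np

print(np.__version__)

# 999999 is NumPy's default fill value for integer masked arrays.
# numpy 1.x: the out-of-range Python int is wrapped into uint8 (-> 63),
#            at most with a DeprecationWarning on late 1.x releases
# numpy 2.x: the same conversion raises OverflowError
np.uint8(999999)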

Thanks for the help :pray:

I created the issue at odc-geo and worked out a minimal working example after all (you can find it there, to avoid creating multiple parallel threads about this):
'save_cog_with_dask' fails with numpy > 2 · Issue #189 · opendatacube/odc-geo · GitHub
