Example which highlights the limitations of NetCDF-style coordinates for large geospatial rasters

kirill.kzb · April 3, 2024, 11:47pm

This issue comes up every now and then:

github.com/opendatacube/odc-geo

Reprojection produces a GeoBox that does not match when destination CRS is EPSG 4326

opened 12:45AM - 03 Feb 24 UTC

closed 11:44PM - 05 Feb 24 UTC

fbunt

documentation

I ran into this issue while reprojecting some data, but I think the underlying cause is that `GeoBox` objects created by different means for the same grid location will have affine transforms that are slightly different. It may be specific to EPSG 4326 since I haven't run into this issue with other projections. When reprojecting to EPSG 4326, the geobox for the result does not match what I input. A short example:

```py
import numpy as np
import xarray as xr
import rioxarray as riox
import odc.geo.xarray  # registers the .odc accessor used below

src = xr.DataArray(
    np.ones((1, 10, 10)),
    dims=("band", "y", "x"),
    coords=([1], np.arange(10), np.arange(10)[::-1]),
).rio.write_crs("EPSG:3310")

dst_gb = src.odc.geobox.to_crs(4326)
dst = src.odc.reproject(dst_gb)

print("GeoBox comparison:", dst.odc.geobox == dst_gb)
print("Shape comparison:", dst.odc.geobox.shape == dst_gb.shape)
print("CRS comparison:", dst.odc.geobox.crs == dst_gb.crs)
print("Affine comparison:", dst.odc.geobox.affine == dst_gb.affine)
print()
print("dst_gb affine")
print(repr(dst_gb.affine))
print("result affine")
print(repr(dst.odc.geobox.affine))
print()
print(
    "Affine comparison with tolerance:",
    np.allclose(list(dst_gb.affine), list(dst.odc.geobox.affine)),
)
```

```
GeoBox comparison: False
Shape comparison: True
CRS comparison: True
Affine comparison: False

dst_gb affine
Affine(1.0200436314046532e-05, 0.0, -120.00002388788778,
       0.0, -1.0200436314046532e-05, 38.01647531889047)
result affine
Affine(1.0200435637076365e-05, 0.0, -120.00001592863761,
       0.0, -1.0200435637131023e-05, 38.01647279736898)

Affine comparison with tolerance: True
```

I'm using version 0.4.2.

It’s important to understand that the GeoBox is recomputed from coordinates. That is needed to support slicing into geo-referenced data, and also to support data constructed by other libraries. BUT there is no guarantee that this recomputed GeoBox will produce exactly the same coordinates when it is used to create a new array. Essentially, GeoBox -> coords -> GeoBox is not guaranteed to be lossless. It can only be lossless when both the resolution and translation components of the Affine matrix are basically integers. Sentinel-2 has that property, for example: scale=+/-10 and tx,ty=10*N, where N is an integer.
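To make the round trip concrete, here is a minimal one-axis sketch in plain Python. It is not how odc-geo recovers a GeoBox; the helper names and the "first and last pixel centre" recovery rule are assumptions chosen for illustration. The fractional resolution and translation are taken from the issue above, and the integer grid mimics the Sentinel-2 case.

```python
def coords_from_affine(res, t0, n):
    """Pixel-centre coordinates for n pixels: t0 + (i + 0.5) * res."""
    return [t0 + (i + 0.5) * res for i in range(n)]


def affine_from_coords(xs):
    """Recover (res, t0) using nothing but the coordinate values."""
    res = (xs[-1] - xs[0]) / (len(xs) - 1)
    t0 = xs[0] - res / 2
    return res, t0


def roundtrip(res, t0, n):
    """affine -> coords -> affine for a single axis."""
    return affine_from_coords(coords_from_affine(res, t0, n))


# Sentinel-2-like axis: integer resolution, translation a multiple of 10.
utm = roundtrip(10.0, 600000.0, 10)

# EPSG:4326 axis with the x resolution and translation from the issue.
res, t0 = 1.0200436314046532e-05, -120.00002388788778
geo = roundtrip(res, t0, 10)

print("integer grid exact:   ", utm == (10.0, 600000.0))
print("fractional grid exact:", geo == (res, t0))
print("fractional res error: ", geo[0] - res)
```

Every intermediate value on the integer grid is exactly representable as a float, so that round trip is exact. On the fractional grid the coordinates near -120 carry only about 1e-14 of absolute precision, so the recovered resolution is only approximately the original.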

Not sure what a proper solution for this should be. We can keep track of the original resolution in an attribute of the coordinate, and use exactly that value when extracting the GeoBox from coords (with a check to deal with xx[::10, ::10] type of slicing). I guess we can also keep track of the original translation, and only recompute that if the array has been sliced. I’ll probably implement that for the next version of odc-geo, actually.
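A rough sketch of the "remember the resolution" idea, again for one axis in plain Python. The `resolution` attribute name, the `resolution_from_coords` helper and the tolerance are all hypothetical, not odc-geo API. The recorded value is trusted only when the resolution estimated from the coordinates is a near-integer multiple of it, which covers xx[::k] style slicing.

```python
def coords_from_affine(res, t0, n):
    """Pixel-centre coordinates for n pixels: t0 + (i + 0.5) * res."""
    return [t0 + (i + 0.5) * res for i in range(n)]


def resolution_from_coords(xs, attrs, tol=1e-3):
    """Prefer the recorded resolution over one recomputed from coordinates.

    Hypothetical helper: `attrs` stands in for the coordinate's attributes,
    and `xs` needs at least two values.
    """
    estimated = (xs[-1] - xs[0]) / (len(xs) - 1)
    recorded = attrs.get("resolution")
    if recorded is None:
        return estimated
    ratio = estimated / recorded
    step = round(ratio)
    # Accept the recorded value only if the data looks like a strided
    # (or reversed) view of the original grid.
    if step != 0 and abs(ratio - step) < tol:
        return recorded * step
    return estimated


res, t0 = 1.0200436314046532e-05, -120.00002388788778
xs = coords_from_affine(res, t0, 10)
attrs = {"resolution": res}

print(resolution_from_coords(xs, attrs) == res)           # unsliced
print(resolution_from_coords(xs[::2], attrs) == res * 2)  # every 2nd pixel
print(resolution_from_coords(xs, {}) == res)              # no attribute
```

The same check could in principle gate a recorded translation, which would only need recomputing when the first coordinate no longer matches the recorded origin.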

The problem of creating a sub-geobox that can produce exactly the same coords as the original geobox in the sliced section is much harder to address. We want this invariant:

gbox[roi].coords == gbox.coords[roi]

That’s not possible unless gbox[roi] retains its parent and slice, and essentially returns self.parent.coords[self.roi]. Or we implement some kind of “rounding to some fraction of a pixel” logic.
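The invariant and the "retain parent and slice" option can be sketched for a single axis as follows. `Grid1D` and `GridView` are hypothetical stand-ins, not odc-geo classes, and only unit-step slices are handled. The naive `crop` shifts the translation by whole pixels, so its coordinates are computed along a different floating-point path than the parent's; the view simply slices the parent's coordinates.

```python
class Grid1D:
    """Hypothetical 1-D stand-in for one GeoBox axis.

    Pixel i has its centre at t0 + (i + 0.5) * res.
    """

    def __init__(self, res, t0, n):
        self.res, self.t0, self.n = res, t0, n

    def coords(self):
        return [self.t0 + (i + 0.5) * self.res for i in range(self.n)]

    def crop(self, roi):
        """Naive sub-grid: new translation, parent forgotten (step 1 only)."""
        start, stop, _ = roi.indices(self.n)
        return Grid1D(self.res, self.t0 + start * self.res, stop - start)

    def view(self, roi):
        """Sub-grid that keeps a reference to its parent and slice."""
        return GridView(self, roi)


class GridView:
    def __init__(self, parent, roi):
        self.parent, self.roi = parent, roi

    def coords(self):
        # Same floats as the parent by construction.
        return self.parent.coords()[self.roi]


grid = Grid1D(1.0200436314046532e-05, -120.00002388788778, 10)
roi = slice(3, 8)

expected = grid.coords()[roi]
naive = grid.crop(roi).coords()
mismatches = sum(a != b for a, b in zip(naive, expected))

print("view satisfies invariant:", grid.view(roi).coords() == expected)
print("naive crop mismatches:", mismatches, "of", len(expected))
```

The view satisfies the invariant trivially, at the cost of every sub-geobox carrying its parent around. The naive crop is only guaranteed to match on grids where all the arithmetic is exact, such as the integer-valued Sentinel-2 case.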
