(in response to @kirill.kzb)
The only way to handle this data in a generic case (anything but axis aligned imagery)
That’s interesting. I think xarray should not try to “handle” the generic case but simply read what is there, if it’s a geotransform read those six numbers (or represent as four explicitly in the simplest case), ready for materializing regular 1D coord arrays, if it’s gcps/rpcs/geolocation arrays read those too, but as-is. They are for the warper api (or esmf or sim) to use - it seems like you’re suggesting xarray should build-in geospatial-resolving workflows, but I would say that is definitely out of scope. With the geolocation numbers or arrays just represent them and hand them along when the time comes.
Realizing a potential gotcha in what I’m say: (yes, the geotransform case is always automatically handled now, with labelled coord arrays, but that’s the same divide in GDAL itself between Translate and Warp, it’s always geoferenced/axis-aligned in the first instance, but a dataset must stream through the warper for all other geolocation methods - Translate will simply write the array as is un-georeferenced and copy the geolocation arrays to the target (somewhat format-dependent), and the warper will always resolve to an axis-aligned target - in longlat by default - or to a target specified by the user, from extent/dim/resolution/crs - with missing elements from that spec inferred by the suggested-warp heuristics).
As part of this I think xarray should really have a look at some unnecessary extra layers between it and the GDAL library (not a downstream package) and see what consolidation is desirable.
(That’s part of what I’ll be exploring as input to this effort).