What's Next - Software - Geospatial Bridge

This is part of a follow-on conversation from the “What’s next for Pangeo” discussion that took place 2023-12-06. It is part of the overarching software topic.

I don’t know this topic well, so I’m mostly copying in things that others have said. I hope that they can carry the conversation.

From original document

There is still no standard way to associate geospatial coordinate information (CRS + coordinates) with Xarray data. The bridge between the geospatial raster world (e.g. GeoTIFF, rasterio) and Xarray is fragile.
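To make one part of that fragility concrete: raster formats such as GeoTIFF typically describe the grid with an affine geotransform anchored at the outer corner of the upper-left pixel, while Xarray usually carries explicit 1D coordinate arrays, conventionally at cell centers. The sketch below is only an illustration in plain Python and is not taken from rasterio, rioxarray, or GeoXarray; the function name `cell_center_coords` is hypothetical, and it assumes the GDAL ordering of the six geotransform coefficients.

```python
from typing import List, Tuple

# GDAL-style geotransform (assumed ordering):
# (x of upper-left corner, pixel width, row rotation,
#  y of upper-left corner, column rotation, pixel height)
GeoTransform = Tuple[float, float, float, float, float, float]


def cell_center_coords(
    gt: GeoTransform, width: int, height: int
) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: turn an unrotated geotransform into
    1D x/y cell-center coordinates, as an Xarray-style dataset would store them."""
    x0, dx, rot_x, y0, rot_y, dy = gt
    if rot_x != 0 or rot_y != 0:
        # A rotated or sheared grid cannot be described by two 1D axes.
        raise ValueError("rotated grids need 2D coordinates")
    # The geotransform refers to pixel corners, so shift by half a pixel.
    xs = [x0 + (col + 0.5) * dx for col in range(width)]
    ys = [y0 + (row + 0.5) * dy for row in range(height)]
    return xs, ys


# North-up grid: 3 columns by 2 rows of 10-unit pixels, origin at (100, 200).
xs, ys = cell_center_coords((100.0, 10.0, 0.0, 200.0, 0.0, -10.0), width=3, height=2)
print(xs)  # [105.0, 115.0, 125.0]
print(ys)  # [195.0, 185.0]
```

Note that the CRS is not represented here at all. The half-pixel shift and the loss of the compact affine description are exactly the kind of detail that gets handled differently from one tool to the next.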

GeoXarray was supposed to resolve these problems
https://github.com/geoxarray/geoxarray
Is it being developed at all? Why not?

From @sharkinsspatial

I’d love to see a consolidated roadmap in xarray that describes the path forward for CRS flexible indexing and CRS serialization, and how these can be propagated to storage formats. Right now it feels like there are several independent efforts spread across several issues/repos, which makes it difficult to grok the path forward.

From @Michael_Sumner

There could be a grand synthesis of model data structures and “raster geospatial” ones. Currently the xarray-everything model misses some key ideas, I think, and Zarr doesn’t appeal to me as a way to store data (no overviews). I think the model-NetCDF structure versus the GeoTIFF-like one needs a serious rethink.

  • For CRS, make sure to look at what Arrow is doing for vector data; much of this is already in GDAL but scattered in community understanding and visibility.

Use WKT, and find out the pain points in standard software (GDAL and PROJ comprise the benchmark, not any of the downstream Python bindings).