I think they can be unified. From what I can tell from reading these threads and the GitHub repository, GeoZarr is not set in stone, but I think we can figure it out. This is a good thread on some of the challenges.
Showing some prototype implementations that can be iterated on and discussed seems like the current next step. Two pieces stand out:
1. Lazily reading many raster images into the same Xarray object without materializing coordinates. The Xarray feature to enable this has been drafted here: Flexible coordinate transform by benbovy · Pull Request #9543 · pydata/xarray · GitHub
2. Round-tripping the CRS data: writing the GeoZarr and reading it back into the original Xarray object. This round-trip demonstration has been discussed in a few places, and even implemented, but not in a generic way within Xarray or its extensions:
- Discussion : writing xarray-compatible zarr stores · Issue #66 · os-climate/hazard · GitHub
- My thoughts on coordinate · Issue #48 · zarr-developers/geozarr-spec · GitHub
- Roundtrip of geotiff/nc to zarr to geotiff/nc as test ground to find what info needs to be saved · Issue #50 · zarr-developers/geozarr-spec · GitHub
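To make the two items above a bit more concrete, here is a minimal, standard-library-only sketch of the idea. It is not the GeoZarr spec and not the API from the Xarray pull request: `AffineGrid`, `write_grid`, `read_grid`, and the attribute names `crs`, `transform`, and `shape` are all hypothetical names I made up for illustration. The only things it leans on are the usual six-term affine convention (`x = a*col + b*row + c`, `y = d*col + e*row + f`) and the fact that Zarr v2 keeps attributes as JSON in a `.zattrs` file. The grid values (a 64 by 64 chip at 10 m in EPSG:32632) are invented.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class AffineGrid:
    """Hypothetical lazy grid: six affine terms plus a shape, no coordinate arrays."""
    a: float  # pixel width
    b: float  # row rotation term
    c: float  # x of the upper-left corner
    d: float  # column rotation term
    e: float  # pixel height (negative for north-up rasters)
    f: float  # y of the upper-left corner
    width: int
    height: int

    def xy(self, row, col):
        """Return the world coordinates of a pixel center, computed on demand."""
        x = self.a * (col + 0.5) + self.b * (row + 0.5) + self.c
        y = self.d * (col + 0.5) + self.e * (row + 0.5) + self.f
        return x, y


def write_grid(store_dir, grid, crs):
    """Write the CRS and transform as JSON attributes (attribute names are assumptions)."""
    attrs = {
        "crs": crs,
        "transform": [grid.a, grid.b, grid.c, grid.d, grid.e, grid.f],
        "shape": [grid.height, grid.width],
    }
    (Path(store_dir) / ".zattrs").write_text(json.dumps(attrs))


def read_grid(store_dir):
    """Read the attributes back and rebuild the grid and CRS string."""
    attrs = json.loads((Path(store_dir) / ".zattrs").read_text())
    height, width = attrs["shape"]
    grid = AffineGrid(*attrs["transform"], width=width, height=height)
    return grid, attrs["crs"]


if __name__ == "__main__":
    # Made-up example values: a 64 x 64 north-up chip with 10 m pixels.
    original = AffineGrid(10.0, 0.0, 500000.0, 0.0, -10.0, 5000000.0, 64, 64)
    with tempfile.TemporaryDirectory() as tmp:
        write_grid(tmp, original, "EPSG:32632")
        restored, crs = read_grid(tmp)
    # The round trip passes only if nothing was lost on the way.
    assert restored == original and crs == "EPSG:32632"
    print(restored.xy(0, 0))
```

A real prototype would replace the JSON file with an actual Zarr store and the dataclass with an Xarray index, but the test shape stays the same: write, read back, and assert that the CRS and transform are unchanged.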
This approach for ironing out the spec has been brought up in a couple of threads, and I think it’s a good one. The limiting factor seems to be people power to provide implementations that can be discussed and iterated on.
I saw that there is a meeting on November 6th to discuss the GeoZarr spec: GeoZarr Spec Steering Working Group - HackMD. I’ll attend, and I hope to see many others! I’d like to contribute to prototyping GeoZarr. As a first step, I am working with EuroSAT, a very sparse 2 GB dataset of Sentinel-2 raster chips across Europe, and trying to make a prototype that addresses points 1 and 2 above.