I just spent the last two days at a workshop organized by ESA, where the roadmap for Zarr adoption within the EOPF framework was presented. Below, I summarize the key takeaways relevant to this discussion, pending the official information expected in April:
EOPF (Earth Observation Processing Framework) is the abstraction framework for the new Copernicus Space Component (CSC) Data Processor Re-engineering. It serves as a data model that abstracts product items such as measurements, quality, annotations, and attributes. In theory, this framework would allow for encoding products in multiple containers, including Zarr, GeoTIFF, NetCDF, and SAFE.
The selected baseline data encoding for all missions is Zarr, chosen for reasons such as scalability and cost efficiency. Work on defining the specifications began in 2024 and will continue until 2026, when ESA is expected to start operationally producing the new Zarr products from the ground segment.
Before then, the Zarr specification and partitioning will need further iteration. A key point confirmed by all ESA technical officers is that there will be no Zipped Zarr in the final implementation. The Zipped Zarr format is only a temporary workaround for managing the enormous number of files that the combination of the EOPF data model, partitioning, and Zarr structure generates for a single data product. This is where contributions like sharding will be welcome.
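To give a feel for why the file count matters and how sharding could help, here is a rough back-of-the-envelope sketch. The array shape, chunk size, and shard size are illustrative assumptions on my part, not values from the EOPF specification, and `count_objects` is a hypothetical helper, not part of any library.

```python
import math

def count_objects(shape, chunk_shape, shard_shape=None):
    """Count stored data objects for one array, ignoring metadata files.

    Without sharding, each chunk is its own object. With sharding, many
    chunks are packed into one shard, so each shard is one object.
    """
    unit = shard_shape if shard_shape is not None else chunk_shape
    return math.prod(math.ceil(s / u) for s, u in zip(shape, unit))

# Illustrative values only (assumptions, not from the EOPF spec).
shape = (10980, 10980)   # one 2D band
chunks = (1024, 1024)    # chunk size
shards = (4096, 4096)    # shard size, a multiple of the chunk size

unsharded = count_objects(shape, chunks)
sharded = count_objects(shape, chunks, shards)
print(f"objects without sharding: {unsharded}")
print(f"objects with sharding:    {sharded}")
```

Multiply the per-array count by the number of bands, resolutions, and auxiliary arrays in a product and the total grows quickly, which is the problem the zip workaround currently hides.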
The Sample Service, developed and operated by EODC, will serve as the central point of reference (official opening by April) for all community activities around EOPF Zarr. More information on the Sample Service can be found here: https://zarr.eopf.copernicus.eu/
Additionally, there will be a STAC + Zarr workshop on April 17th. More information to come.
ESA also aims to implement a Discrete Global Grid System (DGGS) for data mapping to reduce current data duplication, which can be as high as 30% due to Sentinel-2’s current UTM gridding.
It is important to note that this initiative is still underway, and ESA has several key matters to address in collaboration with the community before it can begin producing Sentinel data in Zarr on a permanent basis. The key takeaway is that ESA’s current focus is on promoting community engagement and encouraging user adoption, which is a welcome effort.