Webinar - Building a Planetary-Scale Earth-Observation Data Cube in Zarr

As regulars on this forum will have noticed, I’ve recently gone on a deep dive into how to efficiently build massive-scale harmonized earth-observation data cubes in Zarr using open data. These workflows are common at big satellite data providers and processors (think NASA, Planet, etc.). But thanks to open data and serverless cloud computing, it’s now feasible for teams of all sizes to build their own customized data cubes for Earth analytics and AI.

I’ve been digging into the details and best practices on recent forum posts, e.g.

I’ve learned a lot from this and am excited to share these learnings with the community. I’ll also compare three different serverless execution frameworks for running the pipeline (Coiled Functions, Modal, and Lithops) and look at some of the tradeoffs.

Looking forward to a fun discussion… All are welcome!


I have been tinkering with Lithops, so I’m interested to hear what you have come up with there.

Webinar will definitely be recorded! No need to take extreme measures! :laughing:


The official recording is now on YouTube:


@strobpr - While it didn’t make it into the slides, you’ll notice a discussion around coordinate systems at the end. My dream is that we could be using DGGS for these data cubes instead of simple rectangular lat-lon grids. :wink:

I’d love to pick your brain on how to best accomplish that.

Hi @rabernat, sincere apologies for the silence! I thought I’d manage to listen to your talk within a reasonable time, which during May just didn’t materialise. The role of grids in data cubes (grids are ultimately the topic here, not coordinate systems or projections) is a ‘hot’ issue, ranging from the question of what grids actually are (grids of ‘nodes’ or ‘cells’?) to whether cubes can live without them. But that’s the theoretical part. In practice the question is which grids we should use when dealing with continental to global data, and what the role of resampling is in interoperability (blessing or curse?). I’d love to discuss all that, but it’s hard to find an interested audience, as most people want something that works and then just move on. DGGS requires a bit more stamina, but to me it seems the perfect match for global data cubing.