The easiest solution for us (a startup providing a product based on satellite data) is to rely on STAC + COG. It’s easy to handle and, from my perspective, covers almost all the use cases.
Retrieving data (by selecting the relevant pixels) can be done using odc.stac and rioxarray.
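As a rough illustration of that retrieval step, here is a minimal sketch. The Earth Search endpoint, the `sentinel-2-l2a` collection, the band names and the helper names `point_bbox` and `load_sentinel2` are all assumptions made for the example, not a description of our actual pipeline.

```python
def point_bbox(lon, lat, half_size_deg=0.01):
    """Return (min_lon, min_lat, max_lon, max_lat) around a point (hypothetical helper)."""
    return (
        lon - half_size_deg,
        lat - half_size_deg,
        lon + half_size_deg,
        lat + half_size_deg,
    )


def load_sentinel2(bbox, date_range, bands=("red", "nir")):
    """Search a STAC API and lazily load the matching COG pixels (hypothetical helper)."""
    # Imported here so point_bbox stays usable without the geospatial stack installed.
    import odc.stac
    import pystac_client

    # Assumption: the public Earth Search STAC API; swap in your own catalog.
    catalog = pystac_client.Client.open("https://earth-search.aws.element84.com/v1")
    items = catalog.search(
        collections=["sentinel-2-l2a"],
        bbox=bbox,
        datetime=date_range,
    ).item_collection()

    # chunks={} keeps the result dask-backed, so pixels are only read from the
    # COGs when values are actually computed.
    return odc.stac.load(
        items,
        bands=list(bands),
        bbox=bbox,
        resolution=10,  # in units of the output CRS (metres for the UTM Sentinel-2 grids)
        chunks={},
    )
```

Something like `load_sentinel2(point_bbox(10.0, 50.0), "2023-06-01/2023-06-30")` should give back a lazy `xarray.Dataset` with a `time` dimension. After `import rioxarray`, the `.rio` accessor should then be usable on it, for example to write a subset back to GeoTIFF.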
Visualization can be achieved by relying on titiler, as well as by directly adding the URL into QGIS (which works great for quick visualization). Large-scale processing can easily be handled using coiled (or HPC).
However, we haven’t succeeded in making it work for one use case:
→ Retrieving the time series of all Sentinel-2 data over more than 60,000 points.
We used xvec (a pretty awesome library), but it was still too slow…
I’m not sure if zarr would be a better candidate for this use case.
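A possible angle on the point time-series case, as an untested sketch rather than something we have benchmarked: COGs are read one internal tile at a time, so points falling in the same tile can share a single read. The snippet below groups points by tile. The origin, pixel size, 512-pixel tile size and the function name `group_points_by_tile` are hypothetical values chosen for illustration, and I have not checked whether this is actually the bottleneck with xvec.

```python
from collections import defaultdict


def group_points_by_tile(points, origin_x, origin_y, pixel_size, tile_size=512):
    """Group points by the internal raster tile they fall in.

    Assumes a north-up raster whose top-left corner is (origin_x, origin_y)
    and whose pixels are square. Returns a dict mapping (tile_col, tile_row)
    to a list of (point_index, col_in_tile, row_in_tile).
    """
    groups = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        col = int((x - origin_x) // pixel_size)
        row = int((origin_y - y) // pixel_size)  # rows grow as y decreases
        tile = (col // tile_size, row // tile_size)
        groups[tile].append((idx, col % tile_size, row % tile_size))
    return dict(groups)


# Three points in projected coordinates: the first two share a tile.
points = [(5.0, 95.0), (15.0, 85.0), (5125.0, 95.0)]
tiles = group_points_by_tile(points, origin_x=0.0, origin_y=100.0, pixel_size=10.0)
print(len(tiles), "tiles to read for", len(points), "points")
```

With 60,000 points the number of distinct tiles per scene may be much smaller than the number of points, in which case reading each tile once and indexing into it in memory could cut the number of requests. How much this helps depends on how clustered the points are.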
Regarding ML, I don’t know if batching COGs (with xbatcher) would have the same capabilities as zarr. I think not, but I’m not sure how many users will try it.