Best practices for large scale (Sentinel-2) mosaics and 4D ML patches using Pangeo tools

Welcome Rosky,

I believe there is some useful info in another Pangeo post, "HLS time series using xarray best practices". It mostly covers common issues with processing long time series of remote sensing (RS) data over large extents.

In short, you may need to exploit the structure of the data to process it more efficiently and avoid handing dask a nightmare graph. For instance, you could process each Sentinel-2 MGRS (Military Grid Reference System) tile separately. Also, Sentinel-2 has an average revisit time of about 5 days, so daily mosaics will contain a lot of NaNs. Depending on your end requirements, you can make your processing better match the underlying structure of the data, for example by compositing over a time window closer to the revisit time.
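
To make the idea concrete, here is a minimal sketch (standard library only) of grouping scenes by MGRS tile and by 5-day window instead of by day. The scene list, the tile IDs, and the `group_scenes` helper are made up for illustration; in a real workflow the records would come from a catalog search and each group would be loaded lazily with xarray/dask.

```python
from collections import defaultdict
from datetime import date

# Hypothetical scene records: (MGRS tile ID, acquisition date).
# Assumption: in practice these would come from a catalog query.
scenes = [
    ("31UFS", date(2023, 6, 1)),
    ("31UFS", date(2023, 6, 6)),
    ("31UFS", date(2023, 6, 11)),
    ("31UFT", date(2023, 6, 3)),
    ("31UFT", date(2023, 6, 8)),
]


def group_scenes(scenes, start, window_days=5):
    """Group scenes by (tile, index of the time window the date falls in)."""
    groups = defaultdict(list)
    for tile, acquired in scenes:
        window = (acquired - start).days // window_days
        groups[(tile, window)].append(acquired)
    return dict(groups)


start, end = date(2023, 6, 1), date(2023, 6, 11)
n_tiles = len({tile for tile, _ in scenes})
n_days = (end - start).days + 1

# Daily binning: one slot per tile per day, most of them empty (all-NaN).
daily = group_scenes(scenes, start, window_days=1)
print(f"daily:  {len(daily)} of {n_tiles * n_days} slots have data")

# 5-day binning: one slot per tile per window, close to the revisit time.
windowed = group_scenes(scenes, start, window_days=5)
n_windows = -(-n_days // 5)  # ceiling division
print(f"5-day:  {len(windowed)} of {n_tiles * n_windows} slots have data")

# Each (tile, window) group is a small, independent unit of work.
for (tile, window), dates in sorted(windowed.items()):
    print(tile, window, [d.isoformat() for d in dates])
```

Each `(tile, window)` group can then be composited on its own, which keeps the task graph for any single computation small. The 5-day window is only a starting point; a longer window may suit cloudy regions better.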
