Comparing odc.stac.load and stackstac for raster composite workflow

Alex_Leith · March 12, 2024, 3:58am

I’m a bit stubbory, and stick to the OpenDataCube way of working, and you’ve explained most of the differences above, but there are a few things that I would do differently.

Firstly, with odc.stac.load there’s an option to pass in groupby="solar_day", which will merge adjacent scenes and do de-duplication of overlapping pixels. This is pretty important, as it merges satellite passes into single layers, removing the number of scenes to do the median from. I’m honestly not sure how stacstack handles this.

Second, with masking, we usually explicitly pick cloud and cloud shadow and then do a few operations to expand/contract that, to clean up single pixels and fill holes.

A simple example of this is this code, untested:

from odc.algo import erase_bad, mask_cleanup

CLOUD_SHADOWS = 3
CLOUD_HIGH_PROBABILITY = 9

for field in [CLOUD_SHADOWS, CLOUD_HIGH_PROBABILITY]:
    bitmask |= 1 << field

cloud_mask = xr.SCL.astype("uint16") & bitmask != 0

cloud_mask = mask_cleanup(cloud_mask, [("closing", 5), ("opening", 5)])

final_data = erase_bad(xr, cloud_mask)

Note that the odc.algo tools there are important, as they retain the int values in the data, which makes a difference when. you’re working with a lot of data. Using xarray.where converts data to float, which is nice because it means you can use nan for nodata, but sometimes it’s unwanted.

This process is fairly well tested and rigorous, and we used it to do the cloud-free mosaics over Africa, for example. (With variations on the closing/opening logic for tweaking the cloud mask.)

A worked example of the above workflow is available here using HLS data.

And again, I wanted to note that I think there’s a gap in some of the EO tooling now in that we should have a reasonably simple, agreed way to mask clouds the right way, and since we really have two main optical products (Landsat and Sentinel-2), we can make a lot of the parameters for this opinionated.

And I’m looking for support to write this tooling! Maybe I should start a cloud masking crowd funding mini-project!

Topic		Replies	Views
STAC -> Xarray Loading Best Practices Data	3	549	March 13, 2024
Accessing Metadata as Variables in Xarray Datasets Using odc-stac? Data	4	193	September 4, 2024
Proposal: Expanding the xstac python tool to automate a few more of the hard parts Meta zarr	2	149	March 27, 2025
Compute time series for 70,000 locations (Speed up the processing)	12	306	October 7, 2024
Tables, (x)arrays, and rasters¶	18	2918	November 15, 2022

Comparing odc.stac.load and stackstac for raster composite workflow

Related topics