Comparing odc.stac.load and stackstac for raster composite workflow

dschneiderch · April 11, 2025, 3:45pm

Just joining this thread, but I think this point is incredibly important. You need to know how coords are specified in the metadata! Nice job illustrating the fuzziness in the example!

ZZMitch · April 12, 2025, 10:57pm

I ran a similar test for stackstac.stac and it appears that this difference does not apply to xy_coords?

The four images above were built from code like this:

l30_stack = ss.stack(itemsL30, 
                    assets = ['B05'], 
                    epsg = epsg, # epsg (3979 if set to poly) # epsg (32614 if set to UTM)
                    resolution = 30, 
                    bounds = bbox, 
                    resampling = Resampling.bilinear, # bilinear # cubic # nearest
                    chunksize = (1, -1, -1, -1), 
                    xy_coords = 'topleft', # center # topleft
                    gdal_env = gdalEnv)

With just changes to xy_coords and epsg (keeping to local UTM or reprojecting to national proj, as I do in my work). All of these used bilinear resampling, but same patterns held true for cubic and nearest neighbor.

As you can see, xy_coords does not impact “blurriness” like anchor does. The only change is the values are shifted half a pixel between topleft and center. The blurriness in this case occurs when reprojecting from the local UTM grid to a national projection - as would be expected (and required for wide-area analysis when you need everything projected onto the same pixel grid).

So that begs the question, is topleft or center the “correct” pixel placement? I though it would be topleft for HLS, as noted in the User Guide and mentioned by @Andrew_Mullen. But it actually appears that center is most accurate to the real world:

In these comparisons, left is topleft and right is center. These are using the national projection, not UTM, but the patterns are the same.

So I guess the takeaway is that for stackstac that xy_coords = center is the proper way, at least for HLS. I guess this is because it integrates with xarray, which likes pixel center coordinates. And that it is not work the same as anchor does with odc.stac.load?

EDIT: I also played around some with snap_bounds = False in stackstac.stack and it does impact pixel values, but I can’t really tell if its preferred over the default (snap_bounds = True) if already reprojecting?

Andrew_Mullen · April 13, 2025, 6:24am

I did a bit more digging into how odc-stac and stackstac handle bounding boxes, resampling, anchor (odc-stac) and xy_coords (stackstac). I could be wrong here, so anyone more knowledgeable should correct me.

In both odc-stac and stackstac, the pixel snapping parameter (anchor in odc-stac and xy_coords in stackstac) only controls the output grid coordinates, not the actual reading of the data.

In odc-stac when a bbox is specified, a GeoBox object is created to define the output grid based on the bbox coordinates, crs, and resolution. The exact yx coordinates of the destination GeoBox are shifted if anchor is edge as opposed to center. When the data is read it is resampled to fit into the destination GeoBox if the grid misalignment is above a certain tolerance (looks like 5%). So based on this it seems like the data is always resampled, the extent of which is dependent on how close the bbox or GeoBox boundaries are to the source dataset pixel boundaries. In my example, both HLS and Landsat were both in the same UTM crs and my bbox must have luckily aligned relatively well with the source pixel edges. Unfortunately, I do not see a way around this resampling in odc-stac other than ensuring the the bbox is aligned with the source dataset.

In stackstac, the data is only resampled when reprojecting to a different crs or changing the resolution of the output dataset, which makes sense why there was no blurriness in your figures.

Looking at the spatial alignment with the high-res imagery in your figures, with xy_coords='topleft', the output grid is shifted but indeed xarray interprets the coordinates as pixel centers (pixels in left plots shifted up and to the left relative to pixels in right plots). With xy_coords='center', it looks like the underlying data reader correctly interprets the source coordinates of the HLS data (hopefully) and remaps them to pixel center coordinates for xarray.

So now I am inclined to believe that in both odc-stac and stackstac, pixel snapping to ‘center’ is correct regardless of the satellite product. With stackstac there should be no concern of data being resampled out of its original form unless changing crs or resolution, but with odc-stac the data will always be resampled unless the provided bbox or GeoBox aligns with the pixel edges from the source dataset. I will produce some more plots to confirm this odc-stac behavior though.

Topic		Replies	Views
STAC -> Xarray Loading Best Practices Data	3	517	March 13, 2024
Accessing Metadata as Variables in Xarray Datasets Using odc-stac? Data	4	170	September 4, 2024
Proposal: Expanding the xstac python tool to automate a few more of the hard parts Meta zarr	2	126	March 27, 2025
Compute time series for 70,000 locations (Speed up the processing)	12	265	October 7, 2024
Tables, (x)arrays, and rasters¶	18	2889	November 15, 2022

Comparing odc.stac.load and stackstac for raster composite workflow

Related topics