Help coregistering MODIS LST imagery and Daymet

Hello,

New user here, perhaps with a beginner question :slight_smile:

I am trying to collect Daymet data and MODIS LST imagery in one xarray.Dataset from Jan 1 to July 1, 2020 for an arbitrary 0.5 x 0.5 degree area. Both of these sources have a native resolution of 1km, and I’d like the result to be at that resolution as well.

I’m working through microsoft planetary computer, my approach is coming from their tutorials. Daymet comes as single Dataset indexed by time and the x/y coordinate of the Daymet projection (Lambert conformal conic). Since Daymet is quite large, I think it will be easier to use Dataset.sel on Daymet with its projected coordinates, then reproject MODIS to align them. So, I’m creating a proj4 string from the parameters provided in Daymet, and then passing that string to odc.stac.load when loading MODIS. The projection information in Daymet is:

false_easting :
    0.0
false_northing :
    0.0
grid_mapping_name :
    lambert_conformal_conic
inverse_flattening :
    298.257223563
latitude_of_projection_origin :
    42.5
longitude_of_central_meridian :
    -100.0
semi_major_axis :
    6378137.0
standard_parallel :
    [25.0, 60.0]

And the resulting proj4 string is:

+proj=lcc +lat_0=42.5 +lon_0=-100 +lat_1=25 +lat_2=60 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs +type=crs

(note that inverse_flattening and semi_major_axis are the defaults for a WGS84 ellipse)

I’m converting my lat/lon bounding box into Daymet coordinates via pyproj.Transformer and rounding to the nearest 1000m as below.

import odc.stac
import planetary_computer as pc
import pystac_client
import pystac
import pyproj
import fsspec

# Setup view location/size
latitude, longitude = 47.387, -119.232
buffer = 0.25
bbox = [longitude - buffer, latitude - buffer, longitude + buffer, latitude + buffer]

start = "2020-01-01"
end   = "2020-07-01"

daymet_crs = pyproj.CRS.from_proj4("""
+proj=lcc +lat_0=42.5 +lon_0=-100 +lat_1=25 +lat_2=60 +x_0=0 +y_0=0 
+ellps=WGS84 +units=m +no_defs +type=crs
""".replace("\n", " "))

latlon_daymet_transform = pyproj.Transformer.from_crs("EPSG:4326", daymet_crs, always_xy=True)
# Snap to the nearest km
xmin, ymin = tuple(map(lambda x: round(x, -3), latlon_daymet_transform.transform(bbox[0], bbox[1])))
xmax, ymax = tuple(map(lambda x: round(x, -3), latlon_daymet_transform.transform(bbox[2], bbox[3])))

Then I’m slicing both datasets using those bounds. First, Daymet:

catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=pc.sign_inplace,
)

daymet_collection = pystac.read_file(
    "https://planetarycomputer.microsoft.com/api/stac/v1/collections/daymet-daily-na"
)
asset = daymet_collection.assets['zarr-https']
store = fsspec.get_mapper(asset.href)
daymet = xr.open_zarr(store, **asset.extra_fields["xarray:open_kwargs"])
daymet_subset = ds.sel(
    time=slice(start, end),
    x = slice(xmin, xmax),
    y = slice(ymax, ymin)
)

Then, MODIS:

modis_search = catalog.search(
    collections=["modis-11A1-061"],
    bbox=bbox,
    datetime=[start,end],
    # afternoon overpasses only
    query={"platform": {"eq": "aqua"}}
)

modis_items = modis_search.get_all_items()

modis_data = odc.stac.load(
    modis_items,
    crs=daymet_crs,
    bands=["LST_Day_1km", "QC_Day", "Day_view_time", "Day_view_angl"],
    resolution=1000,
    x=(xmin, xmax),
    y=(ymin, ymax)
)

This all works, but the coordinates are misaligned by 250m.

I’ve tried adjusting the bounding box and modifying the Daymet proj4 string, but to no avail. I know that stackstac has a RasterSpec that can set the bounding box/resolution explicitly, but it only accepts projections with an EPSG code. Daymet’s projection does not have an EPSG code, so I cannot use that approach without reprojecting Daymet (very slow).

I am feeling stuck, and would appreciate any guidance on combining these two datasets.

1 Like

Thanks for the detailed example. IIUC, the problem might be somehow related to how odc.stac / odc.geo constructs its footprint.

odc.stac.load — odc-stac 0.3.5 documentation indicates that using a GeoBox gives you full control over the footprint and resolution. Playing around with that, I see the anchor keyword. With trial and error, I came up with

import odc.geo.geobox

shape = daymet_subset.prcp.shape[1:]
geobox = odc.geo.geobox.GeoBox.from_bbox(daymet_subset.rio.bounds(), crs=daymet_crs, shape=shape, anchor=odc.geo.geobox.AnchorEnum.FLOATING)

modis_data = odc.stac.load(
    modis_items,
    bands=["LST_Day_1km", "QC_Day", "Day_view_time", "Day_view_angl"],
    geobox=geobox,
    chunks={},
)

Note that I constructed my daymet_crs slightly differently, but I think it’s equivalent to yours:

daymet_crs = pyproj.CRS.from_cf(ds.lambert_conformal_conic.attrs)
ds.rio.write_crs(daymet_crs, inplace=True)

With that, this assert passes:

assert daymet_subset.rio.transform() == modis_data.rio.transform()

which perhaps means that it’s properly coregistered. Would you be able to try that out?

Yes, that works! Thanks for the help @TomAugspurger.

I think this example shows some room for improvement on the API for doing the search. Starting with a latitude / longitude (and buffer) and datetime, you had to

  1. Transform the latitude and longitude to the native CRS. Ideally you could slice the Dataset / DataArray with latitude / longitude and a library would take care of the transformation.

    a. We can ensure the Dataset is loaded with CRS information by using decode_coords="all". I have a WIP update to Daymet & its STAC metadata that includes this in the xarray:open_kwargs

    b. It’d be nice if rioxarray provided something like a .rio.slice_latlon, in addition to a slice_xy. The difference is that slice_latlon would automatically reproject the indexers (which are in latitude / longitude) to the CRS of the Datset / DataArray.

  2. I think it could be easier to construct the second dataset with odc.stac.load. Either a GeoBox.from_dataset that’s essentially

geobox = odc.geo.geobox.GeoBox.from_bbox(
    daymet_subset.rio.bounds(), crs=daymet_subset.rio.crs, shape=daymet_subset.rio.shape,
)

or perhaps a like argument to odc.stac.load that you pass a Dataset / DataArray, and odc.stac extracts all the geospatial information from it. Any thoughts on that @kirill.kzb?
3. I got the anchor=AnchorEnum.FLOATING by guess and check. Is that kind of information stored anywhere in the STAC / Zarr metadata?

daymet-modis.ipynb | notebooksharing.space is the updated example I’m working off of.

Best thing is to use geobox= parameter to load exactly the same spatial pixels from a different source. Data loaded with rioxarray is compatible with odc.* libs, so you can access geobox more directly daymet_subset.odc.geobox. You’ll need to import odc.geo.xr for .odc accessor to become available. Even easier you can just supply like=daymet_subset and should get back spatially matching dataset on output.

1 Like