Can a reprojection/change of CRS operation be done lazily using rioxarray?

I added an example of using GDAL with Zarr (on GCS) to build a VRT for resampling and opening with rioxarray here:

gdalbuildvrt -tr .1 .1 test.vrt "vrt://ZARR:/vsigs/store/group:/variable_1:1?a_ullr=-180,90,180,-90"

Then you can open it lazily like this:

rioxarray.open_rasterio("test.vrt", chunks="auto")

You could likely do the same thing with gdalwarp (writing a VRT) for reprojection.
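As a rough sketch of that idea (I have not run this against Zarr): gdalwarp can write a VRT with `-of VRT`, so the warp is only described, not computed, until pixels are read. The helper names, the output file name, the target CRS and the resampling method below are all placeholders I made up for illustration.

```python
import shlex
import subprocess


def build_warp_vrt_command(src, dst_vrt, dst_crs, resampling="bilinear"):
    """Return the gdalwarp argument list that writes a warped VRT."""
    return [
        "gdalwarp",
        "-of", "VRT",       # write a VRT description instead of warped pixels
        "-t_srs", dst_crs,  # target CRS, e.g. "EPSG:3857"
        "-r", resampling,   # resampling method applied when pixels are read
        "-overwrite",
        src,
        dst_vrt,
    ]


def open_reprojected(src, dst_vrt, dst_crs):
    """Create the warped VRT, then open it lazily as a dask-backed DataArray."""
    import rioxarray  # imported here so the command builder works without it

    subprocess.run(build_warp_vrt_command(src, dst_vrt, dst_crs), check=True)
    return rioxarray.open_rasterio(dst_vrt, chunks="auto")


# Hypothetical inputs: "test.vrt" from the command above, warped to EPSG:3857.
cmd = build_warp_vrt_command("test.vrt", "test_3857.vrt", "EPSG:3857")
print(shlex.join(cmd))
```

Calling `open_reprojected("test.vrt", "test_3857.vrt", "EPSG:3857")` would need gdalwarp on the PATH plus rioxarray and dask installed; the reprojection itself should only happen chunk by chunk when the array is computed.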

I regularly use VRTs with rioxarray.open_rasterio, and lazy evaluation works pretty well. I have even used gdalwarp to reproject many disjoint tiles into a common CRS, then gdalbuildvrt to resample, and finally opened the result with rioxarray using chunks, so everything is evaluated lazily with dask.

My most common use case is going from COG to Zarr though. I don’t actively reproject/resample zarr this way.

When converting a large number of COGs to Zarr, I have found these GDAL environment variables important to tune:

GDAL_HTTP_MULTIPLEX
CPL_VSIL_CURL_CACHE_SIZE
VSI_CACHE
GDAL_CACHEMAX
GDAL_HTTP_VERSION
VSI_CACHE_SIZE
GDAL_DISABLE_READDIR_ON_OPEN
GDAL_HTTP_MERGE_CONSECUTIVE_RANGES
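
One way to set these from Python is through os.environ, before GDAL opens any dataset. The values below are placeholders I picked for illustration, not tuned recommendations; the right numbers depend on your files, network and memory.

```python
import os

# Placeholder values (assumptions, not recommendations): tune for your workload.
GDAL_ENV = {
    "GDAL_HTTP_MULTIPLEX": "YES",                 # multiplex requests over HTTP/2
    "GDAL_HTTP_VERSION": "2",                     # ask for HTTP/2
    "GDAL_HTTP_MERGE_CONSECUTIVE_RANGES": "YES",  # merge adjacent range requests
    "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",  # skip directory listing on open
    "VSI_CACHE": "TRUE",                          # enable the per-file VSI cache
    "VSI_CACHE_SIZE": str(64 * 1024 * 1024),      # bytes
    "CPL_VSIL_CURL_CACHE_SIZE": str(256 * 1024 * 1024),  # bytes
    "GDAL_CACHEMAX": "1024",                      # small values are read as MB
}


def configure_gdal(env=GDAL_ENV):
    """Export the GDAL settings into this process's environment."""
    os.environ.update(env)


configure_gdal()
```

With a dask.distributed cluster the reads happen on the workers, so the same function would have to run there as well, for example with `client.run(configure_gdal)`.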

I added a gist here to show how I first build a mosaic for all tifs in a CRS (think one UTM zone) using gdalbuildvrt, then reproject/resample each per-CRS mosaic with gdalwarp, then finally merge everything into one large mosaic with gdalbuildvrt and open it with rioxarray.open_rasterio using dask.
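
This is not the gist itself, just a minimal sketch of the three steps as I would plan them from Python. The function name, file names, target CRS and resolution are hypothetical, and `-tap` (aligning every warped VRT to the same pixel grid) is my own addition.

```python
import shlex


def plan_mosaic_commands(tifs_by_crs, dst_crs, resolution, final_vrt="mosaic.vrt"):
    """Return the gdalbuildvrt/gdalwarp argument lists for the three steps, in order."""
    commands = []
    warped = []
    for i, (crs, tifs) in enumerate(sorted(tifs_by_crs.items())):
        zone_vrt = f"zone_{i}.vrt"
        warped_vrt = f"zone_{i}_warped.vrt"
        # Step 1: mosaic every tif that shares this source CRS.
        commands.append(["gdalbuildvrt", zone_vrt, *tifs])
        # Step 2: reproject/resample that mosaic onto the common grid, as a VRT.
        commands.append([
            "gdalwarp", "-of", "VRT",
            "-t_srs", dst_crs,
            "-tr", str(resolution), str(resolution),
            "-tap",  # align output pixels to multiples of the resolution
            zone_vrt, warped_vrt,
        ])
        warped.append(warped_vrt)
    # Step 3: merge all warped VRTs into one large mosaic.
    commands.append(["gdalbuildvrt", final_vrt, *warped])
    return commands


# Hypothetical input: COG paths grouped by their source CRS (two UTM zones).
tifs_by_crs = {
    "EPSG:32610": ["a_10.tif", "b_10.tif"],
    "EPSG:32611": ["c_11.tif"],
}
commands = plan_mosaic_commands(tifs_by_crs, "EPSG:4326", 0.001)
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` in order; after that, `rioxarray.open_rasterio("mosaic.vrt", chunks="auto")` should give a dask-backed array over the whole mosaic.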
