Reading HDF-EOS (HDF4) files in parallel: GDAL/rasterio/etc

Note that the VRT can’t be built by variable name (unless you use the fully expanded subdataset DSN); it has to be ‘-sd <n>’, where n is the 1-based index of the variable, here 8 for ‘Nadir_Reflectance_Band1’.

gdalbuildvrt -sd 8 Nadir_Reflectance_Band1.vrt e4ftl01.cr.usgs.gov/MOTA/MCD43A4.061/2001.01.01/*.hdf
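To find that index without counting by hand, something like the following might work. It is only a sketch: it assumes the GDAL Python bindings are installed and that each subdataset name ends in ':<variable>', as HDF4-EOS grid names usually do. The helper names and 'some_tile.hdf' are placeholders of mine, not anything from GDAL.

```python
def subdataset_index(subdatasets, variable):
    """Return the 1-based position of `variable`, as expected by -sd.

    `subdatasets` is a list of (name, description) tuples, the shape
    returned by Dataset.GetSubDatasets() in the GDAL Python bindings.
    Assumption: each name ends in ':<variable>'.
    """
    for index, (name, _description) in enumerate(subdatasets, start=1):
        if name.rsplit(":", 1)[-1] == variable:
            return index
    raise KeyError(f"{variable!r} not found among {len(subdatasets)} subdatasets")


def list_subdatasets(path):
    """List (name, description) pairs for one HDF file (needs GDAL installed)."""
    from osgeo import gdal  # imported lazily so the helper above works without GDAL

    dataset = gdal.Open(path)
    if dataset is None:
        raise OSError(f"GDAL could not open {path}")
    return dataset.GetSubDatasets()


# Usage, with a placeholder file name:
# subdataset_index(list_subdatasets("some_tile.hdf"), "Nadir_Reflectance_Band1")
```

The number it returns is what goes after -sd, assuming every tile lists its subdatasets in the same order.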

Then open_dataset can load:

xr.open_dataset("Nadir_Reflectance_Band1.vrt")

<xarray.Dataset> Size: 13GB
Dimensions:      (band: 1, x: 86400, y: 38400)
Coordinates:
  * band         (band) int64 8B 1
  * x            (x) float64 691kB -2.001e+07 -2.001e+07 ... 2.001e+07 2.001e+07
  * y            (y) float64 307kB 7.783e+06 7.783e+06 ... -1.001e+07 -1.001e+07
    spatial_ref  int64 8B ...
Data variables:
    band_data    (band, y, x) float32 13GB ...

I would more likely warp that VRT to the target grid, then open it (I don’t know if there are ways to template adding the other variables, other than loading each DataArray and putting them together).
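For the "load each DataArray and put them together" route, a rough sketch might look like this. It assumes one VRT per variable, each named '<variable>.vrt' (my convention), all on the same grid, with rioxarray installed so that engine="rasterio" is available and dask installed for the lazy chunks. The function names are mine.

```python
from pathlib import Path


def variables_from_paths(vrt_paths):
    """Map variable name (the file stem) to its VRT path."""
    return {Path(p).stem: str(p) for p in vrt_paths}


def combine_vrts(vrt_paths):
    """Open each single-band VRT lazily and collect them in one Dataset."""
    import xarray as xr  # lazy import; also needs rioxarray and dask installed

    arrays = {}
    for name, path in variables_from_paths(vrt_paths).items():
        # chunks={} asks xarray for dask-backed (lazy) arrays
        ds = xr.open_dataset(path, engine="rasterio", chunks={})
        # Drop the length-1 band dimension so each variable is (y, x)
        arrays[name] = ds["band_data"].squeeze("band", drop=True)
    return xr.Dataset(arrays)


# Usage:
# combine_vrts(["Nadir_Reflectance_Band1.vrt"])
```

Nothing is read until values are actually requested, so this stays cheap even though the full grid is large.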

Also I’m not across other methods of mosaicing for xarray, so won’t speculate - I appreciate this example, it’s relevant to avenues I’m exploring (and I’ve set up a feature request to be able to give ‘-sd’ a name for those simple cases). Another option in future GDAL will be to index these with the GTI driver.

It’s unfortunate for the data to be spread over so many files, and really wasteful to have the entire sinusoidal grid materialized, so don’t do that - but I guess it reflects that the entire set can’t be stored sparsely in HDF4(?), and afaik the project adopted sinusoidal from the ragged-array L3bin scheme from SeaWiFS.

One option to warp to materialized form is (just guessing at a sensible resolution; using sparse blocks and compression gives a 1.9 GB tif):

gdalwarp Nadir_Reflectance_Band1.vrt Nadir_Reflectance_Band1.tif -multi -wo NUM_THREADS=ALL_CPUS -t_srs EPSG:4326 -tr 0.004 0.004 -te -180 -90 180 90 -co COMPRESS=LZW -co SPARSE_OK=YES -of COG

##Creating output file that is 90000P x 45000L.
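The reported size follows directly from the extent and resolution, and the same arithmetic shows how much the sparse blocks and compression save. A quick check, assuming 4 bytes per pixel to match the float32 in the xarray output above (the type on disk may well be a smaller integer type, in which case the uncompressed figure shrinks accordingly):

```python
def warp_output_shape(bounds, res):
    """Pixels (columns, rows) for a -te extent at a square -tr resolution."""
    xmin, ymin, xmax, ymax = bounds
    return round((xmax - xmin) / res), round((ymax - ymin) / res)


def uncompressed_bytes(shape, itemsize=4):
    """Bytes for one dense band; itemsize=4 assumes float32."""
    cols, rows = shape
    return cols * rows * itemsize


shape = warp_output_shape((-180, -90, 180, 90), 0.004)
print(shape)                      # (90000, 45000)
print(uncompressed_bytes(shape))  # 16200000000
```

So a dense float32 band at this resolution would be about 16.2 GB, against the 1.9 GB tif.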

But set a VRT target rather than a tif and the output is just a text description of the warp job, which only materializes as needed after load.
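From Python, the VRT-target version could be sketched roughly as below, assuming the GDAL Python bindings are installed. The function names and the output file name are mine. The grid settings copy the gdalwarp call above, and the COG creation options are left out because they don't apply to a VRT.

```python
def warped_vrt_options(res=0.004, bounds=(-180, -90, 180, 90), t_srs="EPSG:4326"):
    """Keyword arguments for gdal.Warp describing the same target grid."""
    return {
        "format": "VRT",          # write a warp description, not pixels
        "dstSRS": t_srs,
        "xRes": res,
        "yRes": res,
        "outputBounds": bounds,   # (minX, minY, maxX, maxY)
    }


def write_warped_vrt(src_vrt, dst_vrt, **kwargs):
    """Write a warped VRT to dst_vrt (needs GDAL installed)."""
    from osgeo import gdal  # lazy import so the options helper works without GDAL

    warped = gdal.Warp(dst_vrt, src_vrt, **warped_vrt_options(**kwargs))
    if warped is None:
        raise RuntimeError(f"gdal.Warp failed for {src_vrt}")
    del warped  # drop the reference so the VRT is flushed to disk


# Usage, with an output name of my choosing:
# write_warped_vrt("Nadir_Reflectance_Band1.vrt", "Nadir_Reflectance_Band1_ll.vrt")
```

The resulting file should open with xr.open_dataset the same way as the mosaic VRT, with the warp running only for the windows that get read.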