So here is a full example using xarray with apply_ufunc
import xarray as xr
import numpy as np
import pooch
import unpackqa
L8_QA_PIXEL_FILE = 'https://landsat.usgs.gov/sites/default/files/C2_Sample_Data/LC08_L1TP_140041_20130503_20200912_02_T1.zip'
L8_qa_product = 'LANDSAT_8_C2_L2_QAPixel'
file_path = pooch.retrieve(
L8_QA_PIXEL_FILE,
known_hash=None,
processor=pooch.Unzip()
)
my_file = [f for f in file_path if 'QA_PIXEL' in f][0]
ds = xr.open_rasterio(my_file, chunks={'x':2000,'y':2000})
# Drop the band axis since the file has a single band.
ds = ds.isel(band=0)
#-------------------------------
# with unpack_to_array
# resulting in flags as an added dimension
L8_flag_names = unpackqa.list_qa_flags(L8_qa_product)
flag_ds = xr.apply_ufunc(
unpackqa.unpack_to_array,
ds,
kwargs = dict(product=L8_qa_product),
output_core_dims = [['flag']],
dask_gufunc_kwargs = dict(output_sizes={'flag':len(L8_flag_names)}),
output_dtypes=[np.uint8],
vectorize=False,
dask='parallelized'
)
# put labels on the flag coordinates
flag_ds['flag'] = L8_flag_names
# execute all lazy computations and load full dataset into memory.
flag_ds.load()
# Another option would be to have each flag as it's own data variable.
# That can be done with some rearranging.
flag_ds2 = [flag_ds.sel(flag=flag_name).drop('flag').rename(flag_name) for flag_name in L8_flag_names]
flag_ds2 = xr.merge(flag_ds2)
The best way to do it is using unpack_to_array
inside the apply_ufunc
function. Which adds an extra dimension for the flags.
flag_ds
<xarray.DataArray (y: 7351, x: 7551, flag: 12)>
array([[[1, 0, 0, ..., 0, 0, 0],
[1, 0, 0, ..., 0, 0, 0],
[1, 0, 0, ..., 0, 0, 0],
...
[1, 0, 0, ..., 0, 0, 0],
[1, 0, 0, ..., 0, 0, 0],
[1, 0, 0, ..., 0, 0, 0]]], dtype=uint8)
Coordinates:
band int64 1
* y (y) float64 3.144e+06 3.144e+06 3.144e+06 ... 2.924e+06 2.924e+06
* x (x) float64 3.708e+05 3.708e+05 3.709e+05 ... 5.973e+05 5.973e+05
* flag (flag) <U23 'Fill' 'Dilated_Cloud' ... 'Cirrus_Confidence'
Alternatively, it might be nice to have the flags as distinct variables which can be referenced. Which ends up like:
flag_ds2
<xarray.Dataset>
Dimensions: (y: 7351, x: 7551)
Coordinates:
band int64 1
* y (y) float64 3.144e+06 3.144e+06 ... 2.924e+06
* x (x) float64 3.708e+05 3.708e+05 ... 5.973e+05
Data variables:
Fill (y, x) uint8 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
Dilated_Cloud (y, x) uint8 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
Cirrus (y, x) uint8 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
Cloud (y, x) uint8 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
Cloud_Shadow (y, x) uint8 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
Snow (y, x) uint8 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
Clear (y, x) uint8 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
Water (y, x) uint8 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
Cloud_Confidence (y, x) uint8 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
Cloud_Shadow_Confidence (y, x) uint8 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
Snow_Ice_Confidence (y, x) uint8 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
Cirrus_Confidence (y, x) uint8 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0