Raster Attribute Tables

dcherian · April 29, 2025, 3:49pm

I’ve recently discovered that “Raster Attribute Tables” can store some very rich metadata.

As an example, the GLAD LULC dataset has this rich, hierarchical classification scheme (warning: Excel sheet). Each class is associated with an integer code and a color code.

This was not easy to parse (notebook)!

A second example is in this vignette.

I have a few questions:

1. How do people work with such classifications in Python today (outside of GDAL bindings)? Is there a data structure that can represent this hierarchy nicely? My first thought was a pandas MultiIndex, but that isn’t exactly right.
2. What analytic workloads do you use such classifications for? Zonal stats at each classification level (Theme, general class, sub-class) seem likely, but also painful at the moment without some kind of helper.
3. Has anyone explored storing this in Zarr? AFAICT in the GeoTIFF world, this stuff is in a sidecar XML file (are there conventions for this?). IIUC the CF conventions can’t really represent this level of detail for classifications.

TomNicholas · April 30, 2025, 5:57am

@mdsumner I’m sure you have thoughts

Michael_Sumner · May 2, 2025, 2:53am

I store them as a table with columns, indexed by pixel value. I see it as levels of grouping, so each unique pixel value has a category in 1 or more columns. (Simple nesting, so it’s equivalent to GROUP BY, I don’t know how that’s thought about in Python).
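For Python readers, the "table indexed by pixel value, grouped by column" idea can be sketched with pandas. This is only a minimal illustration under stated assumptions: the class names, the integer codes and the tiny raster are all made up, and the column names simply borrow the three levels mentioned above (Theme, general class, sub-class).

```python
import numpy as np
import pandas as pd

# Hypothetical RAT: one row per unique pixel value, one column per grouping level.
# All codes and class names here are invented for illustration.
rat = pd.DataFrame(
    {
        "theme": ["Vegetation", "Vegetation", "Vegetation", "Water"],
        "general_class": ["Forest", "Forest", "Cropland", "Open water"],
        "sub_class": ["Dense forest", "Sparse forest", "Cropland", "Open water"],
    },
    index=pd.Index([1, 2, 3, 4], name="value"),
)

# Made-up classified raster whose pixels hold the integer codes above.
raster = np.array([[1, 1, 2], [3, 4, 4], [2, 3, 1]])

# Count pixels per code, then attach the class columns by joining on the index.
codes, counts = np.unique(raster, return_counts=True)
pixel_counts = pd.Series(counts, index=pd.Index(codes, name="value"), name="n_pixels")
table = rat.join(pixel_counts)

# The GROUP BY step: aggregate at whichever level of the hierarchy is needed.
by_theme = table.groupby("theme")["n_pixels"].sum()
by_general = table.groupby(["theme", "general_class"])["n_pixels"].sum()
```

Because the nesting is simple, grouping by a list of columns is enough here; nothing in this sketch needs a dedicated hierarchical data structure.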

I don’t know any other way that doesn’t use GDAL, but the last time I used one I read the RAT out of the aux.xml directly (because the people I worked with had GeoTIFFs with auxiliary RAT metadata and weren’t GDAL users). Otherwise, in R I do

library(terra)
r <- rast("thefile.tif") ## GeoTIFF can't store RAT, but could be in a sidecar .aux.xml
levels(r) ## will give (a list with) the RAT table.
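Reading the RAT straight out of the sidecar, without GDAL, might look roughly like the Python sketch below. It assumes the usual GDAL PAM layout as I understand it (a `GDALRasterAttributeTable` element holding `FieldDefn` and `Row`/`F` children, with field type codes 0 = integer, 1 = real, 2 = string); the embedded XML is an invented sample, and `read_rat` is a hypothetical helper name, not an existing API.

```python
import xml.etree.ElementTree as ET

# Illustrative sidecar content, not taken from a real file.
AUX_XML = """<PAMDataset>
  <PAMRasterBand band="1">
    <GDALRasterAttributeTable>
      <FieldDefn index="0"><Name>Value</Name><Type>0</Type><Usage>5</Usage></FieldDefn>
      <FieldDefn index="1"><Name>Class</Name><Type>2</Type><Usage>2</Usage></FieldDefn>
      <Row index="0"><F>1</F><F>Forest</F></Row>
      <Row index="1"><F>2</F><F>Water</F></Row>
    </GDALRasterAttributeTable>
  </PAMRasterBand>
</PAMDataset>"""

# Assumed GDAL field type codes: 0 = integer, 1 = real, 2 = string.
CASTS = {"0": int, "1": float, "2": str}


def read_rat(xml_text):
    """Return the first RAT found in the XML as a list of row dicts."""
    root = ET.fromstring(xml_text)
    rat = root.find(".//GDALRasterAttributeTable")
    if rat is None:
        return []
    # Column names and converters, in the order the fields are defined.
    fields = [
        (f.findtext("Name"), CASTS.get(f.findtext("Type"), str))
        for f in rat.findall("FieldDefn")
    ]
    rows = []
    for row in rat.findall("Row"):
        cells = [f.text for f in row.findall("F")]
        rows.append(
            {
                name: (cast(cell) if cell not in (None, "") else None)
                for (name, cast), cell in zip(fields, cells)
            }
        )
    return rows


rows = read_rat(AUX_XML)
```

For a real file the text would come from something like `open("thefile.tif.aux.xml").read()`, and the resulting rows can be handed to `pandas.DataFrame(rows)` if a table is wanted.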

I could share documentation I sent to colleagues recently, but I can’t post that publicly without anonymizing it a bit. (There was a disconnect between the real RAT and a table of values in Excel; they had introduced spaces in the class values, I think. That is exactly like the very untidy Excel definition Deepak showed above, which is way messier than what I had to deal with and is clearly aimed at human eyeballs.)

Happy to hit up RATs with Python if that’s helpful. I’ll check if the dataset we looked at is public yet.

Michael_Sumner · May 2, 2025, 3:07am

Here’s the map; click on any pixel to see two levels of classification.

These groupings are used to classify the area of habitat (and its change over time).

In practice, the team had an Excel spreadsheet that was out of sync with the more formal RAT stored in XML as a sidecar (in GDAL form), and the GIS team had to reconcile those with how it was uploaded into the corporate software. It’s an interesting road bump for something that really is only a simple lookup table (so, imo, it is complicated far more than necessary by incompatible concepts and software, unfortunately).
