Blog post: Loading NetCDFs in TensorFlow

nbren12 · March 20, 2022, 8:37pm

I wrote a blog post comparing the performance of ML training pipelines that read NetCDF against ones that read TFRecords and some other file formats. It seems like it might be of interest to some here: Loading NetCDFs in TensorFlow | Noah Brenowitz

What kinds of data formats are others in this community using for training ML models?

rabernat · March 21, 2022, 12:37pm

Great post Noah! Thanks so much for sharing.

In our group we use Zarr pretty heavily for ML training datasets. I’d love to see Zarr added to the comparison. If you can align chunks with batches, I imagine it should go pretty fast.
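To make "align chunks with batches" concrete, here is a rough sketch in plain Python (no Zarr calls involved). It only counts which chunks each batch would have to read along the sample dimension. The helper name `chunks_touched` and the sizes 32 and 48 are made up for illustration, and it assumes batches are drawn sequentially rather than shuffled.

```python
def chunks_touched(batch_index, batch_size, chunk_size):
    """Return the indices of the chunks that one sequential batch reads from."""
    start = batch_index * batch_size      # first sample in the batch
    stop = start + batch_size             # one past the last sample
    first_chunk = start // chunk_size
    last_chunk = (stop - 1) // chunk_size
    return list(range(first_chunk, last_chunk + 1))


# Hypothetical sizes: batches of 32 samples, four batches in total.
aligned = [chunks_touched(i, batch_size=32, chunk_size=32) for i in range(4)]
misaligned = [chunks_touched(i, batch_size=32, chunk_size=48) for i in range(4)]

# Total number of chunk reads needed to serve the four batches.
aligned_reads = sum(len(c) for c in aligned)
misaligned_reads = sum(len(c) for c in misaligned)

print(aligned)      # each batch maps to exactly one chunk
print(misaligned)   # some batches straddle two chunks
```

When the chunk size equals the batch size, every batch is a single chunk read. With the mismatched sizes, some batches straddle a chunk boundary, so the same chunk gets fetched (and decompressed) more than once unless something caches it.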

DanRunfola · March 21, 2022, 2:25pm

Thank you for sharing this - very timely. We’ve just started to migrate our workflows over to NetCDF from an old GeoTiff model.

Zarr would be very interesting to see.
