I’m building out an internal API that will serve temperature data at various altitudes on a predetermined grid (x deg lat × y deg long × z km alt). I have data for this grid at a cadence of n minutes.
I’ve worked with this kind of data before — what would be an ideal format for storing it, such that I can use metadata to pull the exact partition containing a given latitude, longitude, altitude, and timestamp?
I considered GRIB2 files partitioned by timestamp and altitude, resulting in many small files named temp_timestamp_altitude.grib, with the API backend quickly querying AWS S3 for the right file. However, this results in (1440/n timestamps) × (y altitudes) files per day, which is quite clunky.
Would anyone here have recommendations for a cloud-optimized format that can back an API with sub-500 ms response times?
NOTE: eventually, we’ll extend the API to accept lat/long/alt values that do not sit exactly on the grid points, so we’ll find the closest and then interpolate values.
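To make the access pattern concrete, here’s a minimal sketch of that lookup-then-interpolate step. Everything in it is hypothetical — the grid spacing, the `temperature_at` helper, and the stand-in lapse-rate data are illustrations, not part of any real dataset:

```python
import numpy as np

# Hypothetical grid: 1 deg lat/lon, 1 km altitude levels from 0-20 km
lats = np.arange(-90.0, 91.0, 1.0)
lons = np.arange(-180.0, 180.0, 1.0)
alts = np.arange(0.0, 21.0, 1.0)  # km

# Stand-in temperature cube for a single timestamp, using a
# standard-atmosphere-style lapse rate: T = 288.15 - 6.5 * altitude
temps = np.broadcast_to(288.15 - 6.5 * alts[:, None, None],
                        (alts.size, lats.size, lons.size))

def nearest_index(grid, value):
    """Index of the grid point closest to the requested value."""
    return int(np.abs(grid - value).argmin())

def temperature_at(lat, lon, alt_km):
    """Snap to the nearest lat/lon column, then interpolate linearly in altitude."""
    i, j = nearest_index(lats, lat), nearest_index(lons, lon)
    column = temps[:, i, j]
    return float(np.interp(alt_km, alts, column))

# On-grid request hits a stored value directly (~223.15 K at 10 km)
print(temperature_at(42.0, -71.0, 10.0))
# Off-grid altitude is interpolated between the 10 km and 11 km levels
print(temperature_at(42.3, -70.6, 10.4))
```

Whatever storage format is chosen, the backend only needs to fetch the handful of grid points surrounding the requested coordinate — which is what makes chunk/partition layout the key design decision.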
I think the solution our community has converged on for optimal performance and flexibility is Zarr. Combined with XPublish’s EDR plugin, it gives you a great architecture for exactly this. In fact, @jhamman and Alex Kerney recently gave a presentation on this exact subject at the Pangeo showcase.
If you’re interested in a hosted solution, rather than something you have to build and manage yourself, our company Earthmover offers a subscription-based managed service built on this type of architecture.