Loading CMIP5 data in Python?

Hi folks. Is there a good way to access data from lots of different CMIP5 models? For example, I’d like to examine additional variables for the models shown at interactive-atlas.ipcc.ch.

To be more specific, I’m analyzing ocean temperature and salinity data in CMIP models. I just finished with CMIP6 models, using the extremely helpful approach on the “Accessing data in the cloud” page of the Pangeo / ESGF Cloud Data Working Group documentation. I read the “pangeo-cmip6.csv” catalog file (hosted on googleapis) into a dataframe, used the query command to find monthly sos and tos data, opened each result as an xarray dataset, and did my calculations in a loop. This let me calculate specific values from 30+ simulations without the tedious process of downloading data locally. To find the right models, I used Table Atlas.SM.2 in the IPCC AR6 report.
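
A minimal sketch of the CMIP6 workflow described above, for anyone who wants to see its shape. It is not my exact script: the catalog URL is the one I understand the Pangeo documentation to use, the `Omon` table and `historical` experiment are assumptions picked for illustration, and the function names `select_stores` and `iter_datasets` are hypothetical.

```python
import pandas as pd

# Assumed location of the Pangeo CMIP6 catalog (taken from the Pangeo docs).
CMIP6_CATALOG = "https://storage.googleapis.com/cmip6/pangeo-cmip6.csv"


def select_stores(catalog, variables=("sos", "tos"),
                  table_id="Omon", experiment_id="historical"):
    """Return the catalog rows matching the requested variables.

    table_id and experiment_id are illustrative defaults, not values
    taken from the original analysis.
    """
    variables = list(variables)
    return catalog.query(
        "table_id == @table_id"
        " and experiment_id == @experiment_id"
        " and variable_id in @variables"
    )


def iter_datasets(hits):
    """Lazily open each Zarr store listed in the `zstore` column."""
    # Imported here so the filtering step works without these packages.
    import gcsfs
    import xarray as xr

    fs = gcsfs.GCSFileSystem(token="anon")  # anonymous, read-only access
    for row in hits.itertuples():
        ds = xr.open_zarr(fs.get_mapper(row.zstore), consolidated=True)
        yield row.source_id, row.variable_id, ds
```

Usage would look something like `hits = select_stores(pd.read_csv(CMIP6_CATALOG))`, followed by a loop over `iter_datasets(hits)` that does the per-model calculation. Nothing is downloaded up front; xarray only reads the chunks that a calculation actually touches.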

I’d like to repeat these calculations on CMIP5 data. I see that there is a “pangeo-cmip5.csv” file, but it’s much more limited than the CMIP6 version. In particular, the variables are limited to ‘pr’, ‘psl’, ‘rlut’, ‘rsdt’, ‘rsut’, ‘siconc’, ‘tas’, ‘uas’, ‘vas’. I want the ocean variables sos and tos. Is this possible through Pangeo?

If not, is there another good way? Should I be using an ESGF search package such as ESGF Pyclient (see the ESGF Pyclient 0.3.0 documentation)? I know the exact simulations that I want to examine (from Table Atlas.SM.1 of the IPCC AR6 report), but I’m not entirely sure how to load the data remotely. Any tips?
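
To make the ESGF Pyclient option concrete, here is a rough sketch of what a search for monthly CMIP5 ocean files might look like, based on the calls shown in the Pyclient documentation (`SearchConnection`, `new_context`, `file_context`). The search node URL, the default `historical` experiment and `r1i1p1` ensemble member, and the helper names `cmip5_facets`, `is_variable_file`, and `find_opendap_urls` are all assumptions for illustration. I have not confirmed that this works for every model.

```python
# Assumed ESGF index node; any node that serves the search API should do.
ESGF_SEARCH_URL = "https://esgf-data.dkrz.de/esg-search"


def cmip5_facets(model, variable, experiment="historical", ensemble="r1i1p1"):
    """Build search facets for one CMIP5 simulation (illustrative defaults)."""
    return {
        "project": "CMIP5",
        "model": model,
        "experiment": experiment,
        "ensemble": ensemble,
        "time_frequency": "mon",
        "realm": "ocean",
        "variable": variable,
    }


def is_variable_file(filename, variable):
    """CMIP5 file names begin with the variable name and an underscore."""
    return filename.startswith(variable + "_")


def find_opendap_urls(facets, search_url=ESGF_SEARCH_URL):
    """Return OPeNDAP URLs for the files matching the facets."""
    # Imported here so the helpers above work without esgf-pyclient installed.
    from pyesgf.search import SearchConnection

    conn = SearchConnection(search_url, distrib=True)
    ctx = conn.new_context(**facets)
    urls = []
    for dataset in ctx.search():
        # A CMIP5 dataset can hold several variables, so filter by file name.
        for f in dataset.file_context().search():
            if f.opendap_url and is_variable_file(f.filename, facets["variable"]):
                urls.append(f.opendap_url)
    return urls
```

If the URLs come back, something like `xarray.open_mfdataset(urls)` might then open them remotely, though OPeNDAP access to CMIP5 may require ESGF credentials, which this sketch does not handle.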

Analyzing CMIP ensembles is a fairly common task, so I’m curious whether I’m overlooking an easy way to handle the data. Thanks!

:wave: Hi there! Pangeo newbie here…

Sorry for unearthing a year-old post but is there an answer to this question?

Is Pangeo/Zarr for CMIP6 data only?