How has the performance of climate models changed over 30 years of model development?

Scientific Motivation

While the first ocean-atmosphere coupled general circulation models date back to 1969, transient simulations attempting to reproduce the historical record and project future climate changes were not available until the late 1980s. These first-generation GCMs included only atmospheric and oceanic components, were run at nominal resolutions of 3°–10°, and omitted many important sub-grid-scale processes whose parameterizations are now commonplace. Since then, climate model development has continued, with pushes towards higher resolution, the inclusion of other components of the Earth System, and a flurry of new and/or improved parameterizations. While several studies have documented the improvements in model skill reaped by these model developments, there has been no comprehensive study of climate model skill that spans all the way from the first-generation models of the late 1980s to the state-of-the-art CMIP6 ensemble.

Below: change in climate model performance across CMIP1, CMIP2, and CMIP3 from Reichler and Kim 2008

Proposed Hacking

Overall Goal
Compute variable-specific and general model performance metrics (e.g. normalized area-weighted root-mean-square error, pattern correlation, area-weighted absolute bias, etc.) across several model generations, including CMIP6.
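As a rough illustration of the kind of metrics listed above, here is a minimal NumPy sketch on a regular latitude-longitude grid. The function names, the synthetic fields, and the choice to normalize the RMSE by the area-weighted standard deviation of the reference field are all assumptions made for this example; they are not taken from Reichler's skill index or from any existing package.

```python
import numpy as np


def area_weights(lat_deg, n_lon):
    """cos(latitude) weights on a (lat, lon) grid, normalized to sum to 1."""
    w = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones((1, n_lon))
    return w / w.sum()


def weighted_abs_bias(model, obs, w):
    """Area-weighted mean absolute difference between model and reference."""
    return float(np.sum(w * np.abs(model - obs)))


def weighted_rmse(model, obs, w):
    """Area-weighted root-mean-square error."""
    return float(np.sqrt(np.sum(w * (model - obs) ** 2)))


def normalized_rmse(model, obs, w):
    """RMSE divided by the area-weighted spatial std of the reference.

    The normalization is an assumption for this sketch; other choices
    (e.g. interannual variance of the observations) are equally plausible.
    """
    obs_mean = np.sum(w * obs)
    obs_std = np.sqrt(np.sum(w * (obs - obs_mean) ** 2))
    return weighted_rmse(model, obs, w) / float(obs_std)


def pattern_correlation(model, obs, w):
    """Centered, area-weighted spatial correlation of two fields."""
    m_anom = model - np.sum(w * model)
    o_anom = obs - np.sum(w * obs)
    cov = np.sum(w * m_anom * o_anom)
    return float(cov / np.sqrt(np.sum(w * m_anom ** 2) * np.sum(w * o_anom ** 2)))


# Synthetic 5-degree fields standing in for a reference and a model climatology.
lat = np.linspace(-87.5, 87.5, 36)
lon = np.linspace(0.0, 355.0, 72)
w = area_weights(lat, lon.size)
obs = (288.0
       - 40.0 * np.sin(np.deg2rad(lat))[:, None] ** 2
       + 2.0 * np.cos(np.deg2rad(lon))[None, :])
model = obs + 1.5  # hypothetical model with a uniform 1.5 K warm bias

print("abs bias:", weighted_abs_bias(model, obs, w))
print("rmse:", weighted_rmse(model, obs, w))
print("normalized rmse:", normalized_rmse(model, obs, w))
print("pattern correlation:", pattern_correlation(model, obs, w))
```

With a uniform offset, the bias and RMSE both equal the offset while the pattern correlation stays at 1, which is a handy sanity check before applying the same functions to regridded model output.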


Preliminary results
For some context on the old simulations and a preliminary result which compares model skill between the IPCC Second Assessment Report multi-model mean and the CMIP5 multi-model mean, see our recent pre-print at EarthArXiv (rejected but being revised for resubmission).

Below: example skill metric for temperature trends from IPCC Second Assessment Report ensemble and CMIP5 multi-model mean

Anticipated Data Needs

Monthly-mean values of a number of common variables for CMIP6 historical (1850–2014) simulations, ~1000 years of control simulations, and 1% per year CO2 runs (if available).

Variables of interest (based on the model skill metric in Reichler and Kim 2008):

  • sea level pressure
  • air temperature
  • 2-m air temperature
  • zonal and meridional wind
  • precipitation
  • specific and/or relative humidity
  • snow fraction
  • sea ice fraction

Ideally I would have access to ERA5 for all of these variables as well – does this exist somewhere on the PANGEO cloud?

The early generations of climate models (pre-CMIP) had such coarse grids that I only need a few GB somewhere to dump the data for comparing them against CMIP6!

Anticipated Software Tools

Hopefully we would build off of and contribute to existing packages (like esmtools and xskillscore) that leverage xarray and already feature tools for handling model ensembles and computing model performance metrics.

Desired Collaborators

Anyone! In particular, experience with the relevant software tools, model evaluation in general, handling large CMIP ensembles, and/or reanalysis products like ERA5 would be helpful.



I would be keen to get involved with this! Have you already requested the data for CMIP5/CMIP6 for the variables of interest listed above?


Hi Lettie,

Excellent! I will submit the requests for CMIP5/CMIP6 by tomorrow night – thank you for the reminder. I have copies of the output from the first three IPCC Reports (pre-CMIP) locally and will transfer them to Cheyenne / PANGEO once I get an account, I suppose? I’ll start inquiring about CMIP1–CMIP4 (does anyone have any leads?).

Update: I’ve requested the following variables for the 1pctCO2, historical, and piControl experiments: tas (2-m air temperature), ta (air temperature), ua (zonal wind), va (meridional wind), psl (sea level pressure), hus (specific humidity), pr (precipitation) from Amon (atmospheric monthly) and siconc (sea ice concentration / area fraction) from SImon (sea ice monthly). Are those the correct variables?
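For bookkeeping, the request above can be written down as a small table of search facets. This is only a sketch: `EXPERIMENTS`, `VARIABLES_BY_TABLE`, and `build_queries` are hypothetical names for this example. The keys `experiment_id`, `table_id`, and `variable_id` are the standard CMIP6 facet names, and a catalog tool that searches on those facets could presumably take each dictionary as keyword arguments, though that step is not shown or tested here.

```python
from itertools import product

# Experiments and variables exactly as listed in the request above.
EXPERIMENTS = ["historical", "piControl", "1pctCO2"]
VARIABLES_BY_TABLE = {
    "Amon": ["tas", "ta", "ua", "va", "psl", "hus", "pr"],  # atmospheric monthly
    "SImon": ["siconc"],                                     # sea ice monthly
}


def build_queries(experiments, variables_by_table):
    """Return one facet dictionary per (experiment, table) pair."""
    return [
        {"experiment_id": exp, "table_id": table, "variable_id": list(variables)}
        for exp, (table, variables) in product(experiments, variables_by_table.items())
    ]


queries = build_queries(EXPERIMENTS, VARIABLES_BY_TABLE)
for query in queries:
    print(query)
```

Keeping the request in one place like this makes it easy to compare against the variable list in the proposal, which also mentions snow fraction and relative humidity.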

I suppose we also need these variables from a reanalysis data set like ERA5 or equivalent as a reference for the skill metrics… I’ll look into that as well unless anyone knows where this is on Cheyenne / Pangeo.

Henri, a good chunk of ERA5 appears to be available on GLADE or HPSS, according to the Research Data Archive. If for some reason that isn’t viable, I have a cloud-optimized version of a pretty large subset on GCP, but only the 10 most recent years. I might be able to provide a few of those years.

Great! I don’t think there is a CMIP4 - I think they skipped it to match up the numbers with the Assessment Reports. I might be wrong though.

CMIP3 at least is online through the ESGF website. I’ve downloaded a couple of individual CMIP3 files before, but only by clicking on the direct download links. Hopefully someone else knows more.



As @darothen pointed out, the ERA5 data set is available on Cheyenne, and it resides in /glade/collections/rda/data/ds633.0/

$ ls /glade/collections/rda/data/ds633.0/
e5.oper.fc.sfc.instan    e5.oper.fc.sfc.minmax    index.html
e5.oper.fc.sfc.accumu    e5.oper.fc.sfc.meanflux  e5.oper.invariant

Thank you for the tips @darothen, @lettie-roach, @andersy005! I’ll look into the Reanalysis and CMIP3 data soon.

@matt-long, it seems that I lost edit permissions to this post / thread. Is there a way I can get edit access back so that I can update my project proposal?