This topic is for discussing the CMIP6 collection and the various resources for identifying and requesting data for your projects.
Hi all, I am Naomi Henderson (research scientist, LDEO) and I have been putting together our zarr CMIP6 bucket in Google Cloud Storage (GCS). I will get this started by giving a few general tips which should apply to both the NCAR/GLADE and the LDEO/GCS collections.
1. Finding the data you need:
- Identify your project and decide which variables and experiments you will need
- Check availability at: ESGF
- CMIP6 Data for beginners: Google Doc
- CMIP6 `table_id` is a code for specifying realm and frequency, see: HERE
- CMIP6 experiments (`experiment_id`):
  - List of: Tier 1 Experiments
  - List of: Tier 2 Experiments
  - List of: Tier 3 Experiments
  - List of: Tier 4 Experiments
- CMIP6 models (`source_id`):
  - List of: Models
- CMIP6 variables (`variable_id`):
  - List of: Variables
- Check the CMIP ERRATA page
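To make the realm-and-frequency idea behind `table_id` concrete, here is a minimal Python sketch. The lookup table is a small illustrative subset that I typed in by hand, and the helper names (`describe_table_id`, `table_ids_for`) are hypothetical; the linked `table_id` reference remains the place to check the full list.

```python
# Illustrative subset only (hand-typed assumption, not the full CMIP6 list).
# Each table_id maps to a (realm, frequency) pair.
TABLE_ID_EXAMPLES = {
    "Amon": ("atmos", "mon"),
    "Omon": ("ocean", "mon"),
    "Lmon": ("land", "mon"),
    "SImon": ("seaIce", "mon"),
    "Oday": ("ocean", "day"),
}


def describe_table_id(table_id):
    """Return (realm, frequency) for a table_id in the subset, else None."""
    return TABLE_ID_EXAMPLES.get(table_id)


def table_ids_for(realm, frequency):
    """Return the sorted table_ids in the subset matching realm and frequency."""
    return sorted(
        name
        for name, pair in TABLE_ID_EXAMPLES.items()
        if pair == (realm, frequency)
    )


if __name__ == "__main__":
    print(describe_table_id("Amon"))
    print(table_ids_for("ocean", "mon"))
```

So if you want monthly ocean variables, for example, `Omon` is the `table_id` to look for.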
2. Make a request: HERE
- A data request can be made with a minimum of 3 keywords: `experiment_id`, `table_id`, and `variable_id`.
- The Google Form which we have put together allows you to specify first a `table_id` and then select multiple `experiment_id`s and provide a list of `variable_id`s.
- Normally, we would assume you would like all available models, but if you would like specific models, you can select `source_id`s from a list.
- Please add comments and questions if the simple form does not quite fit your requirements.
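If it helps to think about a request as data, the keywords above can be sketched in Python as follows. This is only an illustration of how the keywords combine, not how the Google Form or the collection scripts actually work: the `DataRequest` class, the `select` helper, and the tiny `EXAMPLE_CATALOG` are all hypothetical and made up for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DataRequest:
    """Hypothetical container for the request keywords described above."""

    table_id: str
    experiment_ids: List[str]
    variable_ids: List[str]
    source_ids: Optional[List[str]] = None  # None means "all available models"

    def validate(self) -> None:
        # The three required keywords must all be present.
        if not self.table_id:
            raise ValueError("table_id is required")
        if not self.experiment_ids:
            raise ValueError("at least one experiment_id is required")
        if not self.variable_ids:
            raise ValueError("at least one variable_id is required")

    def matches(self, dataset: Dict[str, str]) -> bool:
        # A dataset matches when every keyword in the request agrees with it.
        return (
            dataset["table_id"] == self.table_id
            and dataset["experiment_id"] in self.experiment_ids
            and dataset["variable_id"] in self.variable_ids
            and (self.source_ids is None or dataset["source_id"] in self.source_ids)
        )


def select(request: DataRequest, catalog: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the catalog entries that satisfy a validated request."""
    request.validate()
    return [entry for entry in catalog if request.matches(entry)]


# Made-up catalog rows for illustration; not the contents of any real bucket.
EXAMPLE_CATALOG = [
    {"source_id": "CESM2", "experiment_id": "historical", "table_id": "Amon", "variable_id": "tas"},
    {"source_id": "CESM2", "experiment_id": "ssp585", "table_id": "Amon", "variable_id": "pr"},
    {"source_id": "GFDL-CM4", "experiment_id": "historical", "table_id": "Amon", "variable_id": "tas"},
    {"source_id": "GFDL-CM4", "experiment_id": "historical", "table_id": "Omon", "variable_id": "thetao"},
]

if __name__ == "__main__":
    request = DataRequest("Amon", ["historical", "ssp585"], ["tas", "pr"])
    for entry in select(request, EXAMPLE_CATALOG):
        print(entry)
```

Leaving `source_ids` unset mirrors the default of asking for all available models; passing a list such as `["CESM2"]` narrows the selection to specific models.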
Thanks @naomi for this very useful post!
Thanks, Naomi, for the post and for collecting the CMIP6 data. I am wondering if it is possible to get observed/reanalysis data for the hackathon. ERA5 would be great as reanalysis data. I am also looking for CMIP5 data to use in the hackathon; however, I wasn’t able to request the CMIP5 data.
When I make a data request, should I include all the variables I want, or just the ones not listed in the “Preliminary contents of our GCS (gs://pangeo-cmip6) bucket” section of the Google Doc?
Hi @cspencerjones , it is probably better to list all the variables you want - so that the scripts know to automatically update as more datasets become available at ESGF. I am not sure if this is true for the NCAR/GLADE folks, but it is for the LDEO GCS collection. Thanks for the question!
Thanks all for the valuable resource. Once we’ve made a request, is there any way of knowing if it is being actioned, or being notified when it is?
Hi @AndMei , normally it has been taking just a day or two for requests to be filled, so that feature was not so important, perhaps! The LLNL downloads were not happening for a few days, and now the internet seems very slow (gamers? netflix?). But I should have gotten back to you sooner, sorry. I will email you separately with an update.
It is a great idea to build in automatic updates, though. Would you be willing to open an issue here https://github.com/naomi-henderson/cmip6collect?