We just announced that the CESM Large Ensemble Project is available on AWS as part of the Public Datasets Program. See below for a quick blog post describing the effort to publish this dataset and reproduce some figures from the Kay et al 2015 paper:
I wanted to share this because I thought it would be another useful demonstration of the pangeo tools/workflow in action. We set out to reproduce a few figures from the original BAMS paper and it worked quite well. Here’s a recreation of figure 2:
The example includes:
- Accessing an Intake catalog stored on GitHub
- Deploying a dask cluster on the cloud
- Calculating the area-weighted mean to produce time series for all 40 ensemble members
- Calculating the linear trend in winter temperatures (using
- Using matplotlib to produce figures that closely resemble the original figures in the BAMS paper
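The two calculation steps in the list above can be sketched roughly as follows. This is not the notebook's code: it uses plain NumPy on synthetic data, assumes a regular latitude-longitude grid where cos(latitude) is an acceptable stand-in for grid-cell area, and the names `area_weighted_mean`, `linear_trend`, and `true_trend` are made up for illustration.

```python
import numpy as np


def area_weighted_mean(field, lat_deg):
    """Average over the last two axes (lat, lon) using cos(latitude) weights.

    Assumes a regular lat-lon grid, so every cell on a latitude row has the
    same area and the row weight is proportional to cos(latitude).
    """
    weights = np.cos(np.deg2rad(lat_deg))      # shape (nlat,)
    zonal_mean = field.mean(axis=-1)           # shape (..., nlat)
    return (zonal_mean * weights).sum(axis=-1) / weights.sum()


def linear_trend(series, time):
    """Least-squares slope along the last axis, one value per leading index."""
    t = time - time.mean()
    anomalies = series - series.mean(axis=-1, keepdims=True)
    return anomalies @ t / (t @ t)


# Synthetic stand-in for the ensemble: 40 members, 30 years, coarse grid.
rng = np.random.default_rng(0)
n_members, n_years, nlat, nlon = 40, 30, 19, 36
lat = np.linspace(-90.0, 90.0, nlat)
years = np.arange(1980, 1980 + n_years, dtype=float)

true_trend = 0.02  # hypothetical warming rate per year, for the synthetic data only
signal = true_trend * (years - years[0])
noise = 0.1 * rng.standard_normal((n_members, n_years, nlat, nlon))
field = signal[None, :, None, None] + noise

global_mean = area_weighted_mean(field, lat)   # one time series per member
trends = linear_trend(global_mean, years)      # one slope per member
print(global_mean.shape, trends.shape)
```

With real output, the same two reductions would normally be written with xarray on dask-backed arrays so they run lazily across the cluster; the NumPy version only shows the arithmetic.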
This example is also available as a binder, so you can try it yourself. Note that we only reproduced two of the figures, so if you’re interested in trying your hand at additional analysis, PRs are welcome.
Thanks to @andersy005 and @kmpaul and the rest of the NCAR science at scale team for their work in pulling this together.
Joe, what hub would you envision the hackathon participants using to access the LENS data on AWS? ocean.pangeo.io is in Google Cloud. I guess we are not concerned with egress fees, because the data are public.
Right, we can pull out of AWS public datasets for free. We have two JupyterHubs in AWS us-west-2 that can be used for more proximate computing solutions. Those are:
Members of the pangeo-data GitHub organization should have access to these hubs. If anyone is not a member of the org and wants to access these hubs, let me know and I’ll get you set up.
Presumably some participants will want to combine data from CMIP6 (in Google Cloud) with LENS (AWS), so some cross-cloud traffic is inevitable.
Hi, Joe. I am not a member of the pangeo-data GitHub organization for ICESat-2, and I am from the Chinese Academy of Sciences.
Recently I have been following the ICESat-2 hack week to learn how to process ICESat-2 data, with a focus on polar ice sheets. However, I cannot get the data from AWS S3. When I ran
aws s3 ls s3://pangeo-data-upload-oregon/icesat2/data-access-outputs/
I got this error:
An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
Could you help me resolve this error?
Thanks very much!
Hi @whigg. I am not familiar with the ICESat-2 data on S3. Perhaps a new topic is in order, since this one is about the CESM LENS dataset.
Hi, Joe. I want to access the icesat2.pangeo.io hub. Could you help me?
Thank you very much for your kindness!
@whigg, the icesat2 cluster has been renamed to https://aws-uswest2.pangeo.io/. For further assistance, please open a new topic as this one is meant to be about the CESM Large Ensemble Analysis notebooks/binder.