A pangeo education working group?

From @rabernat on Tue Mar 19 2019 15:10:55 GMT+0000 (UTC)

It has come up at many recent weekly checkin meetings that we need to be putting more effort into education and outreach. The tools we are building have been moving very fast; in order to have the biggest impact, we need to find a more coordinated way to bring the broader community along.

Perhaps we need an education working group?

@kmpaul - any thoughts on how this might align with NCAR’s ongoing efforts, the NCL transition, etc.?

Any volunteers to lead such a working group?

Copied from original issue: https://github.com/pangeo-data/pangeo/issues/575

From @rabernat on Tue Mar 19 2019 15:12:45 GMT+0000 (UTC)

Was discussed in https://github.com/pangeo-data/pangeo/issues/518, where @ktyle and @brian-rose also expressed interest.

From @rabernat on Tue Mar 19 2019 15:15:43 GMT+0000 (UTC)

Also discussed extensively in https://github.com/pangeo-data/pangeo/issues/411 (this issue is basically a duplicate of that), where @DamienIrving, @robfatland, @amanda-tan, @marylhaley and @venitahagerty all expressed enthusiasm.

What is needed is a way to focus and coordinate the disparate efforts going on here.

From @kmpaul on Tue Mar 19 2019 15:28:36 GMT+0000 (UTC)

This is great! I’m entirely in favor of an education working group.

At NCAR, we have a few irons in the fire, but they definitely need more work. We are developing a larger tutorial (like ~2 days) for later this summer that will be targeted primarily to NCAR/Cheyenne users. And we have small tutorials that we are putting together for smaller outreach efforts to research groups here at NCAR that are (primarily) concerned about what happens in light of the NCL Pivot to Python. These tutorials are looking mostly like short sessions designed to answer simple questions like “How do I do X in Python, when I used to do it easily with NCL?”

All of this has been ad hoc up to this point, but I think it needs to build out. I would very much like to work with people on this. In addition to the people already mentioned, @lheagy had some excellent ideas.

From @rabernat on Tue Mar 19 2019 15:31:35 GMT+0000 (UTC)

Thanks for the reply Kevin. It’s clear there are lots of good pangeo-related educational materials out there, and I won’t try to enumerate them all right now. What I think is needed is someone to make an effort to catalog these for our website, keep them up to date, and identify what new materials need to be generated. A particular gap is related to what I’ll call “cloud-native” data analysis. There are lots of guides about how to use xarray, metpy, etc… But there is no guide that explains how to use a pangeo cloud-based jupyterhub, interact with cloud storage, scale up dask clusters, etc.

From @kmpaul on Tue Mar 19 2019 15:35:35 GMT+0000 (UTC)

Ah! Yes. That’s a very good point. Ana Privette at AWS has expressed an interest in developing a “Pangeo on AWS” tutorial that might be a perfect platform for exactly that.

From @kmpaul on Tue Mar 19 2019 15:38:56 GMT+0000 (UTC)

…Should this start by creating an “Educational Materials” section on pangeo.io? Or should this content go into the existing “Guide for Scientists” section?

It would be very cool to binderize some of our existing tutorials.

From @jmunroe on Tue Mar 19 2019 16:02:59 GMT+0000 (UTC)

I am currently preparing to deliver a day-long set of training sessions on Pangeo (as part of C3DIS, Canberra, May 2019). The anticipated plan is to use AWS for the deployment. I will try to incorporate some cloud-native material with this issue in mind.

From @rabernat on Tue Mar 19 2019 16:06:06 GMT+0000 (UTC)

FWIW, I continuously receive requests for training in “pangeo.” If we were able to offer some sort of software-carpentry-style workshop that was just ready to go, even if people had to pay to support the costs, I think there would be lots of interest from institutions.

Currently none of our funding sources has a budget that permits us to offer such training on demand.

From @amanda-tan on Tue Mar 19 2019 16:12:04 GMT+0000 (UTC)

I think as part of the Pangeo ACCESS proposal, we had talks about developing SC-style building blocks for on-boarding a more general audience. It might behoove us to work together with the eScience crowd in developing some of these tutorials in conjunction with the hackweeks. +@jhamman

@kmpaul I would be interested in discussing putting in a proposal to the next round of the NSF CyberTraining RFP (due in Jan. 2020); it’s never too early to start.

From @kmpaul on Tue Mar 19 2019 16:15:49 GMT+0000 (UTC)

@amanda-tan Yes! I’d love that! We started thinking about a CyberTraining proposal this year, but the timeframe was too tight. I think that we all (@brian-rose, @ktyle, @lheagy, @jhamman) agreed we’d like to pick that up for the next round.

From @lheagy on Tue Mar 19 2019 16:22:09 GMT+0000 (UTC)

Would creating a github repo with just a readme where we can collect a list of pangeo educational material be a useful starting point? I think to @rabernat’s comment, first just getting a picture of what material has already been generated would provide some clarity on where to go next.

From @robfatland on Tue Mar 19 2019 16:44:17 GMT+0000 (UTC)

Looks great, +1. I’m really interested in three related questions:

  • Change potential: How many humans could make use of the Python geoscience stack (who don’t)?
  • What are the barriers to this happening? i.e. are there addressable barriers?
  • How do we devote effort to EO to lower whatever barriers those are?

As a colleague of mine points out: It is not necessarily ‘learning pangeo’ that is the central challenge. My recent struggles with getting the hang of xarray incline me towards a path from Python to numpy to pandas to <useful ancillary packages> to xarray to dask to <building my own package>; all just in order to be ready to make productive use of pangeo.

I also can’t agree enough with learning it ‘well’ (e.g. the SC ideas) so as to have the right framework for the inevitable developments to come.

I’ll mention some notebook repos that would be candidates for @lheagy’s idea. These are mostly oceanography with a bit of glaciology over in one corner.

From @dopplershift on Tue Mar 19 2019 18:20:50 GMT+0000 (UTC)

We at Unidata are happy to help out where we can. We’ve been teaching our python workshop with regularity. It features a few pangeo components (though not dask yet), but is largely focused on meteorology.

From @brian-rose on Wed Mar 20 2019 01:54:46 GMT+0000 (UTC)

+1 for some community organization around educational materials.

I have a little bit of xarray and metpy stuff integrated into my climate modeling lecture notes but would like to dig deeper.

The discussion in #518 led to some great ideas for an NSF CyberTraining proposal. I am definitely game to pick this back up for the January 2020 call.

From @jwagemann on Wed Mar 20 2019 12:53:35 GMT+0000 (UTC)

I am just about to start doing some tests with Copernicus open data from ECMWF, but I am happy to contribute to training material if needed.

From @kmpaul on Wed Mar 20 2019 13:44:39 GMT+0000 (UTC)

This is actually a really exciting response from everyone! Thanks, @rabernat, for starting this thread.

There have been so many great replies with excellent material, I went ahead and implemented @lheagy’s suggestion and created pangeo-data/education-material as a starting point. I culled the list above and added links and descriptions to the material already provided in this thread.

I think we need to identify the topics that we want for education material and organize this material into those topics, so we can see where we are light and/or heavy on material.

From @mrocklin on Wed Mar 20 2019 16:15:48 GMT+0000 (UTC)

Putting on my for-profit hat, trainings are also something that companies frequently request. I would not be surprised if a company like Anaconda (cc @jbednar) or Quansight (cc @scopatz) would be interested in getting involved.

From @jbednar on Sat Mar 23 2019 20:38:25 GMT+0000 (UTC)

We at Anaconda have created the earthml.pyviz.org site, which we’ve given as a day-long tutorial at NASA Goddard as part of our project with them. It focuses mainly on viz tools and on preparing data for ML tools, and doesn’t cover JupyterHub or distributed computation. We’d be happy to prepare and maintain additional publicly available training materials as part of our Pangeo or NASA collaborations, but as a company Anaconda is not currently in the business of selling general Python training or domain-specific training. We’d be happy to help advise Quansight or anyone else if they want to train using our materials.

From @daxsoule on Fri Mar 29 2019 11:46:51 GMT+0000 (UTC)

+1 for educational materials. I am building my third “generation” of research students at Queens College, and each year we do a little better at curating the materials and identifying a pathway that helps them go from zero to research. I think this will be very helpful.