Setting up a Pangeo hub for my university? (Teaching)

Dear Pangeo folks,
In January 2021, I’ll be teaching a class on big data science for oceanographers for the 3rd consecutive year.
We usually have between 20 and 30 students. The class is based on xarray/dask, and I used to use the ocean.pangeo.io hub both for demonstrations and to let students develop their projects.
It was a great way to let students use a Dask cluster in a few clicks.
Given the more restrictive Pangeo hub access policy, I understand that I now need to find another way to provide this online solution to students.

I do have some funding from our university (UBO/IUEM, Brest, France), so I’d like to ask what the best way would be to set up a Pangeo hub for this class.
And if I need to hire an IT engineer to add a new hub, would the Cloud deployments working group be able to provide some support?
Thanks for your help!
Guillaume

In general I think that making it easy for professors to teach Dask-powered courses is something that we should focus on. This is like the general problem of setting up research hubs, but probably a bit more repeatable. Fortunately I think that there are a few nascent efforts that would be interested in helping here.

<puts on for-profit hat>
I would like to see Coiled.io be helpful in this context. If you’re available for a conversation some time this week I would like to learn more about your needs.
</takes off for-profit hat>

You should also take a look at 2i2c.org; they’re well aligned here (cc @choldgraf). Maybe the folks at Saturn would be interested too.

In general my guess is that, with all of the for-profit and non-profit groups getting into this space, it will soon be easy to set up Pangeo hub-like things. That is my hope anyway.

Undercutting Matt / Coiled slightly (:smile:), I’m working to upstream the pangeo Helm chart to Dask, and will be improving it to make it easier to use in the process. That work is at https://github.com/dask/helm-chart/issues/68.

That said, my work would just make it easier to set up JupyterHub and Dask. You’d still be on the hook to maintain things. Paying money to make that problem go away is perfectly reasonable.
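
For anyone wondering what the student-facing side looks like once such a hub is up, here is a minimal sketch. It assumes the deployment exposes Dask through Dask Gateway, which is an assumption on my part and not something confirmed in this thread. The function name `open_class_cluster` and the default of 4 workers are made up for illustration.

```python
def open_class_cluster(n_workers=4):
    """Start a Dask cluster from a notebook on the hub and return (cluster, client).

    Assumes the hub is preconfigured for Dask Gateway, so Gateway() needs no
    address or credentials. The import lives inside the function so this file
    can still be loaded on a machine without dask-gateway installed.
    """
    from dask_gateway import Gateway

    gateway = Gateway()              # uses the address/auth configured on the hub
    cluster = gateway.new_cluster()  # ask the gateway for a new scheduler
    cluster.scale(n_workers)         # request a fixed number of workers
    client = cluster.get_client()    # dask.distributed Client bound to this cluster
    return cluster, client


# Typical notebook usage (needs a running hub, so it is left commented out):
# cluster, client = open_class_cluster(4)
# ... xarray / dask computations ...
# cluster.shutdown()  # release the workers when done
```

Someone still has to keep the gateway, images, and quotas behind those few lines running, which is the maintenance burden mentioned above.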

Hey all - thanks for the ping.

I wanna quickly describe what Matt mentioned above:

2i2c is a (quite young) non-profit aimed at making it easy for researchers / educators to get help with pangeo-like deployments in the cloud (e.g. some combination of JupyterHub/dask+xarray+zarr stack/domain-specific customizations and data, etc). We are still in the very early stages of the organization, but perhaps we can discuss what kind of needs you’ve got and figure out if there’s a way to support you either now or in the near future.

I should also cc @rabernat and @yuvipanda so they know about this, as I will soon be entering paternity mode!

Hey all, and thanks for your answers!

I have funds to hire someone for 1 month; that timeframe sounds reasonable to me for setting this up for a 30-person class!

But I would prefer a simpler solution, such as:

  • 2i2c, although it does seem too young to help us right now,

  • Coiled.io, which looks great. @mrocklin, are you still up for a conversation?

It would be awesome to have support from 2i2c over the long term.
Our needs won’t change anytime soon:

  • something similar to what used to be available at ocean.pangeo.io, with access limited to our students (jupyterhub / dask / xarray)
  • can run on GCP, no need for on-premises (yet)
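
To make the "access limited to our students" requirement concrete, here is a rough sketch of a `jupyterhub_config.py` fragment that restricts logins to a class roster. It assumes a JupyterHub version where the setting is called `allowed_users` (older releases used a different name), and the roster contents, the `load_roster` helper, and the `instructor` admin name are all hypothetical.

```python
# jupyterhub_config.py (sketch): only let a fixed class roster log in.

def load_roster(text):
    """Return the set of usernames listed one per line.

    Blank lines and anything after a '#' are ignored; names are lowercased.
    """
    users = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip().lower()
        if name:
            users.add(name)
    return users


# Hypothetical roster; in practice this could be read from a file.
ROSTER = """
# big data for oceanographers, January 2021
student01
student02
"""

allowed = load_roster(ROSTER)

# JupyterHub injects get_config() when it loads this file. Outside JupyterHub
# the name does not exist, so the settings below are simply skipped.
if "get_config" in globals():
    c = get_config()
    c.Authenticator.allowed_users = allowed          # only these users may log in
    c.Authenticator.admin_users = {"instructor"}     # hypothetical admin account
```

How this file reaches the hub depends on the deployment (for a Helm-based setup it would typically go through the chart's values), so treat it as an illustration of the idea and not a drop-in config.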

Sure. Send me an e-mail at mrocklin@coiled.io and we’ll set up a time to chat.

Related to this thread – would appreciate any guidance on marrying pangeo-docker-images with stock jupyterhubs:

Just shot you an email about this.