Migration of ocean.pangeo.io User Accounts

Since 2018 we have been operating cloud-based JupyterHubs and BinderHubs for the community. These services were originally just experimental prototypes, designed to push forward the technology for cloud-native big-data geoscience. However, they rapidly gained significant numbers of users for day-to-day actual science. The most widely used example is ocean.pangeo.io, which runs on Google Cloud and has about 50 active daily users. These services have been funded by a direct grant of $100K worth of GCP credits from Google, via NSF’s BIGDATA program. Those funds have now run out, and we need to transition to a new mode of operation. Beyond the raw cost of the compute, there is of course also major human effort required to keep these services operating. Unfortunately, we cannot continue to provide free cloud computing to the world without funding.

Short Term Plan

Some ocean.pangeo.io users will be able to continue to use the service going forward, thanks to support from the Moore Foundation. However, eligibility will depend on how well each user's research aligns with the aims and scope of that funding.

Important: All users who wish to keep using ocean.pangeo.io must fill out this form by Monday, June 22:

ocean.pangeo.io will go down for maintenance on Monday, June 22 and will hopefully come back up later that week with new. Approved users will have their accounts and home directories migrated. Usernames will move from ORCID to GitHub user IDs. Current users who don’t fill out the form will have their accounts deleted.

All users should back up the data in their home directories. Here is an example of how to find all your notebooks and bundle them into a compressed tar archive for download.

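# list every notebook, skipping Jupyter's autosaved checkpoint copies
find . -type f -name '*.ipynb' -print | grep -v '\.ipynb_checkpoints' > list_of_notebooks
# bundle the listed files into a tar archive, then compress it
tar -cvf backup.tar -T list_of_notebooks
gzip backup.tar
# download the resulting backup.tar.gz using the JupyterLab file browser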
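
For home directories where notebook names contain spaces, the commands above can be folded into a single step. The following is only a sketch, not part of the original instructions: the `backup_notebooks` helper name is hypothetical, and it assumes GNU tar (for `--null` and `-T -`) and a `find` that supports `-print0`.

```shell
# Hypothetical helper (not from the original post); assumes GNU tar and find -print0.
backup_notebooks() {
  src_dir="${1:-.}"               # directory to search, defaults to the current one
  archive="${2:-backup.tar.gz}"   # name of the compressed archive to create
  # -print0 and --null keep file names containing spaces or newlines intact;
  # the -path test skips Jupyter's autosaved copies under .ipynb_checkpoints.
  find "$src_dir" -type f -name '*.ipynb' ! -path '*/.ipynb_checkpoints/*' -print0 \
    | tar --null -czf "$archive" -T -
}

# Example usage: back up the current directory, then list the archive to check it.
# backup_notebooks . backup.tar.gz
# tar -tzf backup.tar.gz
```

Listing the archive with `tar -tzf` before downloading it is a cheap way to confirm that the expected notebooks were captured.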

Other GCP hubs (hub.pangeo.io and hydro.pangeo.io) will be permanently shut down, and all user home directories deleted.

Long Term Plan

We firmly believe that Pangeo-style cloud computing has the potential to transform scientific research, making it more reproducible, transparent, and efficient. We are working hard at finding a sustainable long-term funding model that will permit all scientists to use the cloud in this way. This involves conversations with funders, cloud providers, and universities. To users inconvenienced by this transition, we are deeply sorry. Please stick with us as we prepare for the next phase. We welcome ideas and suggestions from the community on how best to move forward. Feel free to reply below with your thoughts or send me a DM.

In addition to filling out the form above, please continue to report any publications that derive from Pangeo resources here:

I never used a personalised Pangeo account, only the login-free pangeo-notebooks that clean themselves up after use. Working on this would probably have been easier with a home account on ocean.pangeo.io.
Will this still be possible in the future?

Hopefully. We need sustained funding to keep these services running permanently. Definitely fill out the form and describe what you want to do with ocean.pangeo.io going forward.

Thanks for the update, Ryan. This is tough timing, we just introduced our summer interns (mostly BIPOC women) to Pangeo as a platform for doing research remotely and collaboratively. I’ll make sure any who need Pangeo to complete their summer project fill out the migration form and will start looking to see if we can get some funding to keep them working on Pangeo. But I wonder if the short duration that they will continue to need Pangeo can be taken into consideration (only ~6 weeks after 6/22).

Hi Marion! We would love to do everything we can to support your internship program. Can you link to a little more info about what you are doing (website, proposal)?

Using our infrastructure to support research among communities of color actually emerged as a top priority from our discussion last week, see this post:

We are not shutting down ocean.pangeo.io. The main point is that, going forward, we will need to be a bit more deliberate about how we direct our resources. Currently, we just run a completely open system which anyone can use for any purpose. That’s obviously not sustainable for the long term. We also need to be more conscious about collecting metrics and documenting outcomes, in order to justify future funding.

Please have the interns fill out the form and mention the internship program in their project description. We will make sure that their accounts are migrated appropriately.

If you want to discuss this in more detail, feel free to send me a DM or email.

Thanks for your work!

Part of the challenge here is that, even if you got funding, it would be hard to plug in to ocean.pangeo.io. We really don’t have a mechanism to “bill” different funding sources for different users. This is a technical capability we need to develop asap, but it’s complicated (severely!) by the way university accounting systems work. We are just not set up in our current form to provide services for a fee.

Thanks, Ryan. We appreciate your support on this one. The global pandemic forced us to completely rethink how we would support our interns from a distance and Pangeo was our solution, so it’s really heartening that you’re willing to help us keep trying to make this work.

Here are the specific projects that have been re-designed to utilize Pangeo (note these were the original proposals):
https://cimes.princeton.edu/job-opportunities/intern-program/modeling-mechanisms-coral-thermal-refugia
https://cimes.princeton.edu/job-opportunities/intern-program/validating-tropical-pacific-circulation-gfdl-ocean-models

There is more information about the program on that site as well. I’ll make sure the interns specifically mention the program in the form. And I appreciate you addressing the issue of funding in your other comment.

In terms of documenting outcomes, our interns will give (virtual) institutional seminars at the end of the project discussing their work. I’m happy to report those to the Pangeo Publication Report if you think that is appropriate.

Hi Ryan, The CIMES internship program is described at https://cimes.princeton.edu/education-outreach/intern-program. This is the 5th year, and the goal is to broaden participation in science. This year we planned to have 6 interns, but 3 had to be postponed to next year because the projects could not be done remotely without GFDL computing access (during the COVID shutdown interns could not get the necessary security clearance to have GFDL accounts). The remaining 3 projects are being done remotely, and for 2 of them, pangeo looked like the ideal solution. Each intern will give a presentation at the end of the summer, and write a progress report. The funding for the internship program is from NOAA, via GFDL and CIMES (Cooperative Institute for Modeling the Earth System). Let me know if you need any other information.
Sonya

My role is the lead of the internship committee, Associate director of CIMES, and also co-host with Marion of one of the interns.

Thanks @rabernat. As a mentor of one of our summer interns, I can say that the platform was/is critical to us providing a good experience for the group given the complications introduced by COVID-19.

Thanks for all of the great work you do on this service!

Thanks Ryan for your support on this! I am one of the mentors of the two students mentioned by @MarionAlberty and @sonyalegg. Our project had to shift quite a bit due to the lack of GFDL computing access, so we are instead using the CMIP6 archive that your team has put on the cloud. If mentors (1 mentor per student?) could also get Pangeo access during this period to help their students along, that would be really helpful too.

For users who mostly do data exploration and do not need heavy computing resources, is it possible to move their workflow from ocean.pangeo.io to Google Colab? If so, has anyone tried this yet?

Definitely possible. See this blog post:

And this colab notebook:

https://colab.research.google.com/drive/19iEVxE_9QoTeg4st7MmucHJUmO93NXHp

Hi there, just wondering whether the migration is now complete? And how can I log in to the new ocean.pangeo.io? Cheers, Mike

We are almost there @byrnem. We hope to announce the new cluster tomorrow. For a preview of what’s coming, check out

Hi @rabernat
Unfortunately, I just found out about this post today!
And so I didn’t back up my notebooks on ocean.pangeo.io.
Any chance of recovery, or is everything wiped out?
gm

Hi folks. The new cluster is up. All details are described here: http://pangeo.io/cloud.html

Don’t worry @gmaze, your data are backed up. I’ll send you an email with details on how to get them. If you want to keep using the cluster, please fill out the form anyway.

Awesome! Thanks!
I’ll fill out the form, sure.

Hi. I’m in the same boat. I missed this announcement but would like access to the CMIP6 data I worked on a few months ago. I filled out the form, including the current grants supporting my research.

Best
Axel
