On 2023-12-06 several folks met and discussed “What’s next for Pangeo”, organized by @TomNicholas. This article collects links related to the cloud infrastructure discussion in other sub-topics below. I propose that people engage asynchronously on those topics for a bit, and that, based on the activity, we then decide what to discuss synchronously.
Pangeo makes cloud computing accessible to scientific researchers. We’ve had some success; many geoscience users have moved to the cloud.
However, this problem is hard: it requires us to simultaneously solve problems of …
- User experience: What’s easy enough and powerful enough? Which workflows can we support?
- Maintenance: How hard/easy is it to maintain this system? Who does that work?
- Money: Who pays for the cloud resources? Who pays for the personnel to maintain the systems?
We’ve gone through some history here, ranging from a single, globally available public hub, to many smaller hubs for different projects, to a variety of entities (commercial and non-commercial) offering services.
Probably there won’t be a single solution to all of our problems.
Probably there are many incremental things we can do to improve the situation all around.
In the live discussion yesterday, a few ideas were proposed. I’m listing them below:
- Coiled account managed by Pangeo, with a lightweight process to add users
- Revitalize 2i2c-backed hub
- Provide public storage
- Popularize Nebari
- JupyterLite / Client-side
There is also a broad theme here that I’ll loosely call “Build vs Buy” or “Commercial vs DIY”.
I don’t yet want to make separate discussions for each of these topics, but I also don’t want to talk about all of them here in this discussion (it seems to be too much to handle all at once). Instead, I’m going to pull things out into a few different discussion topics (anyone, please feel free to override this or make new topics instead):
- Pangeo-managed infrastructure. What can/should Pangeo manage itself?
- Partner-managed (2i2c, Coiled, Nebari, Earthmover, …)