Pangeo and GeoWeaver

Hello, I'm glad to have found this forum by following the guide by @jhamman. I'd like to discuss some recent progress on using GeoWeaver as a proxy to a Pangeo environment. GeoWeaver (NASA ACCESS site, GitHub repo) is designed to be a proxy and history recorder for everything done in Jupyter (Notebook/Hub/Lab). It sits between the user's web browser and the Jupyter servers and quietly records a version of each notebook whenever someone presses Ctrl+S or clicks the save button in Jupyter. GeoWeaver maintains a separate database for all of this history, so scientists keep a complete copy of their work even after the execution environment (for example, a cloud deployment) is gone. This is extremely helpful for expensive experiments on the cloud or on HPC.
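To make the "record on save" idea concrete, here is a minimal sketch in Python. It is not GeoWeaver's actual code: the table layout and the names `open_history_db` and `record_if_save` are made up for illustration. The one thing it relies on from Jupyter is that saving a notebook is sent as a `PUT` to the `/api/contents/<path>` REST endpoint with a JSON model in the body. It also assumes Jupyter is served at the root URL, with no base URL prefix.

```python
import json
import sqlite3
import time

# Jupyter's Contents REST API saves a file with PUT /api/contents/<path>.
CONTENTS_PREFIX = "/api/contents/"


def open_history_db(path=":memory:"):
    """Open (or create) a hypothetical history database kept apart from Jupyter."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "notebook TEXT NOT NULL, "
        "saved_at REAL NOT NULL, "
        "content TEXT NOT NULL)"
    )
    return conn


def record_if_save(conn, method, url_path, body):
    """Store a copy of the notebook if this proxied request is a notebook save.

    Returns True when a history row was written, False otherwise.
    The caller (the proxy) still forwards the request to Jupyter unchanged.
    """
    if method.upper() != "PUT" or not url_path.startswith(CONTENTS_PREFIX):
        return False
    try:
        model = json.loads(body)
    except (TypeError, ValueError):
        return False  # not a JSON body, so not a notebook save
    if not isinstance(model, dict) or model.get("type") != "notebook":
        return False  # e.g. a plain file or a directory
    notebook = url_path[len(CONTENTS_PREFIX):]
    conn.execute(
        "INSERT INTO history (notebook, saved_at, content) VALUES (?, ?, ?)",
        (notebook, time.time(), json.dumps(model.get("content"))),
    )
    conn.commit()
    return True
```

Because every save adds a new row instead of overwriting the previous one, the history survives even if the notebook file and the Jupyter server are later deleted.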

In short, the architecture looks like this:

user web browsers <-> geoweaver (proxy recorder) <-> jupyter <-> dask cluster

Geoweaver can be deployed anywhere. Right now, my working environment is set up as follows:

Azure
→ Kubernetes cluster (10~110 pods)
→ Dask gateway (load balancer)
→ Jupyter
→ Geoweaver (load balancer)

Geoweaver does not work correctly with JupyterLab yet because of an issue in the WebSocket proxy. We expect this to be solved in the next few weeks. Any advice or comments would be much appreciated. Thanks!
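For anyone who has not dealt with WebSocket proxying before: Jupyter talks to kernels over WebSocket connections (for example `/api/kernels/<id>/channels`), and a proxy has to recognize the HTTP upgrade handshake and tunnel those connections instead of treating them as ordinary request/response pairs. The sketch below only shows that detection step, following the handshake headers defined in RFC 6455. It is an illustration, not a diagnosis of the actual issue, and `is_websocket_upgrade` is a hypothetical helper name.

```python
def is_websocket_upgrade(method, headers):
    """Return True if the request is a WebSocket opening handshake.

    Per RFC 6455 the client sends a GET with "Upgrade: websocket" and a
    "Connection" header whose comma-separated tokens include "Upgrade".
    Header names and these values are compared case-insensitively.
    """
    if method.upper() != "GET":
        return False
    lowered = {name.lower(): value for name, value in headers.items()}
    upgrade = lowered.get("upgrade", "").strip().lower()
    connection_tokens = [
        token.strip().lower()
        for token in lowered.get("connection", "").split(",")
    ]
    return upgrade == "websocket" and "upgrade" in connection_tokens
```

A proxy would call this on each incoming request and switch to a bidirectional tunnel when it returns `True`; everything else can go through the normal HTTP forwarding path.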
