This thread is to organize an informal meeting between folks from Pangeo and folks from Globus Labs to discuss potential areas of collaboration and common interests around open science.
Globus Labs is a research group led by Prof. Ian Foster and Dr. Kyle Chard that spans the Department of Computer Science at the University of Chicago and the Data Science and Learning Division at Argonne National Laboratory. Our modest goal is to realize a world in which all research data are reliably, rapidly, and securely accessible, discoverable, and usable. To this end, we work on a broad range of research problems in data-intensive computing and research data management.
I have been in contact with both Ian and Ben Blaiszik, who shared that they are beginning to work on some projects in the weather / climate space. This work involves MODIS, CMIP6, and processing large volumes of data on the Argonne supercomputers. Overall I get the impression that our communities share similar aims and values around open science, so I am eager to stimulate some dialog!
A particular project of relevance is Foundry:
Foundry is a Python package that simplifies the discovery and usage of machine-learning ready datasets and published models in materials science and chemistry. We provide software tools that make it easy to load datasets and work with them in local or cloud environments and to perform inference using published ML models.
Ben shared these slides about work they have done using Foundry in materials science research workflows, and the results are pretty inspiring.
This work has some parallels with Pangeo and Pangeo Forge in particular. From the Pangeo side, we have discussed leveraging Globus’ file transfer technology several times:
- Transfer inputs using Globus · Issue #222 · pangeo-forge/pangeo-forge-recipes · GitHub
- Configure Globus Connect Personal on ocean.pangeo.io · Issue #489 · pangeo-data/pangeo-cloud-federation · GitHub
but have not yet managed to integrate well.
The goal of the meeting is to raise mutual awareness of what each project is doing and to identify possible areas of collaboration. With that in mind, I would suggest an agenda that looks something like this:
- Brief presentations (< 10 min) from each group to introduce the broader aims.
- Deeper dive into specific projects (5 min presentation each)
  - Pangeo Forge
- Open discussion (30 min)
If you are interested in participating in such a meeting, please fill out this poll for the week of March 21. (If this week is not good, let me know and we can try something else.)