Thanks to everyone for this thoughtful and insightful discussion and to @dcherian for pinging me to join in. Rather than simply repeat the many excellent points already made, I’d like to share some personal experience and reflections from my own involvement in Pangeo that support them.
I first learned about Pangeo through its use at the 2019 ICESat-2 Cryospheric-themed Hackweek. Having participated in an earlier Geohackweek where we had to get everything running on our local machines (using some combination of conda, docker images, and lots of googling), I was immediately impressed with all that Pangeo brought to the table. Upon leading the charge for development of icepyx (see #science:icesat-2) after the ICESat-2 Hackweek, I was welcomed into the Pangeo community and invited to participate in events and share my expertise at Pangeo workshops. Yet despite this warm welcome and these invitations, I still have not actively contributed to the code base that Pangeo relies on. Some of this comes down to time, but as many others have pointed out, it’s also due to not knowing what or where I might be able to contribute (do we have a list of suggested “first time contributor” projects, or would I need to look in the issues tab for each of Pangeo’s repos?). Although I have been writing code for a decade, most of my “open-source developer skills” and knowledge are self-taught and have been acquired in the last year (as has most of my actual code sharing on GitHub).
The Pangeo community consists of a wonderful mix of scientists, users, developers, and scientist-developers, but as with any boundary work (a social science term used to broadly describe any work that is being done at the “boundaries” between two disciplines, e.g. a local fisherman’s collective working with policymakers to craft catch laws or historians working with climate scientists to better understand impacts of climate change), there are generally a few basic disciplinary terms and habits that are so “normal” in our fields that we no longer notice them (for instance, iteration on PRs as mentioned earlier). Even as a lead developer, I have found myself scared to reject or ask for changes to PRs for fear that people won’t try to contribute again, and from this thread I have learned that my own perception of “all PRs as good” does not match what really happens during development of more established libraries.
I think that office hours with targeted, pre-announced topics (and a clear indication that anyone wanting to join or seek help is welcome), combined with mentorship, partnered programming, and an obvious, welcoming list of small projects that newcomers could attempt in an afternoon, are all going to be crucial to expanding the number of contributors. Specifically inviting BIPOC to participate in these events through the mentioned programs will allow us to improve the diversity of the Pangeo community, both from within and without.