Next steps for Pangeo / PyOpenSci Collaboration

TLDR: we need volunteers to help define standards for Pangeo-affiliated software packages and to submit / review packages for PyOpenSci!


Background and Context

@lwasser gave this great Pangeo Showcase talk a few weeks back

Now that PyOpenSci is on its feet, it seems like a good time to revisit the conversation that started on their issue tracker two years ago:

What Now?

Leah and her team are funded by the Sloan Foundation to collaborate with us on this idea. However, the Pangeo community needs to step up if we are going to take advantage of this opportunity. Specifically, we need to revisit and update the outdated Package Guidelines on our website. We collectively need to answer:

  • What are the best practices we want to establish for Pangeo-affiliated packages?
  • How much of this is covered by the PyOpenSci Peer Review Guide? What are the extra criteria that apply to Pangeo packages that might not necessarily apply to general packages?
  • Is there an overall vision or framework we want to use to organize and classify the packages in our ecosystem?
  • How can we turn these standards into actionable review and assessment criteria for PyOpenSci?

Once those are established, I think we can start putting packages through the review process. Some questions that come to mind here are:

  • Where do we start? How do we choose what packages to prioritize?
  • Who will do the work of submitting the packages?
  • Who will do the work of performing the reviews?

All of this sounds like quite a bit of work, none of which we are explicitly funded to do. If folks are interested in working on these issues on a contract basis, I think we could find some funding at Columbia to support that. Regardless of the funding question, I feel that moving this forward will require some deep engagement from people who have been with Pangeo for a while and understand the vision and ecosystem. It can’t be outsourced.

I can commit to helping with, but not spearheading, this effort. Who else is enthusiastic about this and would like to help lead it?


Hey @rabernat, I am so excited about working with Pangeo. In my mind the standards part is the most important, as it will drive the peer review. In Pangeo, are packages already maintained by people individually? If that is the case, then I don’t think the community needs to decide what to submit. People can submit, and the lead editor who covers Pangeo packages can then evaluate them. But then again, I don’t know what your package base looks like now, so I could be off in my thinking about how this could work.

I’d love to hear from others in the community. What we’d be providing here is a curated list of reviewed packages that adhere to our standards (and yours). We’d do most of the review work but would want an editor on our editorial board to handle decisions around incoming packages. You’d have the support of our team in that effort, however!

All - looking forward to hearing from you.

UPDATE: also, y’all, we find the reviewers. You don’t need to take on the burden of doing all of the reviewing as well. Let’s talk more. We have infrastructure to support some of this!


I can volunteer to help out from the Pangeo side. I agree that the main work item here is updating our Package Guidelines.

If folks are OK with me working on this, I’ll deduplicate our guide against what’s covered by PyOpenSci’s guides, focusing on the bits that are Pangeo-specific (where Pangeo will roughly mean “geoscience”, maybe “scalable geoscience” in this context).

Is there an overall vision or framework we want to use to organize and classify the packages in our ecosystem?

I think this is worth a discussion (which we can do asynchronously here). I’ll make another post later with my thoughts.

From there, I’ll probably follow @lwasser’s advice and just help out with editorial duties, rather than trying to submit various packages through the process.


@TomAugspurger!! Thank you so much for replying, and my apologies that I didn’t see the reply. I need to adjust my notification settings here. That would be AWESOME if you can help lead this.

I think this can be a simpler workflow that makes the best use of people’s time and resources, because PyOpenSci can do the work of getting the reviews going and even inviting packages to submit. We can also work with you or someone from Pangeo to identify overlap and duplication. I am also working on our packaging guides now, so it may be helpful for us to stay in touch on that topic. If you’d also like to join our Slack to check in with us directly, that would be great. You can email me: leah at pyopensci.org

What we might consider doing is using that dated list of packages that you had as a start, and we can actually invite maintainers to submit to us as an option.

Then, in terms of classification: can you tell me what the goal of classification would be, so I better understand? I’m totally up for helping you through this process, as part of this is figuring out what processes are most useful for communities like Pangeo.