@brian-rose, thanks for tagging me here. Happy to share our experiences and brainstorm ideas for making Pythia Cookbooks citable!
I’m interested to hear people’s thoughts about best practices here, and what would make the Cookbook format the most attractive to potential content authors.
I think unique or persistent identifiers for digital objects are essential for collaborative, live research objects such as Pythia Cookbooks. DOIs will make it easier for others to cite them, reduce the risk of broken links, and facilitate tracking how they have been used and cited.
I see that the Environmental Data Science folks have elegantly solved this problem in two different ways, with a Zenodo-based DOI for the source repo, plus individual ROHub-based citations for each notebook. Shamelessly tagging @acocac for comment on this!
We do indeed use a Zenodo-based DOI, generated through a third-party integration between GitHub and Zenodo, for the EDS book source repo. It allows us to version major changes in the Jupyter book and some miscellaneous scripts. We follow the Turing Way’s workflow for releasing different versions (see here). Note that the TTW folks have nicely automated their release workflow (see here).
We also considered generating DOIs for notebooks through the same integration, as EarthCube notebooks do. However, we’ve found that Zenodo handles some relevant notebook metadata poorly, such as geographical location, bibliography, inputs and outputs. Thanks to Anne Fouilloux @annefou, we started exploring RoHub, a Research Object management platform that enables researchers to collaboratively manage, share and preserve their research work (data, software, workflows, models, presentations, videos, articles, etc.).
Let me describe a recent example of how RoHub facilitates preserving executable research objects and could potentially incentivise people to publish EDS book notebooks.
For the notebook repository eds-book-gallery/b128b282-dee7-44a7-bc21-f1fd21452a83 (Exploring Land Cover Data) on GitHub, we used RoHub to register it and to add citation instructions based on a W3ID permanent identifier, https://w3id.org/ro-id/b128b282-dee7-44a7-bc21-f1fd21452a83. The right panel in the figure below shows some stats, including the number of resources, annotations, events, etc. Note that it also lists snapshots, forks and archives, similar to GitHub repos. RoHub stats are very valuable and complement those provided by GitHub and Google Analytics for the notebook repository and the EDS book website, respectively.
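As a small illustration of how the identifier ties these pieces together: in this example the gallery repository name and the W3ID share the same UUID, so one can be derived from the other. Below is a minimal Python sketch, assuming that naming pattern holds for other EDS book notebooks too. The helper names `ro_id_from_w3id` and `gallery_repo_url` are hypothetical and not part of any RoHub or GitHub API.

```python
import uuid
from urllib.parse import urlparse

W3ID_PREFIX = "/ro-id/"


def ro_id_from_w3id(w3id_url: str) -> str:
    """Extract and validate the Research Object UUID from a w3id.org/ro-id URL."""
    parsed = urlparse(w3id_url)
    if parsed.netloc != "w3id.org" or not parsed.path.startswith(W3ID_PREFIX):
        raise ValueError(f"not a w3id.org/ro-id URL: {w3id_url}")
    candidate = parsed.path[len(W3ID_PREFIX):].strip("/")
    # uuid.UUID raises ValueError if the identifier is malformed
    return str(uuid.UUID(candidate))


def gallery_repo_url(ro_id: str) -> str:
    """Build the GitHub URL of the notebook repo.

    Assumption: repos in the eds-book-gallery organisation are named
    after the RO UUID, as in the example discussed above.
    """
    return f"https://github.com/eds-book-gallery/{ro_id}"


w3id = "https://w3id.org/ro-id/b128b282-dee7-44a7-bc21-f1fd21452a83"
ro_id = ro_id_from_w3id(w3id)
print(ro_id)
print(gallery_repo_url(ro_id))
```

This only parses strings and makes no network requests; resolving the W3ID itself is left to the browser or an HTTP client.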
We can retrieve Zenodo-based DOIs through a third-party integration between RoHub and Zenodo (see for instance the snapshot, ROHub). Ideally, the rendered version of the notebooks should use this Zenodo-based DOI instead of the W3ID. However, while the citation using the W3ID includes both authors and reviewers, the Zenodo-based DOI only mentions the authors.
Regarding the impact of the citation, we noted that the notebook author added it to his online CV (see Book Chapters in CV | James Millington). In addition to the citation, and inspired by the OSF Badges to Acknowledge Open Practices, we’re thinking of developing custom badges for existing and future contributions to the EDS book.
It’s worth mentioning that we haven’t exploited the full capabilities of the RoHub platform and of Research Objects (ROs) as living resources. At the moment, we’ve created RoHub ROs for all EDS book notebooks at the post-print stage (before their publication). ROs encompass research outputs created, revised and shared throughout the research lifecycle. This means we could create an RO for each EDS book notebook from its inception (we capture this through notebook idea issues, see [NBI] Exploring Land Cover Data · Issue #99 · alan-turing-institute/environmental-ds-book · GitHub).
As part of last year’s TTW Book Dashes, Anne and I published a dedicated section introducing Research Objects: “Research Object to capture the Research Life Cycle” in The Turing Way. Feel free to browse it.
I hope the above description of how citation works for the EDS book and its notebooks is valuable for the Pythia Cookbooks discussion with the community. @annefou can provide further ideas. She recently introduced RoHub at ESIP23 (https://www.youtube.com/watch?v=vFS2oAk4R-I), a talk that is also registered in RoHub.
Looking forward to hearing others’ opinions!