Statement of Need: Integrating JupyterBook and JupyterHubs via CI

Hi all,

Circling back to this important topic: Pythia Cookbooks now have a solution for the “expensive, reproducible notebook” problem.

The latest Cookbook template supports a new option execute_notebook: binder in the JupyterBook _config.yml file.

Setting that option will outsource the execution of notebooks to the BinderHub URL specified in the same _config.yml file. This is done via GitHub Actions and a call to Binderbot under the hood.

You can see this in action for the HRRR-AWS Cookbook where we set the execute_notebook: binder option and the build is being tested nightly.

Cookbook authors can now easily toggle back and forth between executing notebooks on a BinderHub and executing on GitHub Actions just by changing the execute_notebook option, e.g. this demo pull request.
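To make the toggle concrete, here is a rough sketch in Python of the kind of decision the build step has to make. It is not the template's actual code: the function name `choose_executor`, the nesting of the option under an `execute` key, the `launch_buttons.binderhub_url` location, and the example URL are all assumptions for illustration.

```python
from typing import Optional, Tuple


def choose_executor(config: dict) -> Tuple[str, Optional[str]]:
    """Decide where notebooks should run, given a parsed _config.yml.

    Hypothetical helper: the key layout below is assumed, not taken
    from the Cookbook template.
    """
    # Assumed location of the option described in this post.
    mode = config.get("execute", {}).get("execute_notebook", "auto")

    if mode == "binder":
        # Assumed location of the BinderHub URL in the same config.
        url = config.get("launch_buttons", {}).get("binderhub_url")
        if not url:
            raise ValueError("execute_notebook is 'binder' but no BinderHub URL is set")
        return ("binder", url)

    # Any other value: execute inside the GitHub Actions runner itself.
    return ("github-actions", None)


if __name__ == "__main__":
    # Placeholder config with a made-up BinderHub URL.
    demo = {
        "execute": {"execute_notebook": "binder"},
        "launch_buttons": {"binderhub_url": "https://binder.example.org"},
    }
    print(choose_executor(demo))
```

Switching the option to any other value in this sketch sends execution back to the CI runner, which is the toggle described above.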

Basically, the Cookbook template now offers a full platform for collaborative authoring, execution, and publishing of reproducible content, so long as a suitable BinderHub service is available.

(Our template currently points to @ktyle’s temporary proof-of-concept BinderHub on Jetstream2; we don’t expect that to be a stable resource. But changing the URL of the BinderHub is a one-liner in the _config.yml file, so it need not be a one-Binder-to-rule-them-all situation.)

I’m pretty excited about the possibilities here!

While this all works for now, putting it together was a bit of a hack and involves overloading an existing JupyterBook config field. Also, the Binderbot package needs maintenance; for example, it breaks under Python 3.11.

I’m curious to hear some feedback from the community at this point:

  • Are we heading in a useful direction?
  • Should we think about migrating the Binderbot-based solution upstream into JupyterBook itself?
  • Is Binderbot the right technology, and is there interest in updating and maintaining it?

Shamelessly tagging folks who might have opinions about all this: @choldgraf @rabernat @jbednar @scottyhq @yuvipanda
