Community conventions for automated regression tests for notebooks / cells?

Is there a Pangeo community consensus or convention for how to include automated regression tests for notebooks and/or individual cells? This would seem to become more important as multiple collaborators work on a single notebook together…

(I looked on StackOverflow, which had many suggestions, but no consensus.)


I wouldn’t say we have a strong consensus. The most successful example so far is probably in binderbot: https://github.com/pangeo-gallery/binderbot, which executes the notebooks for gallery.pangeo.io. The bulk of the logic is at https://github.com/pangeo-gallery/binderbot/blob/32c96df4f56857bb1c4f7676ef9323a173aed677/binderbot/binderbot.py#L296-L329.

It’s worth emphasizing that these tests really just execute the notebooks and look for unhandled exceptions. They aren’t unit tests, and (at the moment) they aren’t run regularly; that kind of testing tends to happen in the individual libraries used by Pangeo and our users.
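For anyone wanting to reproduce that pattern locally, here is a minimal sketch (not binderbot itself) using nbformat + nbclient to execute a notebook end to end and fail on any unhandled exception; the notebook path is a placeholder:

```python
import nbformat
from nbclient import NotebookClient

def test_notebook_runs_cleanly():
    # Load the notebook and execute every cell in order.
    nb = nbformat.read("demo.ipynb", as_version=4)
    client = NotebookClient(nb, timeout=600, kernel_name="python3")
    # Raises nbclient.exceptions.CellExecutionError if any cell errors out.
    client.execute()
```

Dropping that into a file collected by pytest (or a CI job) gives a basic “does the notebook still run?” smoke test.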


I’ve started exploring the use of https://github.com/fastai/nbdev, which seems promising. I haven’t used it much yet, but I like the idea of a complete software development workflow with notebooks!

I’d be interested to know the use case/motivation for this. My first thought is that once tests, linting, version control, and other tooling become important, one would definitely want to use plain .py files. Every software project reaches that point eventually, and that is exactly the moment when I refactor the useful bits out of notebooks into .py files.

Maybe some small GitHub project exists that wraps e.g. pytest, but then you have to learn not only how to use the underlying testing infrastructure, but the mechanics of this custom Jupyter extension too. Oh, and will this custom extension be maintained in 3 years?

For a somewhat light-hearted take on this, I would suggest this talk: https://youtu.be/7jiPeIFXb6U

The specific use case was a repo centered on a Python library that implements a convenience API, with some demo notebooks to show just how convenient it is. I wanted to modify a demo notebook to remove a hard-coded path to the library, but, just to make absolutely sure I was not introducing a problem, I wanted to add an automated test first.

More generally, one of the potentially powerful aspects of Jupyter notebooks is scientists collaborating on science workflows. So if I change something that makes a particular operation run faster (e.g., dask-izing it), it would be nice to automatically test the notebook (or that cell?) to verify the workflow still produces the same answer.

To take it one step further: if I obtained a result with a previous version of a dataset, it would be nice to re-run the workflow (or a particular cell) to check that I get the same (or a similar) result with a reprocessed version.
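One way to do that kind of check is a plain pytest-style regression test that re-runs the workflow step and compares it against a baseline saved from a known-good run. A minimal sketch, where the workflow function, variable names, and file paths are all placeholders:

```python
import numpy as np
import xarray as xr

def run_workflow(ds):
    # Placeholder for the real analysis step (e.g. a dask-backed reduction).
    return ds["tas"].mean(dim="time")

def test_workflow_matches_baseline():
    # Open the input lazily with dask chunks, as a dask-ized workflow would.
    ds = xr.open_dataset("demo_data.nc", chunks={"time": 100})
    result = run_workflow(ds).compute()
    # Baseline saved from an earlier, trusted run of the same workflow.
    baseline = xr.open_dataarray("baseline_mean.nc")
    np.testing.assert_allclose(result.values, baseline.values, rtol=1e-6)
```

The tolerance (`rtol`) is the knob for “same or similar”: tight for checking a refactor, looser for comparing against a reprocessed dataset.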

@clynnes, you may find the nbcelltests extension useful for your use case. nbcelltests requires that (1) tests be stored in the cell metadata, and (2) the notebook be executed in a “linear fashion”.

Thanks! The nbval package referred to in the repo Readme might also be useful…
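For reference, nbval plugs straight into pytest: `pytest --nbval demo.ipynb` re-executes the notebook and compares each cell’s output against the outputs stored in the file, while `pytest --nbval-lax` only checks that cells execute without errors (the notebook name here is just a placeholder).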