Intake Notebook: accessing notebooks from an Intake catalogue and executing them with Papermill

NickMortimer · April 17, 2020, 7:36am

In November last year I visited the Met Office with @kaedonkers. We spent some time looking at scientific workflows and parameterising notebooks. To this end, we came up with the concept of a set of parameterised notebooks accessible via an Intake catalogue, with their outputs then held in a catalogue as well.

The way it works at the moment is that I have two drivers.

Notebook: this provides access to the notebooks, so a YAML catalogue can be made like:

sources:
  enso:
    args:
      urlpath:
        - "{{ CATALOG_DIR }}/experiment/calculate_enso.ipynb"
        - "{{ CATALOG_DIR }}/experiment/calculate_enso_clim.ipynb"
    description: ''
    driver: intake_notebook.notebook_source.NotebookSource
    metadata: {}

calculate_enso_clim.ipynb has a cell:

#parameters
''' Calculate the climatology for the ENSO area given a start and end date
    Parameters
    ----------
    catfile : file path to intake catalog that has a variable called sst
    startdate : date of first point to average (must be string e.g. "1974-1-1")
    enddate : date of last point to average (must be string e.g. "1984-12-31")
'''
catfile = 'c:/data/providance/sst.yml'
startdate = "1974-1-1"
enddate = "1984-12-31"

The driver reads this cell and monkey patches it into an execute function on the notebook object:

import os
import intake

data_cat = os.path.abspath("C:/data/providance/sst.yml")
books = intake.open_catalog('C:/data/providance/notebook.yml')
nb = books.enso.read()
nb.calculate_enso_clim.execute(startdate='1960-1-1', enddate='2020-1-1', catfile=data_cat)

Executing the notebook creates a new directory, named with a UUID, in the current directory. The resulting notebook output and the parameters are stored there.
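A minimal sketch of that bookkeeping, assuming Papermill does the execution. The function name `run_in_new_directory`, the `runner` argument and the `params.json` file name are illustrative assumptions, not the layout the driver actually uses.

```python
import json
import uuid
from pathlib import Path


def run_in_new_directory(notebook_path, parameters, base_dir=".", runner=None):
    """Execute a notebook into a fresh UUID-named directory and record its parameters."""
    if runner is None:
        # Imported lazily so the bookkeeping can be tried without papermill installed.
        import papermill
        runner = papermill.execute_notebook

    run_dir = Path(base_dir) / str(uuid.uuid4())
    run_dir.mkdir(parents=True)

    # Assumed layout: parameters saved as JSON next to the executed notebook.
    (run_dir / "params.json").write_text(json.dumps(parameters, indent=2))

    output_path = run_dir / Path(notebook_path).name
    runner(str(notebook_path), str(output_path), parameters=parameters)
    return run_dir
```

Keeping each run in its own directory means runs never overwrite each other, and the stored parameters say exactly how each output was produced.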

As a next step, I can load all the output as a catalogue using my experiment driver:

cat = intake.open_experiment('.')
cat.get_params()

This returns all the parameters used as a pandas DataFrame. It also looks for data outputs and concatenates them together.
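The parameter-gathering part can be sketched roughly as below. This is only an illustration of the idea: `collect_params` is a hypothetical helper, and it assumes each run directory holds its parameters in a file called `params.json`, which may not match how the experiment driver really stores them.

```python
import json
from pathlib import Path


def collect_params(experiment_dir="."):
    """Return one record per run directory that contains a params.json file."""
    records = []
    for params_file in sorted(Path(experiment_dir).glob("*/params.json")):
        record = json.loads(params_file.read_text())
        record["run_id"] = params_file.parent.name  # directory name identifies the run
        records.append(record)
    return records
```

Passing the result to `pandas.DataFrame(collect_params('.'))` would give one row per run and one column per parameter.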

This was just an experiment, and I’m keen to hear people’s thoughts about it!
