Is there a write-up about Pangeo's use of Intake?

I’m working through Intake and trying to get a feel for when it would be useful. Is there a write-up somewhere that covers the various ways in which Intake is used in Pangeo deployments?

E.g., it seems that https://github.com/pangeo-data/intake-stac is doing something like “given an input URL to some STAC data, generate an Intake yaml spec and register it on-the-fly”. That’s a bit different from how I had imagined using Intake, but seems like a clever way to write a lightweight wrapper to load in structured data. That said, I am trying to figure out what would be the benefits of Intake here rather than just writing a lightweight I/O package.

Any pointers to docs / code / examples / thoughts / etc would be helpful!


No write-up, but there is this repo: pangeo-data/pangeo-datastore

@choldgraf, as I see it, there are two main Intake use paths"

  1. Use a curated Intake Catalog (a YAML file) and the Intake API to provide a high-level data API for known data.
  2. Use an Intake driver (see the Plugin Directory in the Intake documentation) to provide the Intake API on top of another catalog or data service.
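
For the first path, a minimal sketch of what that looks like. Everything here is invented for illustration (file name, bucket URL, entry name), and the zarr driver assumes the intake-xarray plugin is installed:

    # catalog.yaml -- a hand-curated Intake catalog
    sources:
      sea_surface_temp:
        description: Example gridded SST dataset stored as Zarr
        driver: zarr                     # provided by the intake-xarray plugin
        args:
          urlpath: gs://hypothetical-bucket/sst.zarr

and then the Intake API on top of that file:

    import intake

    cat = intake.open_catalog("catalog.yaml")
    print(list(cat))                         # discover the entry names
    ds = cat.sea_surface_temp.to_dask()      # open lazily as an xarray Dataset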

Intake-stac is an example of the latter, where we just plug into an existing catalog API (STAC). Intake ends up providing the data-access machinery in this case.
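
Roughly, using intake-stac looks like the sketch below. The STAC URL, collection/item IDs, and asset name are placeholders; the open_stac_catalog function is registered once intake-stac is installed:

    import intake  # intake-stac registers the stac_catalog/collection/item drivers

    # Point at an existing STAC catalog; no Intake YAML is written by hand --
    # the driver builds catalog entries on the fly from the STAC metadata.
    cat = intake.open_stac_catalog("https://example.com/stac/catalog.json")

    print(list(cat))                              # STAC collections appear as entries
    item = cat["some-collection"]["some-item"]    # placeholder IDs
    da = item["B04"].to_dask()                    # lazily load one asset (e.g. a COG band)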

@rabernat linked to the pangeo-datastore repo, which is a manifestation of the first use case: these are datasets we manage that are not already discoverable in some other catalog service.
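
Concretely, that curated catalog can be opened straight from GitHub. The URL below reflects my understanding of the repo layout (intake-catalogs/master.yaml) and could move; the "ocean" sub-catalog name is just an example:

    import intake

    # Top-level Pangeo catalog, which nests the themed sub-catalogs
    url = ("https://raw.githubusercontent.com/pangeo-data/"
           "pangeo-datastore/master/intake-catalogs/master.yaml")
    cat = intake.open_catalog(url)

    print(list(cat))                 # names of the nested sub-catalogs
    # Walking into a sub-catalog uses the same API, e.g.:
    # sub = cat["ocean"]; print(list(sub))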

"what would be the benefits of Intake here rather than just writing a lightweight I/O package."

The main thing, IMO, is that we get a standard interface to a bunch of catalog services.
