Pangeo OpenEO Backend?

openEO develops an open API to connect R, Python, JavaScript and other clients to big Earth observation cloud back-ends in a simple and unified way.

https://openeo.org/

This project has quite a bit of traction, particularly in Europe. It would be great to try to connect Pangeo with OpenEO. I think the path forward on this would be to implement–or contribute to–an OpenEO “backend”, which can process and serve data to users via an API.

The API is specified here:

There seems to be a python backend already started here:

This uses some xarray and dask, but I’m not sure it takes full advantage of the Pangeo cloud stack (zarr, dask distributed, xarray, etc.).

I am posting this issue to gather thoughts about how we could work with OpenEO. Is anyone in our community also involved in OpenEO? What would be the best way for us to have an impact?


Sounds like an intake-able API.
I didn’t follow, from a cursory reading, how or where processing happens. I wonder at yet another graph-like representation of computations (otoh: people do not serialise xarray/dask pipelines for recall).

Hi Ryan,

I just want to quickly let you know that we plan to let students write an openEO Pangeo back-end this winter term. That will likely not be production ready in any form, but would give us a first impression to base further work on. Any use case you would find interesting or anything you are particularly interested in regarding pangeo/openEO?

There are several Python back-ends, but openeo-processes-python only implements the processes themselves, not the HTTP API. I don’t think openeo-processes-python takes full advantage of Pangeo yet, but that could also be explored in the future, and any help is highly appreciated.

Best,
Matthias


That’s awesome news @m-mohr! We would love to help out on that project however we can, e.g. helping to define best practices or debug code.

A couple of random thoughts:

  • In Pangeo world, there are already two DAG-based execution engines: Dask and Prefect. (For example, right now in Pangeo, we can write xarray code which generates a Dask graph for delayed execution and then execute it with a distributed scheduler. This seems conceptually close to what OpenEO is aiming for already.) So it would be ideal if an OpenEO backend could simply translate the OpenEO workflow to one of these existing formats and then pass the execution off to an existing, mature task scheduler.
  • Xarray has become our de-facto universal API for data analysis. Xarray’s API is similar to, but distinct from, the openEO python API. Pangeo users would likely love to take advantage of OpenEO backend processing, but they probably don’t want to learn a new API. Can we somehow generate OpenEO API calls from vanilla Xarray code? This could be hard, since they use different types of abstraction. (The integration point in Xarray is with the NumPy API, which is implemented by many array libraries, e.g. NumPy itself, Dask, cupy, etc.)
  • Xcube seems like a really cool project. It already provides a REST API and CLI for interacting with xarray datacubes, and it’s part of the ESA ecosystem. Could Xcube be leveraged here, rather than starting from scratch?
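
To make the first point above more concrete, here is a minimal sketch of translating an openEO-style process graph into a Dask-style task graph (a plain dict mapping keys to `(function, *args)` tuples). The `PROCESSES` table and the helpers `to_task_graph` and `get` are hypothetical names made up for this illustration; they are not part of any openEO or Pangeo library, and a real backend would need to cover data loading, reducers, and many more processes. The tiny `get` function only stands in for a real scheduler so the example runs with the standard library alone.

```python
import operator

# Hypothetical lookup table: openEO process id -> (callable, argument order).
# Only a few arithmetic processes are covered in this sketch.
PROCESSES = {
    "add": (operator.add, ("x", "y")),
    "subtract": (operator.sub, ("x", "y")),
    "multiply": (operator.mul, ("x", "y")),
}


def to_task_graph(process_graph):
    """Translate an openEO-style process graph into a dict of task tuples.

    Each node becomes `node_id: (func, arg1, arg2, ...)`. An argument of the
    form {"from_node": "<id>"} is replaced by the key of that node.
    Returns the task graph and the key of the node flagged as the result.
    """
    graph = {}
    result_key = None
    for node_id, node in process_graph.items():
        func, arg_names = PROCESSES[node["process_id"]]
        args = []
        for name in arg_names:
            value = node["arguments"][name]
            if isinstance(value, dict) and "from_node" in value:
                value = value["from_node"]  # reference to another task key
            args.append(value)
        graph[node_id] = (func, *args)
        if node.get("result"):
            result_key = node_id
    return graph, result_key


def get(graph, key):
    """Toy recursive executor standing in for a real task scheduler."""
    func, *args = graph[key]
    resolved = [
        get(graph, a) if isinstance(a, str) and a in graph else a
        for a in args
    ]
    return func(*resolved)


# Process graph for (2 + 3) * 10, written in the openEO JSON layout.
process_graph = {
    "sum1": {"process_id": "add", "arguments": {"x": 2, "y": 3}},
    "scale1": {
        "process_id": "multiply",
        "arguments": {"x": {"from_node": "sum1"}, "y": 10},
        "result": True,
    },
}

graph, result_key = to_task_graph(process_graph)
result = get(graph, result_key)
print(result)
```

If Dask is installed, the same `graph` dict should be accepted by its synchronous scheduler (`dask.get(graph, result_key)`), which is the hand-off to an existing scheduler that the first point describes; I have not verified that against a specific Dask version.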

Thanks @rabernat, appreciate the offer and your thoughts. We’ll likely get back to it.

  • It’s likely that we’ll translate into an existing format for a task scheduler. We usually don’t implement that on our own in openEO.
  • I guess that’s best discussed with the guys implementing the Python client, as I would imagine that being done at the Python client level: xarray is only known in the Python world, so it would not benefit R/JS users much. Maybe there’s room for further alignment. I doubt we can fully align, but maybe we can make it easier for users to learn the new API?
  • I have no clue, but that’s likely a point the students can investigate.

Hi everyone,

I was going to open a thread on Pangeo and OpenEO when I found it was already opened almost 2 years back! Awesome.

As @rabernat said, there’s quite a bit of traction in Europe towards OpenEO, but towards Pangeo too. At CNES and other places, we’re trying to see if the two approaches could work together. As I’m really new to OpenEO concepts (I’ve only read the about page), I don’t have anything to add to what Ryan suggested.

Have there been any advances on the subjects discussed here that people know about?

Maybe @PhenoloBoy or @annefou have some thoughts if they saw some talks at the recent ESA Living Planet Symposium?


Hi everyone,

Indeed, OpenEO in Europe can, for many reasons, be seen as a trending technology and, IMHO, within a couple of years the European ‘market’ will be flooded by it and, more specifically, by the OpenEO platform.

@geynard, like you I’m a newbie on this topic. I attended a couple of talks at the LPS, but all of them focused more on presenting the platform than on the API; as the LPS was quite spread out, maybe I missed the more focused ones.

Indeed, from the OpenEO developers, I had the feeling that there is interest in having a dialogue with Pangeo developers but I’m not aware of any initiative.

Just cross linking to An Pangeo/ODC-based backend that runs "out of the box" · Issue #16 · Open-EO/PSC · GitHub, where the Open-EO community is discussing this.
