Discrete Global Grid Systems (DGGS) use with Pangeo

Dear @clyne, I’d like to join the Project Raijin standing monthly meeting on Thursdays. So that would be April 14th, 10am MT (which time zone is MT? Is that Boulder/Mountain Time, which is currently MDT?) - TimeZoneMeetingPlanner

I think I could make a short appearance if possible?

Hi @allixender, that would be great. Yes, the timezone is MDT (April 14, 10am MDT). Here is the Google Meet link:

Project Raijin monthly meeting
Thursday, April 14 · 10:00 – 11:00am
Google Meet joining info
Video call link: https://meet.google.com/gbn-vwdo-scb
Or dial: ‪(US) +1 503-908-2441‬ PIN: ‪962 727 640‬#
More phone numbers: https://tel.meet/gbn-vwdo-scb?pin=7536637037644

Hi all, FYI I received the sad news that our EU Horizon DGGS proposal was not funded. But as promised, we would like to continue innovation and research on DGGS, and build on the interest and positive vibe to develop better tools and make DGGS easier to use for current spatial data challenges. In that spirit, a bit of shameless self-promotion that is hopefully of interest to you as well:

Alexander Kmoch, Ivan Vasilyev, Holger Virro & Evelyn Uuemaa (2022) Area and shape distortions in open-source discrete global grid systems, Big Earth Data, doi: https://dx.doi.org/10.1080/20964471.2022.2094926

Short summary here :slight_smile: https://twitter.com/allixender/status/1554360770825261056

Best regards,
Alex

Sorry to hear about the missed funding opportunity!

A question for those of you on this thread: I am trying to devise a way to create geometry-based indexes for the geoboundaries.org dataset. Essentially, today, we generate IDs based on a hash of the vector geometries. We would like to move to generating an ID based on a uniform, global grid, and then capturing what grid cells are overlapped by a given geometry.

The challenge is scale: being in the socioeconomic world, we operate at very small scales relative to the climate modeling community I think most of you represent :). Are there any global grids that can scale down to arbitrarily small levels, and are there well-implemented strategies for doing this based on need? I.e., I know that some mesh approaches have fine-grained meshes in areas that are climatologically relevant for a model, but I have very little insight into what the “best” solutions are to date, or whether there are limitations in the ways things scale. Papers on the topic would be very welcome if you have any recommendations.

Hey Dan, thanks for the info. Yes, the global grids I am referring to have reliably defined resolutions down to meters. But scale is a challenge there too: the number of cells grows quickly when you want to represent large areas at such a fine resolution. I have checked GeoBoundaries, a great resource. How could I help you? Do you have a specific test case we could play through?
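
To give a rough feel for that scale trade-off, here is a back-of-the-envelope sketch. It uses H3 as the example grid (not the DGGRID ISEA grids), because its total cell count per resolution follows the closed form 2 + 120 · 7^r. The Earth surface area of about 510.1 million km² is an approximation, and the function names are purely illustrative.

```python
# Back-of-the-envelope sketch: how many cells a global aperture-7 hexagonal
# grid needs per resolution, using H3 as the example (an assumption made for
# illustration; other DGGS have different counts).

EARTH_AREA_KM2 = 510.1e6  # approximate surface area of the Earth


def h3_cell_count(resolution: int) -> int:
    """Total number of H3 cells at a resolution in 0..15: 2 + 120 * 7**r."""
    if not 0 <= resolution <= 15:
        raise ValueError("H3 resolutions run from 0 to 15")
    return 2 + 120 * 7 ** resolution


def mean_cell_area_m2(resolution: int) -> float:
    """Mean cell area in square meters (Earth area divided by cell count)."""
    return EARTH_AREA_KM2 * 1e6 / h3_cell_count(resolution)


if __name__ == "__main__":
    for r in (0, 5, 10, 15):
        print(f"res {r:2d}: {h3_cell_count(r):>20,d} cells, "
              f"mean area {mean_cell_area_m2(r):,.3f} m^2")
```

Covering the whole globe at roughly one square meter takes on the order of 10^14 cells, so in practice one would only store the cells that cover the geometries of interest, or mix resolutions.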

@DanRufola you may want to check out the links to Uber H3 and Google S2 earlier in this thread. Both of these technologies are designed to solve global addressing down to very fine spatial resolutions.

Using either global discretization, you should be able to compute “coverings” of an arbitrary geospatial polygon. Here is an example app that lets you compute S2 coverings for simple polygons; it’s not too difficult to work with a library exposing interfaces for S2/H3 to extend this to arbitrary polygons (at least simple ones). You can then uniquely compute a set of cells on each grid that define the cover for a polygon, and then hash that… although to be honest, what you’re describing is kind of the purpose of S2/H3 in the first place, so hopefully you can just compute the list of cell IDs that cover a polygon directly and save that.
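
The "compute a cover, then hash it" idea can be sketched without any DGGS library. Note the swap: the snippet below uses a plain regular lat/lon grid as a stand-in for S2/H3, a simple center-in-polygon rule for the cover, and made-up helper names (`cover_cells`, `geometry_id`). With a real DGGS library you would replace `cover_cells` with that library's own covering call.

```python
import hashlib
import math


def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices (no holes)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def cover_cells(ring, cell_size_deg):
    """Sorted IDs of stand-in grid cells whose center lies in the polygon."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    col_min = math.floor(min(xs) / cell_size_deg)
    col_max = math.floor(max(xs) / cell_size_deg)
    row_min = math.floor(min(ys) / cell_size_deg)
    row_max = math.floor(max(ys) / cell_size_deg)
    cells = []
    for row in range(row_min, row_max + 1):
        for col in range(col_min, col_max + 1):
            cx = (col + 0.5) * cell_size_deg
            cy = (row + 0.5) * cell_size_deg
            if point_in_polygon(cx, cy, ring):
                cells.append(f"{row}_{col}")  # hypothetical cell ID format
    return sorted(cells)


def geometry_id(cells):
    """Stable ID derived from the sorted list of covering cell IDs."""
    return hashlib.sha256(",".join(cells).encode("utf-8")).hexdigest()[:16]


if __name__ == "__main__":
    square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
    cells = cover_cells(square, cell_size_deg=1.0)
    print(cells, geometry_id(cells))
```

Because the ID depends only on the set of covering cells, the same polygon digitized with a different starting vertex gets the same ID, which a hash of the raw vertex list would not guarantee.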

Hi all, for those interested, there was a webinar organised by the Global Forum for Geography and Statistics on using Discrete Global Grid Systems:

Global Grids for Statistical Data

Recording (on YouTube)

  1. Kevin Sahr (Southern Oregon University) and Richard Barnes – “Getting Started with DGGRID: Everything you need to know to begin using hexagonal DGGS”

  2. Alexander Kmoch (University of Tartu) – “SAGEGRID - Spatial Analysis and Areal Statistics as a Service with Discrete Global Grid Systems”

Thank you for sharing @allixender! Looks like a great talk, added to my queue for later today.

Hi all, picking this thread up again. We had a great discussion at the Pangeo Europe weekly coffee chat about how to compare and integrate use cases of unstructured grids and DGGS grids, @annefou @tinaok. We would like to build and demonstrate a few workflows, from simple to more complex, that take typical analysis steps on unstructured grids and try to replicate them with DGGS, noting everything we stumble upon along the way, e.g. regridding, summaries etc.

Please let’s collate some tangible examples here, which I could then try to build out on GitHub and Binder.

In addition, I would really appreciate some help from folks who have experience in building and distributing binary conda packages that require C/C++ compilation targets @benbovy @ocefpaf: I have a CMake C++ project, DGGRID, that requires GDAL to be installed and creates a single executable binary. I have read lots of the conda packaging docs, but I am struggling. Who can help me set up a feedstock?

UPDATE1: I have made a reproducible docker build for DGGRID, with and without a Python base. The Python base is for subsequent testing with available open source DGGS such as H3, rHEALPix and the DGGRID Python wrapper dggrid4py.

Hi @allixender, I had a quick look at the DGGRID source. One issue is that at least three of its dependencies are vendored: proj4, clipper and shapelib. I’ll let @ocefpaf confirm, but I think this is not allowed (or at least not recommended); see the Guidelines in the conda-forge 2022.12.06 documentation.

  • for proj4 it seems that only a small part of it has been copied and adapted, so maybe it is OK to use the vendored code (with the license file included)?
  • for clipper it’s probably OK too (see pyclipper-feedstock)
  • for shapelib there are already packages available on conda-forge, so I guess it should be enough to patch the CMakeLists.txt file (or even better submit it to DGGRID) in order to get the dependency externally.

For setting up the feedstock, you can have a look at this one: GitHub - conda-forge/richdem-feedstock: A conda-smithy repository for richdem., which also depends on GDAL. It is a little bit more complex than for DGGRID (you don’t need all the python-related dependencies) but the build.sh and bld.bat should look pretty similar.

Thanks for the initiative @allixender! I would be very interested in a robust framework for comparing and benchmarking global grids of all sorts. First thing I would like to clarify is what you mean by ‘unstructured grid’ and generally which definition for ‘grid’ you propose to use.

We were thinking of doing that a bit more systematically. By unstructured grids I mean the ones mentioned earlier in this thread :slight_smile: We now just need a place to build this out, and I am happy to do this in public. Even better if we can get funding.

Thanks a lot for the guidance. So I’ll try to get this working for DGGRID.

Do I need a special conda environment with the build tools, smithy etc.?

How do I eventually get this into the conda-forge ecosystem? Is it similar to publishing to PyPI, or are there other mechanisms?

All you need to do is fork the staged-recipes repository (GitHub - conda-forge/staged-recipes: A place to submit conda recipes before they become fully fledged conda-forge feedstocks) and submit a PR that adds a new folder in the recipes directory containing at least meta.yaml and build.sh (and bld.bat if you want Windows packages). Once all CI passes and the PR is merged, a new feedstock repository will be created automatically for you.

There’s a build-locally.py helper script in the repository that is useful in case you don’t want to iterate too much with CI build jobs. It only needs Python (and maybe Docker).

https://conda-forge.org/docs/maintainer/adding_pkgs.html#staging-test-locally

Sorry I missed the ping here. I’m still not sure how Discourse works.
@benbovy is correct, although vendoring is not so much forbidden as “not recommended.” Vendoring can be acceptable if the package is careful enough to isolate the vendored parts and avoid breaking other packages that rely on the same dependencies.

It is usually easier to un-vendor and add them as dependencies.

Hi all,
I recently came across this project: STARE SpatioTemporal Adaptive Resolution Encoding · GitHub
I’m not affiliated and have only basic knowledge of DGGS, but it seems to be very relevant to this discussion. Here are two publications: 1, 2. The second one also specifically mentions Pangeo.

Thanks for the update on STARE. I actually met them myself at the EGU conference last year :+1:

By the way, I see great potential in experimenting with, e.g., Xvec and Xoak in the space of DGGS. Unfortunately, we still haven’t made much progress. We are currently trying to bring DGGRID onto conda-forge.

We are trying to get a demo project off the ground with Tina and some Ocean data.

Hi, I hope the DGGRID package will soon be available on conda-forge. I have been working on the packaging over the last few weeks, but have mostly been waiting on feedback from the package admins. Pull request.

Also, here is a DGGS paper that uses DGGRID and multi-resolution hexagonal ISEA3H grids: Extraction of ocean tidal information based on global equal-area grid and satellite altimeter data

Hi, for everyone interested, I finally got DGGRID building through conda-forge. It seems I even managed to get the Windows build right.

DGGRID Feedstock

This is the latest release of Kevin Sahr’s DGGRID tool. I am now working with him on designing some Python bindings, which should become just as easily installable through conda. In the meantime, dggrid4py and DGGRID should now be nicely usable together in a conda environment.

We also made a dockerized version for command-line use: Package dggrid · GitHub

As a side note, the package is also being developed in Julia:

@allixender This is a nice discussion! I work in the computational hydrology/land surface domain, and we developed a mesh-independent framework (HexWatershed: GitHub - changliao1025/pyhexwatershed: The Python interface to HexWatershed a mesh independent flow direction model for hydrologic models) for some hydrology-related studies. We currently support MPAS meshes as the unstructured mesh, but DGGRID support is on the way. The indexing system is definitely a key factor, as we need to run some simulations on a global scale. We are targeting an AABB tree and Cython for performance improvements.
