Coordinating efforts on Lagrangian filtering for SWOT

Background

This thread was motivated by an exchange on twitter:

This exchange reminded me that that are quite a few people working on the SWOT filtering problem, with a variety of approaches and models.

Within this broader topic, many of us are working on applying “Lagrangian filtering”, a computationally intensive technique for separating waves and other high-frequency motions from the low-frequency background flow. In the oceanography we often use this beautiful 2017 paper by Shakespeare and Hogg as a reference for the method. Earlier uses of this framework go back to Polzin and Lvov (2011) and Nagai et al (2015).

For many SWOT-related projects, the Lagrangian filtering is only the beginning; the end goal is to use theory and / or data-driven methods to remove the IGW signal from the SWOT measurements, leaving behind the transport-relevant part of the flow. However, the Lagrangian filtering step is very computationally intensive, involving modeling a huge volume of particles, applying signal processing in the Lagrangian frame, and then interpolating back to Eulerian space. There is an opportunity to work together to bring Pangeo tools to bear on this problem.

Proposal

I know lots of the groups involved are eager to take a collaborative approach to this problem.
It is very challenging, and perhaps we can be more effective by working together on certain aspects.
Also, there is a practical concern: we have students and postdocs who are expecting to publishing novel research in this direction, and so it may be wise for project mentors to coordinate efforts to minimize accidental duplication / overlap.

Let’s use this forum as a place to openly discuss these related efforts, our scientific goals, and technical approaches. I would love to find new avenues for collaboration and interoperability between our efforts. COVID has forced us all to rethink how we do scientific collaboration, since we are no longer very constrained by geostrophic proximity.

If you’re interested in trying this approach, I’d propose that each group make a post outlining their goals, their data / models, and their computational tools. I’ll start now with a post on behalf of our our SWOT sub-team.

1 Like

NYU / Columbia SWOT Sub-team: Inferring Ocean Transport from SWOT

Our team consists of myself and @cspencerjones at Columbia, plus Shafer Smith and a TBN graduate student at NYU.

Project Goals and Approach

Our project title “Inferring Ocean Transport from SWOT,” summarizes our central aim. Our outlines three specific goals related to this theme:

  • inferring lateral surface velocities from SSH at scales where geostrophy accounts for only part of the signal
  • filtering the transport-active balanced component of the velocity field, in order to give meaningful estimates of the surface eddy kinetic energy and infer lateral stirring rates
  • connecting vertical fluxes of tracers to estimates of transport-active lateral velocities, and ultimately directly to SSH.

Building on earlier (yet still unpublished) work by former student Anirban Sinha, we will be trying out some neutral-network-based approaches to the filtering problem.

Key to all of this is to have a “truth” signal of raw vs. filtered SSH and near-surface velocities. We plan to use model data apply the Lagrangian filtering technique of Shakespeare and Hogg. We would like to produce a quality synthetic dataset that can be used broadly by the SWOT community and beyond, similar to what Nicolas described in his tweet above.

Data / Models

We will be using the MITgcm LLC4320 runs as our testbed. These data include hourly snapshots of 3D velocities and velocities. Eventually, we will also use SWOT simulator to produce synthetic SWOT observations.

We are also doing some experiments in an idealized high-resolution channel setup to test the methods, before moving to the big model.

Tools and Methodology

We are planning to use MITgcm in offline mode to advect the Lagrangian particles on the Pleiades supercomputer. We are doing this because:

  • This operates well with the existing LLC4320 data on Pleiades, requiring minimal data transformation / preparation
  • It scales up very well, since MITgcm is parallelized with MPI. We expect we can handle 100 billion particles.

We will run the particles at 15m depth in 2D. This will also give us an opportunity to follow up on the earlier work by Yu et al. comparing with drifters. (Aurelien Ponte, @selipot, and others were involved in that.)

Once the particle trajectories are output, we will convert them to an analysis optimized format (either Parquet or Zarr). We will analyze them using the Pangeo stack, including xrft for the spectral filtering part. We have not yet figured out the best way to do the regridding.

Once we have produced the raw + filtered datasets, we hope to share them with the community. At that point we will move on to the more scientifically exciting part of assessing different data-driven methods for reconstructing the transport-relevant velocities from SWOT SSH.

Possible Collaboration Points

I think, but am not certain, that there are several other groups trying to apply Lagrangian filtering in a similar way. It would be great to share technical notes, and possibly even code, for doing this. I’d also love to get some of the data in the cloud via Pangeo so we can more fluidly collaborate.

I would also love to be involved in defining “challenge” problems for machine learning, as Nicholas described in his tweet. I think the data we plan to produce will be very appropriate for this. By coordinating efforts, I think we can get more visibility for this from the ML community. For example, we could create an “Ocean ML Hub,” similar to Radiant Earth’s MLHUB:

Most of all, I really want to take a collaborative approach and make sure that we don’t actually get into a situation where we are competing with other teams who I would much rather be working together with. :blush:

1 Like

Looks like a fascinating project @rabernat. I am thinking that the “total” velocity estimates from drifters will become again useful for ground truthing model results or compare to velocity estimates derived from SWOT data? I missed an opportunity to propose that to the SWOT call but maybe for the future.

1 Like

We would absolutely love to collaborate with you on that aspect, although I know you might not be able to devote much time to it without funding. (Perhaps we could even try to find some supplementary funding somewhere for this specific task.)

I heard via email from Nicolas and Julien…they want to get involved but won’t have time to engage much on this forum for at least a few weeks. No rush…we will be standing by whenever folks are ready to discuss.

Thanks a lot Ryan for kick-starting this much-needed coordination effort.

We have a number of ongoing and future activities at Ifremer that are relevant to this.

A Ph.D. student (Zoé) is currently investigating the signature of internal tides on surface Lagrangian trajectories in idealized numerical simulations (that of Nico’s tweet actually).
The study is not using lagrangian filtering per se, but is obviously very connected.
We’re are wrapping up this study this fall and should submit a manuscript by the end of the year.

We are currently producing a dataset of surface drifter trajectories from llc4320 surface velocity fields.
Data production should be concluded by the end of the month.
Technically, this production relies on parcels with dask parallelization on overlapping tiles.
It is our intention to communicate about this in the months to come and investigate whether our development could be useful for the community.
Our intention was also to share this dataset but given the fact that your group @rabernat has access to the full dataset (and can advect drifters at depth vs at the surface for example) and that you are planning on performing Lagrangian filtering, this may be a worthless effort.

This dataset will be used for two studies:

  • Xiaolong (Yu) will be extending his study with the model lagrangian vs model eulerian comparison
  • Zoé will be exploiting the results she obtained in an idealized setting to a realistic one.

Your thread has to my opinion a broader scope than that of Lagrangian filtering alone: that of extracting information from SWOT data and drifter trajectories.
For llc4320 drifter tracks, we are actually not intending to apply Lagrangian filtering even though the processing we are implementing often bear similarities.
I would thus say there are not one but at least two relevant datasets (trajectories vs Lagrangian filtered dataset).

1 Like

@rabernat - thanks for taking the lead on this. A quick update from me on recent work that has been going on – mostly led by Callum Shakespeare. Callum will likely chime in with more information.

The main goal right now is to create a tool which is easier to apply than the original matlab scripts that Callum hacked together in 2017. Angus Gibson has been working on a method based on OceanParcels. It is python-based, freely accessible, has decent documentation and is in the process of being written up in a paper. Have a look here if you are interested:

Right now, we think this algorithm is ready for others to use and give feedback or contribute to the code.

One question is whether a python-based tool is suitable of the LLC4320 type data. I think it might be - because of the method used. In short, the idea is that, for a given time and at every model gridpoint, we seed a particle and advect it forward & backwards by X days, then apply the filter to fields on that Lagrangian particle. This gives filtered output at the time of seeding (but not at any other time). This is well-explained in the documentation. But the point is that you can apply the filter to small regions, and in a per-timestep fashion, which essentially makes it embarrassingly parallel. @rabernat - I would be very interested in seeing if this method could handle the LLC data??

1 Like

Following from Andy’s introduction, I thought I’d just note some of my observations with working in the dask/parcels area, and where things could possibly be improved. These only pertain to the problem of actually obtaining the Lagrangian-space data through particle advection. Once we have these tracks, we can pass them into the filter of our choice, either a spectral method, or using scipy’s signal processing and filter design functionality.

As mentioned, seed a particle at every gridpoint at which we want filtered data. I have played with running the parcels advection through dask workers, but I’m not particularly happy with the way I’m doing it (see here).

In the problem of distributed advection using parcels’ deferred fields, we have two types of tasks: the actual advection tasks, which run some number of advection timesteps for a group of particles; and the loading of chunked data from a source xarray dataset. Focusing first on the advection tasks, they’re quite tightly-integrated: parcels maintains a sparse chunk map of data required to interpolate the particle data. Every time a parcels kernel requests interpolation, it first checks whether it has loaded that chunk, and if not it will force a load (by inserting the numpy array for that chunk into a dictionary, essentially).

So, because of the tight integration of parcels kernel execution, we can’t cleanly use them directly as distributed tasks. I ended up creating the xarray dataset, and the parcels FieldSet as part of the advection task itself, since we can’t share this between workers. It’s pretty likely that we load the same chunk on multiple workers once the particles move from their initial positions.

I think this is where we would be able to get the best improvements: instantiate parcels (and our data) only once, and push the distributed work further down. I thought that perhaps the recent addition of dask-aware interpolation to xarray might help here, but this doesn’t support some of the more complex (e.g. curvilinear, B/C-grid) interpolation present in parcels.

1 Like

Thanks all for this very interesting discussion. Within my team, we don’t work on Lagrangian filtering per se, but of course we do support a lot of the Parcels code development. Really useful to read what limitations in the code you are facing, @angus-g and @apatlpo. Let us know if there’s anything we can help with from our side.

1 Like

Thanks everyone for the replies! I want to note that Julien and Nicolas plan to contribute as well but are currently tied up with other stuff. So we should definitely wait for their inputs.

In the meantime, a few in-line replies from me

Definitely not a worthless effort! But coordinating these sorts of expensive / difficult calculations is precisely the goal of this thread. My view is that multiple implementations using different tools is quite valuable, if only as a check that the methods converge to the same answer. We would be happy to host your data in Pangeo Cloud if / when you are ready to share it with a broader community. That will help facilitate intercomparison. With our Pleiades-based calculations, we could try to compare several different depths, so we can make a direct comparison with the parcels results. But you are also far ahead of us. We don’t even have our Pleiades allocation yet.

Yes, very good points. My thinking is that we would start with the data perparation and move on to the more “interesting” inference problems. It’s useful to know you are not intending to apply Lagrangian filtering, as this is a small but important distinction in scope.

Thanks so much @AndyHoggANU and @angus-g for sharing this. It seems like such a powerful and useful approach. Exactly the sort of tools I’m hoping we can collaborate around.

I want to understand you approach better. My thinking about how to do this problem on a large scale was basically a two-step process:

  • Advect Lagrangian particles and write the particle data to disk
  • Process the particle data (filtering, whatever) and write some new results to disk (e.g. filtered, interpolated velocities)

It sounds to me that you are doing something more clever, which is to essentially do the filtering “online” during the particle advection, such that you never have to write the full Lagrangian dataset to disk, but instead only output the final filtered results. Is this correct?

That approach really only works with parcels, where you can more easily plug in custom python code.

The primary challenge of the LLC data is the complex topology of the connections between faces. That’s really the reason we went with the MITgcm offline advection; because it wrote the velocity data, it can easily read it back and handle this topology. We have also implemented these connections in XGCM; if one wanted to support this type of topology in parcels, that might be a good starting point.

Another advantage of MITgcm offline is that it really does scale and parallelize well on supercomputers.

Since we are stuck using MITgcm, we are also stuck with using the two-step method for now. However, one possible way forward, if we are willing to neglect the arctic, would be to us xmitgcm to transform the data on-the-fly to a rectangular pseudo-lat-lon grid (basically just keep the LL part of LLC). That’s described a bit here:

It’s embarassingly parallel from a computational perspective, but as @angus-g correctly notes, managing the data is another matter. Especially with datasets like LLC4320, it’s likely that I/O is the rate limiting factor. There are lots of things we could experiment with here. One idea is to use a distributed in-memory database like redis to cache the chunks. That way, the expensive I/O operation (read from disk / object storage) only has to be done once.

Another option might be to look at other workflow frameworks, like Prefect, that might be better suited than Dask to this problem.

I’m going to look through the filtering code you shared more closely and perhaps I’ll have some ideas.

Yes, that is correct. This method overcomes a problem that you will soon strike. In the 2D velocity field at a given level, there is divergence and convergence. So, over time, irrespective of how many particles you start with, the particle density thins out in some regions, and you have no Lagrangian trajectories to work with. So, the time over which you integrate Lagrangian trajectories has to be relatively short (I would say weeks, certainly not a year). For this reason, having a smaller number of particles but ensuring their density is even at the mid-point of the trajectory is more efficient…

I will let @angus-g comment on whether his implementation can handle the topology of the LLC grid…

1 Like

@AndyHoggANU @rabernat Just to follow up on Andy’s comment. It is possible to seed the particles in such a way to try and counteract the divergence of the 2D flow with satisfactory results. An example is in our recently published paper here: https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2020JC016106. But, it takes much trial and effort to get the seeding correct, which takes a lot of human time and effort, and the resolution is not guaranteed. Also, you need more particles in total so you can “over-seed” in regions of divergence. Therefore, ultimately, we jettisoned that method in favour of the method Andy describes which guarantees no loss of resolution, and the ability to obtain wave fields at any point in space and time independently. If you have 10 mins spare, I talk about the new method here: https://www.youtube.com/watch?v=VVqCDeL8AcQ&feature=youtu.be

1 Like

Thanks everyone for weighing in so far. It’s great to be having this discussion openly. I think we can implement the forward/backward strategy in MITgcm-offline pretty easily.

University of Toronto (and colleagues) sub-team: internal tides+(sub)mesoscale interactions through SWOT’s lens

Sorry for making you wait and I hope it will be worth it. And thank you Ryan for making this happen. And sorry for the lack of URLs in my references. I cannot have more than two, and had to remove a bunch.

Project Goals and Approach

Our project is mostly funded by the Canadian Space Agency [14SUSWOTTO] and the Natural Sciences and Engineering Research Council of Canada [RGPIN-2015-03684]. Because several grants overlap, our goals have somewhat grown, but the main focus is to address the following question: how can we isolate the internal tide signature present in a given SSH snapshot? This is likely to be important for SWOT data, whose resolution will go beyond mesoscale and mode-2 internal tide resolutions. Consequently, it is likely to access regimes in which the increasing non-stationarity of the scattering (sub)mesoscale flow will make standard internal tide retrieval procedures hard or impossible to apply. And even if my hypotheses eventually prove to be wrong, it will remain a fun question. We are starting to provide (or aiming to; it is a big baby!) partial answers on two fronts (pun intended).

  1. During her post-doc, Han Wang will test a deep-learning approach we have been toying with on-and-off for ~1.5 years to extract the internal tide signal from synthetic SSH snapshots. To do so, we have been leveraging a small numerical dataset with sufficient temporal resolution to extract the tidal signal with frequency filtering, and train the algorithm with both the raw and tidally-filtered SSH snapshots. Although our method is completely different, we are essentially after a product that would look similar to recent work by Torres et al. (2019).
  2. The PhD thesis of Jeffrey Uncu takes a diametrically different approach from the first one: instead of trying to diagnose the product of an interaction we do not understand well, we try to understand very well the process without worrying about applicability to SWOT products in the next few years. In particular, Jeff is studying how internal tides and (sub)mesoscale motions interact. His goal is to have something fundamental to say about the submesoscale range, which as of 2020 appears inaccessible to standard analytical methods (asymptotics, etc). He is mostly using single-layer shallow-water simulations. We expect these interactions to be relevant for the coastal regime, which is rich in submesoscale motions and high-mode internal tides.

Our team

  • Han graduated from NYU last Summer, and was working on the wave-vortex decomposition of Bühler, Callies & Ferrari (2014). In particular, she devised a way to apply it to Lagrangian observations (cue to @selipot), which is in revision, and we have been wondering about ways to make it useful for SWOT on the side. She joined our group in October 2020.
  • Jeff is a Ph.D. candidate in my group, and has started to work on it in 2018.
  • Alice Nuz and Michael Poon were two undergraduate students who actually came up and elaborated on the DL method, Han has been using so far.
  • My role is to try to keep up and be cheerful. I also prepare a half-decent coffee.

We are also collaborating with Hesam Salehipour (soon to be at WHOI) for the technical DL aspects, @apatlpo provided the data we have been working with in Han’s project and with whom we have been discussing theoretical aspects of Jeff’s project since the beginning. Jody Klymak (University of Victoria in BC) is co-PI on the CSA grant, with an observational aspect somewhat outside the scope of this post.

Tools

We are still working on the DL algorithm, which we will make public on GitHub along with whatever publication we submit. Our timeline for submission is in a few months, and because it took us a while to zero in on an architecture, I ask for your patience regarding the specifics.

We have been testing our DL algorithm on re-entrant channel simulations with ROMS, created and provided by Aurélien, and curated by myself (it was the object of the original tweet, available here). This output contains surface expressions of relevant quantities (e.g., SSH, SST, surface vertical vorticity…) and mixed mode-1 internal tide signals with a QG-ish turbulent flow growing out of a baroclinicially unstable jet. Each field comes in three flavours: raw snapshot, internal tide (derived by fitting the signal to tidal-frequency sin/cos functions, and a low-passed equivalent (see Ponte & Klein 2015).

In the future, and provided the method passes enough tests, we intend to apply it to more comprehensive and realistic datasets, such as those that folks in this thread are using (MITgcm, etc.). Some sort of fast-slow decomposition would need to be either available, or doable (i.e., high enough time sampling, but from the looks of it, everyone here seems to be working with such data). We do have access to a nested MITgcm simulation output that Brian Arbic and Dimitris Menemenlis ran in Toronto (I was part of one study that came from this collaboration, see Nelson et al. 2020), but would rather use datasets that are widely used, or have the potential to be.

Jeff is mostly using his brain, single-layer shallow water simulations with Dedalus and scaling arguments.

Possible Collaboration Points

Our DL method will not be directly applicable to SWOT unless we significantly modify it, train it with very realistic data, and uglify it with the SWOT simulator. Each of these steps would be a significant endeavour, and we could use outside help.

As for the project of Jeff, theoretical work is hard to plan, but if people have been thinking about this problem, we’d be happy to share our thoughts. We have found that the dynamics in this regime are extraordinarily rich, and there is no way we could write the same paper twice on this.

And about the data challenges: sure, why not. My only concern is that Han and Jeff are pumped, so, I just want to make sure that everyone is OK with the prospect of losing. :wink:

1 Like

Meant to also tag @JeffreyUncu and @hannnwang :hugs:

1 Like

Thanks Nico for the detailed response. And sorry for the link limits! It’s an anti-spam feature–those limitations should relax after your first post, so you might be able to post some links as replies…

It sounds like your team has made some amazing progress and in many ways is far ahead of the pack. I’m very excited to see your results when they are ready to share! Btw, the aim here is not to pressure people to share unpublished / in-progress results, but simply to try to organize and coordinate more broadly.

One specific question I have about your work question of Eulerian vs. Lagrangian filtering.

My understanding of both Torres’ method and yours is that the “frequency filtering” occurs in the Eulerian frame. Much of our conversation up-thread, however, focused on the importance and technical implementation of filtering in the Lagrangian frame. Are you confident that you have the right “truth” signal with which to be training a DL algorithm if you are not doing Lagrangian filtering? I know you have thought this through quite deeply, so I suspect there is something I’m missing, e.g. a scaling argument which shows why Lagrangian filtering is unnecessary…

Or maybe you are doing Lagrangian filtering after all?

1 Like

Good question.

I cannot speak for Torres et al., and did not think of it when I read their paper tbh. To be completely honest, I had not thought deeply about it in our case, and so, this answer is a first response based on intuition and guesses that I am moderately confident about.

My take on our specific project is that when it comes to tidal signals (M2 specifically, which is significantly higher than f at mid-latitudes) and mesoscale flows, which is what the dataset I published mostly deals with, the frequency separation means that tidal peaks are narrow and stand out over the continuum. In that case, @apatlpo’s fitting of a sine and a cosine over brief periods of time (don’t remember on top of my head if it is 2 days or 2 tidal periods) should be enough.

On the other hand, from what I recall in @CallumJShakespeare and Andy’s (nope, still a new user :smiley: ) paper is that the Lagrangian filtering is crucial to extract near-inertial signals embedded in intensely turbulent flows. In the context of someone who is interested in transport, it makes sense to take this approach.

And of course, my simple explanation will break down when Rossby numbers increase, and the temporal footprint of the turbulence will encroach on the tidal domain while presumably broadening tidal peaks due to the strong interactions Jeff is studying (note that at this point, I am speculating and extrapolating from what I know about near-inertial waves). The latitude might also play a role.

But your question was a tad more specific: how do I know, which I interpret as, do I have a metric to quantify my hand-waving? I actually don’t, only various smell tests. This is why I hope that Han’s practical approach and Jeff’s theoretical approach to feed on one another and provide some answers. And participating in this discussion. :wink: Because as you noticed, the Eulerian way is fairly common. If at the end of the day, it turns out that we were wrong all along, or wrong in some regimes, we will know by comparing our results. And if it happens, I would rather be in the room (this one?) that day.

Note that if working off of more complete fields, the volume of literature to separate waves from the rest is dizzying, mostly Eulerian, and poorly interconnected. There is the Bühler/Callies/Ferrari method, Eden’s group in Hamburg have been playing with other methods based on projecting on the polarization relations of the waves (e.g., Chouskey et al. 2018), the McGill group have their own ideas, there is this Whitehead & Wingate (2014) paper that has been on my to-read list for years, and which seems to also project fields onto sets of eigenvectors. And each of these papers is a massive piece of work, and I am forgetting many more groups. I don’t know if anyone has a clear idea of who uses what and why, and whose method is better for which regime, or with which type of data. But if someone else here has this sort of bird’s eye view, I’d be happy to know.

4 Likes

Sounds like someone should write a review paper on separation methods :wink:

2 Likes