Hey Pangeo community! I met some of you at SciPy 2022, and several people recommended that I post my question here.
In short, I’m looking for big environmental data sets to analyze. I’m finishing the third year of my postdoc in statistics at UT Austin, and I’m working on (among other things) statistical data assimilation.
Together with my advisor, Matthias Katzfuss, I have developed several scalable methods for filtering and smoothing big spatio-temporal data sets. Check out my Google Scholar profile for the references.
Unfortunately, I don’t know too many earth scientists and I often struggle to find data to apply our methods to (and illustrate our papers with a real-life use case). It would also be great to talk to some potential users, to find out what kinds of methods/features they need most.
So far, most of my work has been on filtering or smoothing data within the context of a state space model (not necessarily Gaussian or linear). But in these models one needs to know some physics in order to derive the evolution equation. This is what we have struggled with. We found a lot of models that were too complicated for us to use or, at the other end of the spectrum, ones that were exceedingly simple (close to a random walk). Ideally, we would like something in between: moderately complicated but interesting.
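For concreteness, here is a rough sketch of the kind of state space model I have in mind. The notation is purely illustrative and not taken from any particular paper: $x_t$ is the latent state, $y_t$ the observations, $\mathcal{M}_t$ the evolution operator that would come from the physics, $\mathcal{H}_t$ the observation operator, and $w_t$ the model error.

```latex
\begin{aligned}
x_t &= \mathcal{M}_t(x_{t-1}) + w_t, \\
y_t \mid x_t &\sim p\bigl(y_t \mid \mathcal{H}_t(x_t)\bigr).
\end{aligned}
```

The "exceedingly simple" end of the spectrum corresponds to $\mathcal{M}_t$ being the identity, so that $x_t = x_{t-1} + w_t$ is a random walk. What we are after is an $\mathcal{M}_t$ with more structure than that but still tractable.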
Anyway, the tools we developed (based on Gaussian process approximations) can be applied to purely spatial problems too. If you think this is something we could collaborate on, let me know!
If you’re interested, you can find more information on my website: home or just email me at marcin.jurek@austin.utexas.edu