Do tropical cyclones from HighResMIP simulations behave consistently with prior stochastic TC emulators?
Scientific Motivation
Most GCMs are unable to represent observed characteristics of tropical cyclones (TCs), e.g. frequency, intensity, and spatial patterns. This is largely due to an inability to resolve physics at the scale necessary to generate TCs. Thus, TCs are underrepresented in GCMs. At the same time, because of the large socioeconomic costs of TCs, there is great interest in understanding how their behavior (and resulting economic losses) will change in the future. To gain insight, researchers have developed stochastic emulators of TCs, based on historical relationships observed between TCs and variation in large-scale environmental factors caused by, e.g., ENSO. Such models have been validated against historical observations, which cover a fairly short record, but their performance in future climates distinct from those seen in past data is largely unknown. Nevertheless, they are used to project changes in many socioeconomically relevant measures, such as storm surge heights, maximum wind speeds, and resulting economic damages.
The models in the HighResMIP simulations are approaching the resolution at which TC dynamics should be reasonably resolved. Using these models as a physically-based reference, we may be able to validate some of the future tendencies of statistical TC emulations. This could greatly affect the validity of emulator-based future projections of socioeconomic impact from changing TCs.
Proposed Hacking
In this project, I propose to build infrastructure to begin comparing the catalogue of tropical cyclones contained within HighResMIP to that of statistical emulators. Note that much of the HighResMIP model output does not yet appear to be available on the ESGF repository, but several models are available, and we can begin creating a scalable framework to ingest other models as they are uploaded.
Hacking could include the following (trying to keep the 3-day agenda fairly short and sweet, since it will inevitably expand into more work than anticipated):
- adapting TempestExtremes (or another tracking algorithm) to identify TCs within the CMIP6 models. My experience with this tool is limited, but I’m guessing that appropriately tuning this could be a fairly significant endeavor.
- setting up scalable processing in the Pangeo environment for TE across the output from various models
- developing a suite of relevant summary statistics resulting from these TE outputs (e.g. frequencies, intensities, cumulative rainfall distributions by basin and decade)
- (potentially) comparing these distributions to those from a statistical emulator (see data needs below)
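As a rough illustration of the last two items, the sketch below collapses a table of track points into per-basin, per-decade summaries and computes a two-sample Kolmogorov-Smirnov statistic between two samples of lifetime-maximum wind speed. It is a minimal sketch under stated assumptions: the column names (`storm_id`, `basin`, `year`, `wind_ms`) are hypothetical and do not reflect the actual TempestExtremes output format, the choice of the KS statistic is only one possible comparison metric, and the numbers are made up for demonstration.

```python
import numpy as np
import pandas as pd


def summarize_tracks(tracks: pd.DataFrame) -> pd.DataFrame:
    """Storm counts and lifetime-maximum wind statistics by basin and decade.

    Assumes one row per track point with hypothetical columns:
    storm_id, basin, year, wind_ms.
    """
    # Collapse each storm to one row: basin and year of its first
    # track point, plus its lifetime-maximum wind speed.
    per_storm = (
        tracks.groupby("storm_id")
        .agg(
            basin=("basin", "first"),
            year=("year", "first"),
            max_wind=("wind_ms", "max"),
        )
        .reset_index()
    )
    per_storm["decade"] = (per_storm["year"] // 10) * 10

    # Aggregate storms within each basin/decade cell.
    return (
        per_storm.groupby(["basin", "decade"])
        .agg(
            n_storms=("storm_id", "count"),
            mean_max_wind=("max_wind", "mean"),
            peak_max_wind=("max_wind", "max"),
        )
        .reset_index()
    )


def ks_statistic(a, b) -> float:
    """Two-sample KS statistic: the largest gap between empirical CDFs."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # Evaluate both empirical CDFs at every observed value.
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


# Made-up track points standing in for tracker output.
gcm_tracks = pd.DataFrame(
    {
        "storm_id": [1, 1, 2, 2, 3],
        "basin": ["NA", "NA", "NA", "NA", "WP"],
        "year": [2001, 2001, 2015, 2015, 2003],
        "wind_ms": [30.0, 45.0, 25.0, 35.0, 50.0],
    }
)
summary = summarize_tracks(gcm_tracks)
print(summary)

# Made-up lifetime-maximum winds for a GCM and an emulator catalogue.
print(ks_statistic([35.0, 45.0, 50.0], [40.0, 55.0, 60.0]))
```

Reducing each storm to a single row before grouping keeps the counts in units of storms rather than track points. The same per-storm table could be built from emulator tracks so that both catalogues pass through identical summary code.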
Anticipated Data Needs
- 3-hourly (daily may potentially suffice) u, v, sea level pressure, and rainfall from as many HighResMIP models as possible
- comparison TC tracks from statistical emulator. Note that I have many of these from an ongoing project, but I am not yet sure if I will be able to share/use these during the hackathon. It’s possible that we may need to develop the GCM processing infrastructure during the hackathon and pursue the full comparison as part of a longer-term effort incorporating other collaborators.
Anticipated Software Tools
- TempestExtremes
- dask
- xarray
- scikit-learn and/or statsmodels and/or other statistical python packages
Desired Collaborators
Anyone interested in TCs and/or socioeconomic impacts, anyone wishing to work with a variable-tracking algorithm like TempestExtremes, and anyone experienced with data processing in dask/xarray.