I think we should have options in the function:
The simplest case would be binning using xhistogram (pros: fast, would work on non-monotonic target coordinates). If I read the documentation of xhistogram, this would not need much more work except for maybe a wrapper that bins all fields of a model dataset in the same manner according to a target DataArray. @rabernat, am I seeing this correctly or does xhistogram still need anything for this functionality?
The more involved case consists of two steps (I am sourcing these from the MOM6 code:
- Interpolation and creation of a new grid based on target criterion(e.g. density).
- Remapping of quantities between old grid and new grid.
The grid is defined as an array of depth values z_old
(these are the cell boundaries). We can then define a new grid of depth points (which could be defined by any tracer that is monotonic in depth, e.g. density) z_new
.
z_old_(0) z_old_(1) z_old_(n-1) z_old_(n)
|-------------------------|--------------.....-------------|---------------------------|
<--------------------->
dz_old
z_new_(0) z_new_(1) z_new_(n-1) z_new_(n)
|---------------|--------------------......-----------|---------------|
<----------- >
dz_new
Now comes the fun part. The interpolation defines us new cells (consecutive pairs of z_new
values a ‘left’ boundary z_L
and a ‘right’ boundary z_R
.
There are three principal cases listed in the MOM6 code:
The target cell is entirely contained within a cell of z_old.
This situation is represented by the
following schematic :
z_old_(n+m) z_old_(n+m+1)
----|-----o--------o----------|-------------
z_L z_R
<------>
dz_new
The target cell spans at least two cells of the source grid.
This situation is represented by the following schematic:
z-old_(n-1) . z_old_(n) z_old_(n+m) z_old_(n+m+1)
----|---------------o---|----------- ... ---------|---o-------------------|-------------
z_L z_R
<----------------------------->
dz_new
They mention a third case where z_L = z_R
, but I dont think that is relevant to offline regridding (correct me if I am wrong, please).
Now we need to remap the values between z_L
and z_R
and construct a new spacing coordinate dz_new
. This can be done in a variety of ways, but I think as a first attempt it is sufficient to assume th values in the ‘subcells’ (the ones split by zL
or zR
) have the same tracer concentration as the source cell.
The challenge will be to code this efficiently, and perhaps there is a nifty way of using xhistogram and weights to do this? We should look at xesmf to get some inspiration. I remember seeing somewhere that this calculation was achieved using a large matrix of weights?