Best way to wrap scipy.correlate in time

itss · July 16, 2022, 9:20pm

Hi,

I hope I am using this forum correctly.
I am struggling with something that should be simple: calculating a correlation.
But I have more than 300 GB of data, which makes it tricky.

I have a large array A with dimensions (time, kx, ky) = (400, 4096, 4096). The first thing I would like to do is compute the autocorrelation in time for each (kx, ky) pair, but I would also like to do the same for cross-correlations, since I actually have many arrays like A that I want to correlate with each other.

To give an example, I will create an arbitrary, smaller array A that is basically a damped cosine in time plus noise whose amplitude scales with the magnitude of k. In this example the fake dataset is held in memory, but in my real case I load the datasets from netCDF files using xr.open_dataset.

import xarray as xr
import numpy as np

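# Coordinates of the small synthetic example
time = xr.DataArray(np.arange(200), dims=["time"])
kx = xr.DataArray(np.arange(1000), dims=["kx"])
ky = xr.DataArray(np.arange(1000), dims=["ky"])

# Wavenumber magnitude |k| on the (kx, ky) grid
K = np.abs(kx + 1j*ky)

# Damped cosine in time plus noise scaled by |k|; broadcasts to (time, kx, ky)
A = (
    (np.cos(time)*np.exp(-time/50))+
    ((np.random.randn(time.size)*xr.ones_like(time)*K/2e3))
).assign_coords(time=time, kx=kx, ky=ky)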

Does anyone have an idea of the fastest way to wrap scipy.signal.correlate so that I can pass any two arrays like A and get back an array C(lag, kx, ky) without loading everything into memory at once?

The fastest approach I have found is to select and load each kx, loop over ky to pull out each time series, run scipy.signal.correlate on it, and store the result in a numpy.array that later becomes my xarray.DataArray object C(lag, kx, ky).
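
For concreteness, the loop described above might look roughly like the sketch below. The function name correlate_by_kx is hypothetical, and it assumes both inputs share the dims (time, kx, ky) with equal sizes and that mode="full" is the desired lag range; the original post does not show the actual code.

```python
import numpy as np
import xarray as xr
from scipy import signal


def correlate_by_kx(a, b):
    """Lagged correlation along "time" for every (kx, ky), one kx slab at a time.

    Hypothetical helper: a sketch of the described loop, not the poster's code.
    """
    n_time = a.sizes["time"]
    # Lags (in samples) produced by mode="full": -(n-1), ..., 0, ..., n-1.
    lags = np.arange(-(n_time - 1), n_time)
    out = np.empty(
        (lags.size, a.sizes["kx"], a.sizes["ky"]),
        dtype=np.result_type(a.dtype, b.dtype),
    )

    for i in range(a.sizes["kx"]):
        # Load only one kx slab, shape (time, ky), into memory.
        a_slab = a.isel(kx=i).transpose("time", "ky").values
        b_slab = b.isel(kx=i).transpose("time", "ky").values
        for j in range(a_slab.shape[1]):
            out[:, i, j] = signal.correlate(a_slab[:, j], b_slab[:, j], mode="full")

    return xr.DataArray(
        out,
        dims=["lag", "kx", "ky"],
        coords={"lag": lags, "kx": a["kx"], "ky": a["ky"]},
    )
```

Note that the output array C is still held fully in memory here; only the inputs are read slab by slab.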

dcherian · July 17, 2022, 4:58pm

See if xskillscore has something you can use.

How is your data stored on disk?

itss · July 17, 2022, 5:12pm

I will take a look. For now, each A is stored as a single netCDF (.nc) file.

aaronspring · July 17, 2022, 7:17pm

xarray.corr or xskillscore.pearson_r
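
A note on scope: xarray.corr returns the Pearson correlation coefficient along a dimension, which corresponds to zero lag only. If the full lag axis is needed, one possible approach (an assumption, not something proposed in this thread) is to wrap scipy.signal.correlate with xr.apply_ufunc. The helper name lagged_correlate below is hypothetical, and the sketch uses small in-memory arrays.

```python
import numpy as np
import xarray as xr
from scipy import signal


def lagged_correlate(a, b, dim="time"):
    """Apply scipy.signal.correlate along `dim` for every other index.

    Hypothetical helper; assumes `a` and `b` have the same length along `dim`.
    """
    n = a.sizes[dim]

    def _corr_1d(x, y):
        # x and y are 1-D series along `dim`; the result has length 2*n - 1.
        return signal.correlate(x, y, mode="full")

    c = xr.apply_ufunc(
        _corr_1d,
        a,
        b,
        input_core_dims=[[dim], [dim]],
        output_core_dims=[["lag"]],
        vectorize=True,  # loop _corr_1d over all non-core dims (kx, ky)
        # For dask-backed inputs (with `dim` in a single chunk) one would
        # additionally pass, for example:
        #   dask="parallelized", output_dtypes=[a.dtype],
        #   dask_gufunc_kwargs={"output_sizes": {"lag": 2 * n - 1}}
    )
    # apply_ufunc puts the new core dim last; move it first and label the lags.
    c = c.assign_coords(lag=np.arange(-(n - 1), n))
    return c.transpose("lag", ...)


rng = np.random.default_rng(0)
a = xr.DataArray(rng.standard_normal((16, 3, 2)), dims=["time", "kx", "ky"])
b = xr.DataArray(rng.standard_normal((16, 3, 2)), dims=["time", "kx", "ky"])

r0 = xr.corr(a, b, dim="time")  # zero-lag Pearson r, dims (kx, ky)
c = lagged_correlate(a, b)      # unnormalized correlation, dims (lag, kx, ky)
```

With vectorize=True the per-series call is still a Python-level loop, so this is mainly a convenience; the gain on large data would come from chunking kx and ky with dask so that only a few chunks are in memory at a time.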
