Hello, I am looking for an efficient way to extract several slices, given as a list, from a DataArray and stack them along a new dimension. An example shows it best:
import numpy as np
import xarray as xr

# Create the data array
data = xr.DataArray(
    np.random.rand(5, 6),
    dims=["time", "variable"],
    coords={
        "time": np.arange(5),
        "variable": np.arange(6),
    },
)
# Define slices
slices = [slice(0, 2), slice(1, 3), slice(2, 4)]  # Same-length, contiguous slices
# Select the data for each slice
sliced_data = [
    data.isel(time=slc).assign_coords(time=np.arange(slc.stop - slc.start))
    for slc in slices
]
# Concatenate along the new dimension 'window_dim'
result = xr.concat(sliced_data, dim='window_dim')
print(result)
Output:
<xarray.DataArray (window_dim: 3, time: 2, variable: 6)>
array([[[0.33547378, 0.67330893, 0.69904389, 0.88787631, 0.26807342,
         0.07760665],
        [0.78355031, 0.6135081 , 0.75868513, 0.16590802, 0.71739294,
         0.42383822]],

       [[0.78355031, 0.6135081 , 0.75868513, 0.16590802, 0.71739294,
         0.42383822],
        [0.01768034, 0.5773279 , 0.09635795, 0.0637734 , 0.63216361,
         0.78761642]],

       [[0.01768034, 0.5773279 , 0.09635795, 0.0637734 , 0.63216361,
         0.78761642],
        [0.4377235 , 0.42413106, 0.16612197, 0.1085243 , 0.35388582,
         0.47942606]]])
Coordinates:
  * time      (time) int64 0 1
  * variable  (variable) int64 0 1 2 3 4 5
Dimensions without coordinates: window_dim
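I am wondering how to do this without the for loop or xr.concat(). I have tried using rolling() on the data as follows. Note that this snippet refers to total_steps and to y and x dimensions, none of which appear in the example above: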
rolling_data = data.rolling(time=len(data.time)-total_steps, center=False).construct('window_dim')
data2 = rolling_data.transpose('window_dim', 'time', 'variable', 'y', 'x').isel(time=slice(len(data.time)-total_steps-1,None))
However, when I select a single index or a slice of indices from the result and convert it to NumPy before passing it to my machine learning model, it takes forever. My idea is therefore to build the rolling indices myself and use them to extract the data, but as you can see this requires a for loop, which I would like to avoid.
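One loop-free direction that matches the "rolling indices" idea is xarray's vectorized indexing: passing a 2-D integer DataArray to isel() replaces the indexed dimension with the indexer's dimensions. The following is only a minimal sketch, assuming all windows have the same length. The names window, starts, idx and the dimension name step are illustrative and do not come from the code above, and whether this is faster on real data has not been measured here.

```python
import numpy as np
import xarray as xr

# Same toy array as in the example above
data = xr.DataArray(
    np.random.rand(5, 6),
    dims=["time", "variable"],
    coords={"time": np.arange(5), "variable": np.arange(6)},
)

window = 2                    # assumed fixed window length
starts = np.array([0, 1, 2])  # assumed start position of each window

# 2-D integer indexer: row i holds the time positions of window i,
# i.e. [[0, 1], [1, 2], [2, 3]] for the values above
idx = xr.DataArray(
    starts[:, None] + np.arange(window)[None, :],
    dims=["window_dim", "step"],
)

# Vectorized indexing: "time" is replaced by the indexer's dimensions.
# The transpose only makes the intended axis order explicit.
result = data.isel(time=idx).transpose("window_dim", "step", "variable")

print(result.dims, result.shape)
```

The result holds the same values as the loop plus xr.concat() version, with the position inside each window on the step dimension instead of a re-labelled time dimension. For strictly contiguous windows, numpy.lib.stride_tricks.sliding_window_view on data.values may be another option, since it returns a view rather than a copy.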