Dear community,
I am facing strange behaviour with rechunk.
<xarray.Dataset>
Dimensions: (y: 900, x: 900, time: 720)
Coordinates:
lat (y, x) float64 dask.array<chunksize=(225, 225), meta=np.ndarray>
lon (y, x) float64 dask.array<chunksize=(225, 225), meta=np.ndarray>
* time (time) datetime64[ns] 2006-04-01T00:45:00 ... 2006-04-30T2...
Dimensions without coordinates: y, x
Data variables:
precipitation (time, y, x) float32 dask.array<chunksize=(45, 113, 113), meta=np.ndarray>
This is the head of my Zarr archive, which is stored in an S3 bucket.
I tried two different target_chunks:
{'precipitation': {'y': 10, 'x': 10, 'time': 720},
'lat': {'y': 10, 'x': 10},
'lon': {'y': 10, 'x': 10},
'time': 720}
{'precipitation': {'y': 10, 'x': 10, 'time': 720},
'lat': {'y': 10, 'x': 10},
'lon': {'y': 10, 'x': 10},
'time': None}
Both of them cause the time axis to collapse to a single repeated timestamp:
<xarray.Dataset>
Dimensions: (y: 900, x: 900, time: 720)
Coordinates:
lat (y, x) float64 dask.array<chunksize=(10, 10), meta=np.ndarray>
lon (y, x) float64 dask.array<chunksize=(10, 10), meta=np.ndarray>
* time (time) datetime64[ns] 2006-04-01T00:45:00 ... 2006-04-01T0...
Dimensions without coordinates: y, x
Data variables:
precipitation (time, y, x) float32 dask.array<chunksize=(720, 10, 10), meta=np.ndarray>
I was able to generate a working example for you:
import xarray as xr
import pandas as pd
import numpy as np
import zarr
import rechunker
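# Build a small dataset with a 6-hourly time coordinate and write it to Zarr.
time = pd.date_range("2000-01-01", freq="6H", periods=365 * 4)
ds = xr.Dataset({"foo": ("time", np.arange(365 * 4)), "time": time})
# consolidated=True so that zarr.open_consolidated below finds the metadata.
ds.to_zarr('test.zarr', consolidated=True)
group = zarr.open_consolidated('test.zarr', mode="r")
# Request chunks of 10 along time for "foo"; the target for the "time" array is None.
# The object returned by rechunk is bound to _ and not used further in this example.
_ = rechunker.rechunk(
    group, {'foo': {'time': 10}, 'time': None}, '1GB', 'rechunked.zarr', temp_store='tmp.zarr'
)
zarr.convenience.consolidate_metadata('rechunked.zarr')
# Re-open the rechunked store to inspect the time coordinate.
ds = xr.open_dataset('rechunked.zarr', engine='zarr', consolidated=True)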
ds will then contain the same timestamp repeated 365 * 4 times.
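To make the symptom easy to check, here is a minimal sketch that counts how many distinct timestamps a time axis holds. The helper name summarize_time_axis is hypothetical (it is not part of xarray or rechunker), and the two arrays are synthetic stand-ins built with NumPy rather than values read from the stores above.

```python
import numpy as np


def summarize_time_axis(times):
    """Return (number of values, number of distinct values) for a 1-D time axis.

    Hypothetical helper for illustration; not part of xarray or rechunker.
    """
    values = np.asarray(times)
    return values.size, np.unique(values).size


# Synthetic stand-in for the original axis: 365 * 4 distinct 6-hourly steps.
expected = np.datetime64("2000-01-01T00") + np.arange(365 * 4) * np.timedelta64(6, "h")
# Synthetic stand-in for the symptom: one timestamp repeated along the axis.
collapsed = np.full(365 * 4, expected[0])

print(summarize_time_axis(expected))
print(summarize_time_axis(collapsed))
```

On the reproduction above, the same check could be run as summarize_time_axis(ds.time.values); the two numbers should match if the time coordinate survived the rechunking.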
I hope there is a simple solution and that I am just doing something wrong. Otherwise, I hope we can fix this soon.
Best regards
Daniel