Hi,
For more context see further down, but my question is essentially this:
If I do a xr.open_mfdataset
on a bunch of netcdf files, then some rechunking, then a ds.to_zarr
- will xarray read and write one variable at a time (as a worst case, if rechunking means we need an entire variable in memory at a given time)?
But that’s not how it seems to work. Could anyone bother looking through my rapid calculation and tell me where I’m wrong, and maybe even what I could improve?
Thanks in advance!
I had a job running overnight that did essentially this:
with xr.open_mfdataset(netcdf_files) as ds:
ds = ds.chunk({'siglay_dim': 34, 'siglev_dim': 35, 'time': 24*120, 'node': 100, 'nele': 100})
ds.to_zarr('path/to/zarr')
where I’d reserved
#SBATCH --ntasks=4 --cpus-per-task=1 --mem-per-cpu=32G
=128 GB for the job which I thought was plenty, but it still got killed b/c it ran out of memory.
The files contain 24 time steps each, are about 9.9 GB, in total something like 6TB.
The individal variable in each file is about 45 MB, so for 600 files we’re talking about
600*ds.temp.nbytes/1024**2
= 27 GB, which I though is smaller than 32 GB per task so I should be good.
<xarray.Dataset>
Dimensions: (nele: 28173, node: 14658,
siglay_dim: 34, siglev_dim: 35,
three: 3, time: 24, maxnode: 11,
maxelem: 9, four: 4)
Coordinates:
x (node) float32 4.881e+05 ... 4.839e+05
y (node) float32 7.636e+06 ... 7.624e+06
lon (node) float32 0.0 0.0 0.0 ... 0.0 0.0 0.0
lat (node) float32 0.0 0.0 0.0 ... 0.0 0.0 0.0
lonc (nele) float32 0.0 0.0 0.0 ... 0.0 0.0 0.0
latc (nele) float32 0.0 0.0 0.0 ... 0.0 0.0 0.0
* time (time) float32 5.812e+04 ... 5.812e+04
z (siglay_dim, node) float32 -0.01611 ......
zlev (siglev_dim, node) float32 0.0 ... -5.003
Dimensions without coordinates: nele, node, siglay_dim, siglev_dim, three,
maxnode, maxelem, four
Data variables: (12/87)
nprocs int32 640
partition (nele) int32 535 535 535 ... 214 214 214
xc (nele) float32 4.881e+05 ... 4.84e+05
yc (nele) float32 7.636e+06 ... 7.624e+06
siglay_center (siglay_dim, nele) float32 -0.0008845 ....
siglev_center (siglev_dim, nele) float32 0.0 ... -1.0
... ...
Attributes: ...