Unable to open binary data with xm.open_mdsdataset

Hello,
I recently tried to open MITgcm binary data using the xmitgcm package in IPython. However, I am getting an AssertionError and I have no idea where it is coming from. An example temperature data file is named like this:
TT_in_R.0007730880.data
TT_in_R.0007730880.meta

When I use the low-level utilities to look inside the files (in IPython), I get the following:

TOTF = mds.rdmds('TT_in_R.0007730880')
TOTF
array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       ...,
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]])

Similarly for the .meta file:

TOTm = mds.readmeta('TT_in_R.0007730880.meta')
TOTm
((1999, 1999),
 [0, 0],
 [1999, 1999],
 [7730880],
 None,
 None,
 {'nDims': [2],
  'format': ['float32'],
  'nrecords': [1],
  'dimList': [1999, 1999]})

However, when I try to open the same file with xm.open_mdsdataset, I get the following error:

datatot= xm.open_mdsdataset(data_dir='/home/PROGRAMS/MLHB_DATA',prefix=['TT_in_R'],iters=[7730880],delta_t=90,ref_date='1979-01-15 00:00')

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-47-b8dba3ae9da7> in <module>
----> 1 datatot= xm.open_mdsdataset(data_dir='/home/sofi/Documents/POSTDOC_ATHENS/PROGRAMS/MLHB_DATA',prefix=['TT_in_R'],iters=[7730880],delta_t=90,ref_date='1979-01-15 00:00')

/opt/anaconda3/3.8.10/envs/MHWENV/lib/python3.9/site-packages/xmitgcm/mds_store.py in open_mdsdataset(data_dir, grid_dir, iters, prefix, read_grid, delta_t, ref_date, calendar, levels, geometry, grid_vars_to_coords, swap_dims, endian, chunks, ignore_unknown_vars, default_dtype, nx, ny, nz, llc_method, extra_metadata, extra_variables)
    271                 return ds
    272 
--> 273     store = _MDSDataStore(data_dir, grid_dir, iternum, delta_t, read_grid,
    274                           prefix, ref_date, calendar,
    275                           geometry, endian,

/opt/anaconda3/3.8.10/envs/MHWENV/lib/python3.9/site-packages/xmitgcm/mds_store.py in __init__(self, data_dir, grid_dir, iternum, delta_t, read_grid, file_prefixes, ref_date, calendar, geometry, endian, ignore_unknown_vars, default_dtype, nx, ny, nz, llc_method, levels, extra_metadata, extra_variables)
    583         for p in prefixes:
    584             # use a generator to loop through the variables in each file
--> 585             for (vname, dims, data, attrs) in \
    586                     self.load_from_prefix(p, iternum, extra_metadata):
    587                 # print(vname, dims, data.shape)

/opt/anaconda3/3.8.10/envs/MHWENV/lib/python3.9/site-packages/xmitgcm/mds_store.py in load_from_prefix(self, prefix, iternum, extra_metadata)
    658 
    659         try:
--> 660             vardata = read_mds(basename, iternum, endian=self.endian,
    661                                llc=self.llc, llc_method=self.llc_method,
    662                                extra_metadata=extra_metadata, chunks=chunks)

/opt/anaconda3/3.8.10/envs/MHWENV/lib/python3.9/site-packages/xmitgcm/utils.py in read_mds(fname, iternum, use_mmap, endian, shape, dtype, use_dask, extra_metadata, chunks, llc, llc_method, legacy)
    205     # get metadata
    206     try:
--> 207         metadata = parse_meta_file(metafile)
    208         nrecs, shape, name, dtype, fldlist = \
    209             _get_useful_info_from_meta_file(metafile)

/opt/anaconda3/3.8.10/envs/MHWENV/lib/python3.9/site-packages/xmitgcm/utils.py in parse_meta_file(fname)
     49     needed_keys = ['dimList', 'nDims', 'nrecords', 'dataprec']
     50     for k in needed_keys:
---> 51         assert k in flds
     52     # transform datatypes
     53     flds['nDims'] = int(flds['nDims'])

AssertionError: 

I suspect that this AssertionError has something to do with how I define the prefix. I have also tried running the command with only data_dir and prefix specified.
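
One way to narrow this down, independent of the prefix: the traceback stops at the `assert k in flds` loop over `needed_keys = ['dimList', 'nDims', 'nrecords', 'dataprec']`, while the readmeta output above lists 'format' but no 'dataprec'. Below is a minimal sketch that checks which of those keys are missing from a .meta file's text. The helper name `missing_meta_keys` and the example .meta contents are hypothetical; the contents are only reconstructed from the readmeta output, so the real file may be laid out differently.

```python
import re

# Keys that the traceback shows parse_meta_file asserting on.
NEEDED_KEYS = ['dimList', 'nDims', 'nrecords', 'dataprec']


def missing_meta_keys(meta_text, needed=NEEDED_KEYS):
    """Return the needed keys that never appear as `key = ...` in meta_text.

    Hypothetical helper for illustration; it is not part of xmitgcm.
    """
    present = set(re.findall(r'(\w+)\s*=', meta_text))
    return [k for k in needed if k not in present]


# ASSUMPTION: example .meta contents reconstructed from the readmeta output
# in this post (note 'format' where 'dataprec' would be expected).
example_meta = """
 nDims = [   2 ];
 dimList = [
  1999,    1, 1999,
  1999,    1, 1999
 ];
 format = [ 'float32' ];
 nrecords = [     1 ];
 timeStepNumber = [    7730880 ];
"""

print(missing_meta_keys(example_meta))
```

To run it on the real file, pass the file's text instead, e.g. `missing_meta_keys(open('TT_in_R.0007730880.meta').read())`. If 'dataprec' is reported missing, that would be consistent with the assertion failing on the .meta contents rather than on the prefix.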

In addition, I also tried adding a "." at the end of the prefix and got the following:

datatot= xm.open_mdsdataset(data_dir='/home/PROGRAMS/MLHB_DATA',prefix=['TT_in_R.'],iters=[7730880],delta_t=90,ref_date='1979-01-15 00:00')
datatot
<xarray.Dataset>
Dimensions:  (time: 1, XC: 2000, YC: 2000, XG: 2000, YG: 2000, Z: 50, Zp1: 51, Zu: 50, Zl: 50)
Coordinates: (12/29)
    iter     (time) int64 7730880
  * time     (time) datetime64[ns] 2001-02-01
  * XC       (XC) >f4 30.01 30.02 30.03 30.04 30.05 ... 0.0 0.0 0.0 0.0 0.0
  * YC       (YC) >f4 10.01 10.02 10.03 10.04 10.05 ... 0.0 0.0 0.0 0.0 0.0
  * XG       (XG) >f4 30.01 30.02 30.03 30.04 30.05 ... 0.0 0.0 0.0 0.0 0.0
  * YG       (YG) >f4 10.01 10.02 10.03 10.04 10.05 ... 0.0 0.0 0.0 0.0 0.0
    ...       ...
    hFacC    (Z, YC, XC) >f4 dask.array<chunksize=(50, 2000, 2000), meta=np.ndarray>
    hFacW    (Z, YC, XG) >f4 dask.array<chunksize=(50, 2000, 2000), meta=np.ndarray>
    hFacS    (Z, YG, XC) >f4 dask.array<chunksize=(50, 2000, 2000), meta=np.ndarray>
    maskC    (Z, YC, XC) bool dask.array<chunksize=(50, 2000, 2000), meta=np.ndarray>
    maskW    (Z, YC, XG) bool dask.array<chunksize=(50, 2000, 2000), meta=np.ndarray>
    maskS    (Z, YG, XC) bool dask.array<chunksize=(50, 2000, 2000), meta=np.ndarray>
Data variables:
    *empty*
Attributes:
    Conventions:  CF-1.6
    title:        netCDF wrapper of MITgcm MDS binary data
    source:       MITgcm
    history:      Created by calling `open_mdsdataset(grid_dir=None, iters=[7...

It seems that it reads all the grid files but finds no variable to read.
Does anyone have any ideas?
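
A possible explanation for the empty dataset, sketched under one assumption: that file names are built as prefix + "." + zero-padded 10-digit iteration, which matches the names TT_in_R.0007730880.data and TT_in_R.0007730880.meta shown above. Under that assumption, a prefix ending in "." would point at names with a double dot, which do not exist. The helpers `expected_files` and `existing` are hypothetical and only illustrate the naming check; they do not call xmitgcm.

```python
import os
import tempfile


def expected_files(data_dir, prefix, iternum):
    """Build .data/.meta paths, ASSUMING prefix + '.' + 10-digit iteration."""
    base = os.path.join(data_dir, f"{prefix}.{iternum:010d}")
    return [base + ".data", base + ".meta"]


def existing(paths):
    """Return only the paths that exist on disk."""
    return [p for p in paths if os.path.exists(p)]


# Demo in a temporary directory holding empty stand-ins for the two files.
with tempfile.TemporaryDirectory() as d:
    for name in ("TT_in_R.0007730880.data", "TT_in_R.0007730880.meta"):
        open(os.path.join(d, name), "w").close()

    found_plain = existing(expected_files(d, "TT_in_R", 7730880))
    found_dot = existing(expected_files(d, "TT_in_R.", 7730880))

# Number of files matched by each prefix variant.
print(len(found_plain), len(found_dot))
```

Pointing `expected_files` at the real data directory with each prefix variant would show whether the trailing-dot prefix matches any file at all.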

Thank you in advance for your time and help,
Sofi

Hi Sofi,
Thanks for your question. Would you mind posting it on the xmitgcm GitHub issue tracker? That is the best place to get support for this.

OK, thanks. I just did.
