Hi @Peter_Marsh - thanks for your help! This is very useful and exactly what I was looking for. And thank you for your work on the kerchunk project as a whole.
I got the first version of your code working with no problems. Creating the virtual dataset is faster than concatenating the xarray datasets.
I tried to get the second snippet you posted working, but ran into a problem here:
mzz = MultiZarrToZarr(flist,
    remote_protocol='s3',
    remote_options={'anon':True},
    coo_map={'ensemble' : ex},
    concat_dims = ['ensemble'],
    identical_dims = ['feature_id', 'reference_time', 'time'],
)
out = mzz.translate()
I get:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Input In [40], in <cell line: 8>()
1 mzz = MultiZarrToZarr(flist,
2 remote_protocol='s3',
3 remote_options={'anon':True},
(...)
6 identical_dims = ['feature_id', 'reference_time', 'time'],
7 )
----> 8 out = mzz.translate()
File ~/python3/miniconda3/envs/rain2/lib/python3.10/site-packages/kerchunk/combine.py:394, in MultiZarrToZarr.translate(self, filename, storage_options)
392 """Perform all stages and return the resultant references dict"""
393 if 1 not in self.done:
--> 394 self.first_pass()
395 if 2 not in self.done:
396 self.store_coords()
File ~/python3/miniconda3/envs/rain2/lib/python3.10/site-packages/kerchunk/combine.py:200, in MultiZarrToZarr.first_pass(self)
198 z = zarr.open_group(fs.get_mapper(""))
199 for var in self.concat_dims:
--> 200 value = self._get_value(i, z, var, fn=self._paths[i])
201 if isinstance(value, np.ndarray):
202 value = value.ravel()
File ~/python3/miniconda3/envs/rain2/lib/python3.10/site-packages/kerchunk/combine.py:150, in MultiZarrToZarr._get_value(self, index, z, var, fn)
148 o = selector[index]
149 elif isinstance(selector, re.Pattern):
--> 150 o = selector.match(fn).groups[0] # may raise
151 elif not isinstance(selector, str):
152 # constant, should be int or float
153 o = selector
TypeError: 'builtin_function_or_method' object is not subscriptable
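For what it's worth, the same error seems reproducible outside kerchunk with a plain `re` call. Below is a minimal sketch, assuming `ex` is a compiled regex with one capture group; the pattern and the path are made up for illustration and are not the real `ex` or `flist` entries. It shows that subscripting `groups` without calling it raises this TypeError, while `groups()[0]` returns the captured text.

```python
import re

# Hypothetical pattern and path, for illustration only; they stand in for
# the real `ex` and one entry of `flist`.
ex = re.compile(r".*mem(\d+)")
fn = "s3://example-bucket/mem3/refs.json"

match = ex.match(fn)

# `groups` is a method on the match object. Subscripting the method itself,
# as on line 150 of the traceback, raises the TypeError.
try:
    match.groups[0]
    error_message = None
except TypeError as err:
    error_message = str(err)
print(error_message)

# Calling the method first returns a tuple of captured groups.
ensemble_value = match.groups()[0]
print(ensemble_value)  # 3
```

If that is what is happening, the problem may be in line 150 of combine.py rather than in my arguments, but I may be misreading the traceback.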
I ran this in a separate env where I made sure I had the latest packages, and everything appears to work up to this point. Can you help me out again?