VirtualiZarr with S3-compatible object storage

I have netCDF level 3 data in a Ceph object store, accessed through the S3-compatible Ceph Object Gateway. I would like to explore using Icechunk and VirtualiZarr to manage Zarr views of the data, such as described here. I’m having trouble figuring out the right way to set up the reader in VirtualiZarr. By default I get an error which suggests AWS S3 is assumed:

In [5]: options = {
   ...:     "key": creds["AWS_ACCESS_KEY_ID"],
   ...:     "secret": creds["AWS_SECRET_ACCESS_KEY"],
   ...:     "client_kwargs": {"endpoint_url": creds["AWS_ENDPOINT_URL"]},
   ...: }

In [6]: vds = vz.open_virtual_dataset("s3://my-radosgw-bucket/myfile.nc", reader_options={"storage_options": options})

...
File .venv/lib/python3.12/site-packages/virtualizarr/utils.py:35, in ObstoreReader.__init__(self, store, path)
     31 import obstore as obs
     33 parsed = urlparse(path)
---> 35 self._reader = obs.open_reader(store, parsed.path)

GenericError: Generic S3 error: Error performing HEAD https://s3..amazonaws.com/my-radosgw-bucket/myfile.nc in 2.394350046s, after 10 retries, max_retries: 10, retry_timeout: 180s  - HTTP error: error sending request

I think my issue is similar to what is discussed here. However, if I follow Tom’s example:

In [15]: vds = vz.open_virtual_dataset(
    ...:     f"{creds['AWS_ENDPOINT_URL']}/my-radosgw-bucket/myfile.nc",
    ...:     backend=HDFVirtualBackend,
    ...: )

I get a 403 and I’m not sure how to pass my S3 credentials. Any advice would be very much appreciated!

Hi @zdgriffith - you’ll get better responses to these types of usage questions either by raising an issue on the VirtualiZarr GitHub page, or by asking in our community Slack channel.

ceph object store using the s3-compatible ceph object gateway

This should be totally possible with obstore, but VirtualiZarr is currently trying to be too clever by auto-inferring the AWS region, which fails because your S3 is not AWS. We’re currently working on avoiding this kind of too-clever footgun by making configuration of the obstore store the users’ responsibility instead. (See issue #561, "Fragility of url auto-parsing logic", and pull request #601, "Refactor codebase to support a new simplified Parser->ManifestStore model." by sharkinsspatial, both in zarr-developers/VirtualiZarr on GitHub.)
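As a rough sketch of what configuring the obstore store yourself could look like for a non-AWS endpoint: the helper names below (`s3_store_config`, `split_s3_url`, `open_ceph_reader`) are made up for illustration, and the config keys (`endpoint`, `access_key_id`, `secret_access_key`, `virtual_hosted_style_request`) are assumed to match what your installed obstore version accepts for `S3Store`, so check them against the obstore docs. This only shows reading the object through obstore directly; it does not show how to hand the store to VirtualiZarr, which is what the refactor above is meant to settle.

```python
from urllib.parse import urlparse


def s3_store_config(creds: dict) -> dict:
    """Map AWS-style credential names onto (assumed) obstore S3Store config keys."""
    return {
        # Point at the Ceph gateway explicitly so no AWS region/host is inferred.
        "endpoint": creds["AWS_ENDPOINT_URL"],
        "access_key_id": creds["AWS_ACCESS_KEY_ID"],
        "secret_access_key": creds["AWS_SECRET_ACCESS_KEY"],
        # Assumption: the gateway is addressed path-style (endpoint/bucket/key).
        "virtual_hosted_style_request": False,
    }


def split_s3_url(url: str) -> tuple[str, str]:
    """Split 's3://bucket/path/to/key' into ('bucket', 'path/to/key')."""
    parsed = urlparse(url)
    return parsed.netloc, parsed.path.lstrip("/")


def open_ceph_reader(url: str, creds: dict):
    """Open a file-like reader for one object (hypothetical helper)."""
    # Imported lazily so the pure-Python helpers above work without obstore.
    import obstore as obs
    from obstore.store import S3Store

    bucket, key = split_s3_url(url)
    store = S3Store(bucket, config=s3_store_config(creds))
    return obs.open_reader(store, key)
```

With the `creds` dict from the question, `open_ceph_reader("s3://my-radosgw-bucket/myfile.nc", creds)` would be a quick way to confirm that the endpoint and credentials work outside of VirtualiZarr. If the gateway is served over plain `http://`, the store will probably also need a client option allowing HTTP.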

To make it work today with your non-AWS store you may have to reach into the code and remove this auto-parsing logic manually. Happy to help with that.

Thanks Tom! Didn’t know you had a slack channel, I’ll join that. I’ll try removing the auto-parsing logic and reach out if I hit a snag.
