I have netCDF Level 3 data in a Ceph object store behind the S3-compatible Ceph Object Gateway (RGW). I would like to explore using Icechunk and VirtualiZarr to manage Zarr views of the data, as described here. I'm having trouble figuring out the right way to set up the reader in VirtualiZarr. By default I get an error suggesting that AWS S3 is assumed:
In [5]: options = {
...: "key": creds["AWS_ACCESS_KEY_ID"],
...: "secret": creds["AWS_SECRET_ACCESS_KEY"],
...: "client_kwargs": {"endpoint_url": creds["AWS_ENDPOINT_URL"]},
...: }
In [6]: vds = vz.open_virtual_dataset("s3://my-radosgw-bucket/myfile.nc", reader_options={"storage_options": options})
...
File .venv/lib/python3.12/site-packages/virtualizarr/utils.py:35, in ObstoreReader.__init__(self, store, path)
31 import obstore as obs
33 parsed = urlparse(path)
---> 35 self._reader = obs.open_reader(store, parsed.path)
GenericError: Generic S3 error: Error performing HEAD https://s3..amazonaws.com/my-radosgw-bucket/myfile.nc in 2.394350046s, after 10 retries, max_retries: 10, retry_timeout: 180s - HTTP error: error sending request
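Reading the traceback, my guess is that ObstoreReader builds its store from the URL alone and so falls back to AWS defaults (hence the empty region in the hostname). I suspect I need to construct the obstore store against my RGW endpoint myself, roughly like the untested sketch below (the keyword names are my guess at obstore's S3Store config options), but I don't see how to hand such a store to open_virtual_dataset:

import obstore as obs
from obstore.store import S3Store

# Untested guess: point an obstore S3Store at the Ceph RGW endpoint explicitly
store = S3Store(
    "my-radosgw-bucket",
    access_key_id=creds["AWS_ACCESS_KEY_ID"],
    secret_access_key=creds["AWS_SECRET_ACCESS_KEY"],
    endpoint=creds["AWS_ENDPOINT_URL"],
    virtual_hosted_style_request=False,  # assuming RGW wants path-style addressing
)
reader = obs.open_reader(store, "myfile.nc")  # mirrors what ObstoreReader does internally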
I think my issue is similar to what is discussed here. However, if I follow Tom's example:
In [15]: vds = vz.open_virtual_dataset(
...: f"{creds['AWS_ENDPOINT_URL']}/my-radosgw-bucket/myfile.nc",
...: backend=HDFVirtualBackend,
...: )
This time I get a 403 Forbidden, and I'm not sure how to pass my S3 credentials through this code path.
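For what it's worth, I also wondered whether the underlying client might pick the credentials up from environment variables, something like the untested sketch below, but I don't know whether the obstore-based reader actually reads these:

import os

# Untested guess: export the credentials and hope the underlying S3 client reads them
os.environ["AWS_ACCESS_KEY_ID"] = creds["AWS_ACCESS_KEY_ID"]
os.environ["AWS_SECRET_ACCESS_KEY"] = creds["AWS_SECRET_ACCESS_KEY"]
os.environ["AWS_ENDPOINT_URL"] = creds["AWS_ENDPOINT_URL"]

vds = vz.open_virtual_dataset(
    f"{creds['AWS_ENDPOINT_URL']}/my-radosgw-bucket/myfile.nc",
    backend=HDFVirtualBackend,
)

Any advice on the right way to wire this up would be very much appreciated!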