I’ve successfully followed the NetCDF to Zarr Sequential Pangeo Forge tutorial on an AWS-hosted JupyterHub with the source NetCDFs hosted in a private S3 bucket. This workflow involves caching each file locally before performing the conversion. Is there a way to read the NetCDFs directly from S3 in a recipe and bypass the caching step (without deploying a Bakery)?
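For context, this is the kind of direct, cache-free read I have in mind — a minimal sketch with a hypothetical bucket and key, assuming `s3fs` is installed and using the `h5netcdf` engine (which accepts file-like objects):

```python
import fsspec
import xarray as xr

# Credentials for the private bucket are picked up by s3fs from the
# usual AWS environment variables or config files.
with fsspec.open("s3://my-private-bucket/netcdf/data-0000.nc") as f:
    ds = xr.open_dataset(f, engine="h5netcdf")
    print(ds)
```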
Reminder to self: RTM. For anyone else who ends up here, set `cache_inputs=False` when creating the recipe:

```python
from pangeo_forge_recipes.recipes import XarrayZarrRecipe

recipe = XarrayZarrRecipe(
    pattern,
    inputs_per_chunk=100,
    cache_inputs=False,  # open inputs directly instead of caching them first
)
```
Yes, it is definitely possible. You should be able to just return `s3://` or `https://` URLs in your `FilePattern`. Did `cache_inputs=False` solve your problem?
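To illustrate, here is a rough sketch of a `FilePattern` that returns `s3://` URLs (the bucket name, file layout, and number of files are all hypothetical):

```python
from pangeo_forge_recipes.patterns import ConcatDim, FilePattern

def make_url(time):
    # Hypothetical bucket and naming scheme; the point is that the
    # format function returns an s3:// URL rather than a local path.
    return f"s3://my-private-bucket/netcdf/data-{time:04d}.nc"

pattern = FilePattern(
    make_url,
    ConcatDim(name="time", keys=list(range(10)), nitems_per_file=1),
)
```

For a private bucket, `s3fs` still needs credentials; by default it picks them up from the standard AWS environment variables or config files.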
It sure did! Thanks for checking @rabernat.