Hiya amazing Pangeo folk, I’m working at UKCEH (UK Centre for Ecology & Hydrology, quite a mouthful…!) and a group of us are getting to grips with object/cloud storage and parallel cloud compute on large datasets.
As part of this we’ve been experimenting with getting some of our existing datasets onto AWS object storage and learning (oftentimes the hard way!) what does and doesn’t work with the tools, techniques, best practices etc. We now have a sort-of workflow set up, but it is very disconnected and not straightforward. Ultimately we want to create a straightforward, easy-to-follow & adaptable workflow and guide to help others in our organisation and further afield get on board with this new way of working and these new tools.
I’ve been lurking on the edge of Pangeo for a while now, slowly reading up on and around things, and today I started looking at Pangeo Forge in earnest. It seems to do essentially what we want: create a straightforward recipe for converting an existing dataset to be ARCO and making it publicly accessible. The one thing I’m a bit stuck on is how the compute resource works. I can see that conversion jobs can be submitted to bakeries, but I can only see one bakery listed on Pangeo-Forge Bakeries, and it seems to have jobs that have been pending for months. Is it still operational? Are there other bakeries that could be used? And if not, is it possible/plausible to run recipes on some of our own compute resource? (After all, up to this point we have been doing everything locally!)
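For concreteness, here’s roughly what I understand such a recipe to look like: a minimal sketch assuming the current Beam-based `pangeo-forge-recipes` API, where the file URLs, store name, chunk sizes, and output path are all placeholders for one of our datasets. If running on our own kit is an option, I gather a recipe like this could be executed locally with Beam’s DirectRunner (as at the end) instead of being submitted to a bakery:

```python
# Minimal sketch using the Beam-based pangeo-forge-recipes API (>= 0.10).
# All URLs, names, and chunk sizes below are placeholders, not a real dataset.
import apache_beam as beam
from pangeo_forge_recipes.patterns import pattern_from_file_sequence
from pangeo_forge_recipes.transforms import OpenURLWithFSSpec, OpenWithXarray, StoreToZarr

# Describe the source files and how they combine (here: concatenated along "time").
pattern = pattern_from_file_sequence(
    [f"https://example.org/data/precip_{year}.nc" for year in range(2000, 2021)],
    concat_dim="time",
)

recipe = (
    beam.Create(pattern.items())
    | OpenURLWithFSSpec()                        # fetch each source file
    | OpenWithXarray(file_type=pattern.file_type)
    | StoreToZarr(
        target_root="./output",                  # local path; could be object storage via fsspec
        store_name="precip.zarr",
        combine_dims=pattern.combine_dim_keys,
        target_chunks={"time": 120},             # placeholder chunking
    )
)

# Run on our own machine with Beam's local DirectRunner,
# rather than submitting the job to a bakery.
with beam.Pipeline() as p:
    p | recipe
```

Is that a sensible way to think about it, or is the bakery route still the intended path for making the outputs publicly accessible?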
But that aside, Pangeo and its Forge seem like fantastic resources and I’m excited to get more into them over the coming months!!