On-premise Pangeo Forge Bakery setup?

Hi all

I just came across the Pangeo Forge project and am very interested in deploying this setup on-premise in our institute.

We host a bunch of climate data for our users, which we also modify/unify.

I plan to base the ETL process on Pangeo Forge recipes and schedule them via prefect.io (which we also just started using for our data pipelines). We also host an S3 object store on premise (NetApp StorageGRID). The compute could happen either on our SLURM-based HPC (DaskExecutor) or on a K8s cluster that will be up next spring (hopefully)…
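
To make the storage part concrete, here is a minimal sketch of how I would expect to point fsspec/s3fs at an S3-compatible on-premise store. The endpoint URL, credentials, and the helper name `s3_storage_options` are placeholders I made up for illustration, not anything from Pangeo Forge itself; the only convention I rely on is that s3fs accepts `key`, `secret`, and `client_kwargs` with an `endpoint_url`.

```python
import os


def s3_storage_options(endpoint_url, key=None, secret=None):
    """Build s3fs-style storage options for an S3-compatible store.

    Hypothetical helper: falls back to the standard AWS environment
    variables when no explicit credentials are passed.
    """
    return {
        "key": key or os.environ.get("AWS_ACCESS_KEY_ID"),
        "secret": secret or os.environ.get("AWS_SECRET_ACCESS_KEY"),
        # s3fs forwards client_kwargs to the underlying S3 client,
        # which is where a non-AWS endpoint is configured.
        "client_kwargs": {"endpoint_url": endpoint_url},
    }


# Placeholder endpoint and credentials (assumptions, not our real setup).
opts = s3_storage_options(
    "https://s3.example-institute.org",
    key="demo-key",
    secret="demo-secret",
)

# With s3fs installed, these options could then be used like this:
# import fsspec
# fs = fsspec.filesystem("s3", **opts)
# fs.ls("some-bucket")
```

If this works as I expect, the same options dictionary should be usable anywhere fsspec takes `storage_options`, so the recipe targets would not need to know they are talking to a non-AWS store.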

It seems we have all the components in place to replicate the AWS/Azure bakery setup, but on premise, no?

Is there any info or guidance for setting this up locally? Would there be major showstoppers that I just don’t see?

Cheers and keep up the great work


Christian, this is a great idea. We fully hope to enable this sort of activity: a custom Bakery running on your own infrastructure.

However, the project is not yet in a place where we can give much guidance on what to do. We are still sorting out the orchestration aspects of the Pangeo Forge federation. We don’t really have any bakeries operational at this point! Right now the automation model is pretty tied to Prefect, so you would probably need to get a Prefect agent running.

Please feel free to join the next Pangeo Forge coordination meeting to discuss this topic.