Hi all,
I just came across the Pangeo Forge project and am very interested in deploying this setup on-premise in our institute.
We host a bunch of climate data for our users, which we also modify/unify.
I plan to base the ETL process on Pangeo Forge recipes and schedule them via prefect.io (which we also just started using for our data pipelines). We also host an S3 object store on premise (NetApp StorageGrid). The compute could happen either on our SLURM-based HPC (DaskExecutor) or on a K8s cluster that will be up next spring (hopefully)…
It seems we have all the components in place to replicate the AWS/Azure bakery setup, but on premise, no?
Is there any info or guidance for setting this up locally? Would there be major showstoppers that I just don’t see?
Cheers and keep up the great work…
Christian