Guidance on using Zarr on HPC (chunking, multi-file layouts, and node scaling for optimal CPU/memory use)

Hi all,
I’m working on an HPC workflow where we store large gridded datasets (SST-like 3D arrays, dimensions roughly time × 3600 × 7200) in Zarr format on a parallel file system. I’m looking for pointers to documentation, examples, or “rules of thumb” for using Zarr efficiently on HPC systems (POSIX/Lustre/GPFS, not cloud object storage).
Concretely, I’m trying to understand:
• How to choose chunk sizes (e.g. time-major vs. space-major, target chunk byte size such as 100–500 MB) so that Zarr + Dask run efficiently on multi-node clusters without overloading worker memory or the metadata servers.
• How to decide between a single large Zarr store and multiple Zarr stores (e.g. one per year or month), for both performance and manageability on HPC.
• How to reason about how many nodes/workers to use for a given Zarr layout (chunk size, number of files, total dataset size) so that CPU utilization stays high while per-worker memory stays within limits, especially when using dask-jobqueue or similar.
• Any known pitfalls or best practices for Zarr on HPC file systems (e.g. inode limits from many small chunks, when to use ZipStore or consolidate_metadata, when to move to larger 100–500 MB chunks).
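For context on the chunk-size question, this is the back-of-envelope arithmetic I've been using so far (the 100–500 MB target is just the rule of thumb I've seen quoted, not something I've verified; the helper names and the 40-year example are mine):

```python
# Sketch of the arithmetic behind picking a chunk shape for a
# time x 3600 x 7200 float32 array (itemsize 4 bytes).

DTYPE_BYTES = 4  # float32

def chunk_nbytes(shape, itemsize=DTYPE_BYTES):
    """Bytes occupied by one uncompressed chunk of the given shape."""
    n = itemsize
    for s in shape:
        n *= s
    return n

def n_chunks(array_shape, chunk_shape):
    """Total number of chunk objects a (non-sharded) Zarr array will create."""
    total = 1
    for a, c in zip(array_shape, chunk_shape):
        total *= -(-a // c)  # ceiling division
    return total

# One full spatial field per time step: 3600 * 7200 * 4 B
print(chunk_nbytes((1, 3600, 7200)) / 1e6)   # ~103.7 MB
# Four time steps per chunk lands inside the 100-500 MB window:
print(chunk_nbytes((4, 3600, 7200)) / 1e6)   # ~414.7 MB
# Hypothetical 40 years of daily data -> chunk objects on disk:
print(n_chunks((14600, 3600, 7200), (4, 3600, 7200)))  # 3650
```

So with spatially-complete chunks I end up with a few thousand objects per variable, which seems filesystem-friendly, but I don't know how this trades off against time-series access patterns.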
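On the node/worker-sizing question, here is the rough heuristic I currently apply before filling in a dask-jobqueue cluster config. The "several chunks of memory headroom per worker" factor is my own assumption, not an official guideline, and `plan_workers` is a name I made up:

```python
def plan_workers(node_mem_gb, node_cores, chunk_mb, headroom=6):
    """
    Split one node into Dask workers so that each worker's memory is at
    least `headroom` x the chunk size (my rule of thumb, not official).
    Returns (processes_per_node, threads_per_worker, mem_per_worker_gb).
    """
    min_worker_gb = headroom * chunk_mb / 1024
    procs = max(1, min(node_cores, int(node_mem_gb // min_worker_gb)))
    threads = max(1, node_cores // procs)
    return procs, threads, node_mem_gb / procs

# Example: a 128-core, 256 GB node with ~415 MB chunks
procs, threads, mem_gb = plan_workers(256, 128, 415)
print(procs, threads, round(mem_gb, 2))  # 105 workers, 1 thread, ~2.44 GB each

# I then feed these into the job script, roughly:
#   dask_jobqueue.SLURMCluster(cores=128, processes=procs, memory="256GB")
# but I have no idea whether this reasoning matches what others do on Lustre/GPFS.
```

In particular I'd like to know whether people deliberately leave far more headroom than this when rechunking or doing reductions.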
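And to make the inode concern concrete, a stdlib-only sketch of why I'm worried about directory stores: one filesystem object per chunk versus a single zip container (which, as I understand it, is what zarr's ZipStore wraps). The file names and sizes here are fake stand-ins, not a real Zarr store:

```python
import os
import tempfile
import zipfile

tmp = tempfile.mkdtemp()

# Simulate a tiny directory store: 100 "chunk" files named 0.0.0, 0.0.1, ...
store_dir = os.path.join(tmp, "demo.zarr")
os.makedirs(store_dir)
for i in range(100):
    with open(os.path.join(store_dir, f"0.0.{i}"), "wb") as f:
        f.write(b"\x00" * 16)

# 100 separate objects for the metadata server to track:
print(len(os.listdir(store_dir)))  # 100

# Pack the same chunks into one zip: a single inode, same keys inside.
zip_path = os.path.join(tmp, "demo.zarr.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    for name in sorted(os.listdir(store_dir)):
        zf.write(os.path.join(store_dir, name), arcname=name)

with zipfile.ZipFile(zip_path) as zf:
    print(len(zf.namelist()))  # still 100 chunk keys, but one file on disk
```

At our real scale this is millions of chunk files, so I'd love to hear when people switch to ZipStore (read-mostly archives?) versus just using bigger chunks.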
If there are existing Pangeo docs, tutorials, or discussion threads that cover “Zarr on HPC best practices” or present benchmark results (e.g. recommended chunk sizes / auto-chunk settings for POSIX), links would be very helpful.