What's Next - Software - In-Memory Performance

This is part of a follow-on conversation from the “What’s next for Pangeo” discussion that took place 2023-12-06. It is part of the overarching software topic.

To help achieve good performance in Pangeo, we should consider accelerating the slow parts of our stack. This might be done in several ways:

  1. Rust (this is what was mentioned during the call) or other low-level options (C, C++, Numba)
  2. Smarter algorithms
  3. General old-fashioned tuning

I’ll propose that before we can dedicate effort here, we probably want to do some profiling on common workloads. Do we understand where bottlenecks are?
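As a starting point, here is a minimal sketch of profiling one workload with the standard library's `cProfile`. The functions `load_data`, `compute`, and `workload` are hypothetical stand-ins that I made up for illustration (the sleep only imitates storage latency); a real run would call an actual Pangeo workflow instead.

```python
import cProfile
import io
import pstats
import time


def load_data(n):
    # Hypothetical stand-in for storage access (e.g. a read from object storage).
    time.sleep(0.05)
    return list(range(n))


def compute(values):
    # Hypothetical stand-in for the in-memory part of the workload.
    return sum(v * v for v in values)


def workload():
    return compute(load_data(100_000))


def profile(func):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


result, report = profile(workload)
print(report)
```

Sorting by cumulative time makes it easy to see whether the read step or the compute step dominates. For Dask-based workloads, a distributed-aware profiler would probably be a better fit than `cProfile`, which only sees the calling process.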


My experience looking at holistic performance in the cloud for large-scale dataframe computations is that S3 access is the primary bottleneck to consider. Other parts of the stack could be 10x slower than what the machine is capable of and we wouldn’t really notice.

I think that profiling and benchmarking would be useful here. I would encourage this community to assemble a set of representative small-scale benchmarks that can help inform future development work.
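To make that concrete, one such small-scale benchmark might time the read stage and the compute stage separately, so we can see what share of the wall-clock time each one takes. This is only a sketch: `time_stage`, `fake_read`, and `fake_compute` are hypothetical names, and the sleep is a placeholder for real storage access rather than a measurement of anything.

```python
import statistics
import time


def time_stage(func, *args, repeats=5):
    """Return (last result, median wall-clock seconds) over `repeats` runs."""
    durations = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        durations.append(time.perf_counter() - start)
    return result, statistics.median(durations)


def fake_read(n):
    # Placeholder for a storage read; the sleep imitates access latency.
    time.sleep(0.02)
    return list(range(n))


def fake_compute(values):
    # Placeholder for the in-memory computation.
    return sum(values) / len(values)


data, read_s = time_stage(fake_read, 50_000)
mean, compute_s = time_stage(fake_compute, data)

total = read_s + compute_s
print(f"read:    {read_s:.4f} s ({read_s / total:.0%} of total)")
print(f"compute: {compute_s:.4f} s ({compute_s / total:.0%} of total)")
```

Reporting the median over a few repeats keeps a single slow run from skewing the result. If the read share dominates on real workloads, that would suggest speeding up the compute stage won't change much end to end.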