open_mfdataset
You’re right that we can and should do better here. Help would be appreciated! I’m interested to know what @jrbourbeau has been doing.
Query optimization
Understood. For anyone who wants to follow this, the relevant GitHub issue is “Integrate Dask Arrays properly” (Issue #446, dask-contrib/dask-expr).
Bounded-Memory Rechunking
Sounds great! Looking forward to it.
Benchmarking
“coiled” in the name it’s pretty open. Benchmarks are triggered by GitHub Actions, the source is open, people outside of Coiled have write permissions, and it benchmarks projects that aren’t Dask.
Okay, that is very useful to know. I think I was operating under an unexamined assumption that this had to be a separate effort. I will talk to @tomwhite about that.
Do you personally have time to do some work here? Maybe @dcherian from Earthmover? Maybe someone else?
I don’t know yet.
Merge groups
Sure? What is the concrete step that’s being proposed here? Are you asking someone in particular to attend a regular meeting?
I guess I’m re-extending the invitation for anyone who wants to come to the distributed arrays working group meetings (one-off or regular), and clarifying there are multiple common performance questions that we could be collaborating on as part of that group. I’m also happy to organise special meetings to discuss the above common topics.