New Working Group for Distributed Array Computing

We’re forming a new working group to explore parallel computing frameworks that are alternatives to Dask, such as Apache Beam, cubed, and Ramba.

I propose we try a bi-weekly, hour-long meeting, starting next week. If you’re interested, please fill out this Doodle poll with the times that work for you.

See also:

I don’t know if I’ll be able to make an hour every other week, but I’ll try. 30 minutes would be more doable.

Maybe an hour was too much - let’s start by meeting at the best time from the poll and see what length we think is sufficient.

The time selected is every other Monday at 1pm ET. The first meeting will be Monday, Sept. 19.

The working group has been added to the calendar and published on the Pangeo website: Meeting Schedule and Notes — Pangeo documentation

The pangeo-data/distributed-array-examples repository on GitHub is where we’re collecting examples of challenging distributed array computations.

Ah sorry to have missed it! (the one time I don’t check notifications on a weekend!)

I added some brief comments in red to the notes.

Good to keep us on our toes.

As a heads-up, the Dask folks at Coiled are collecting workloads for large-scale benchmarking to help inform development. We’ve gotten significantly more data-driven in the last few months. I think that James Bourbeau is aware of the array examples repo Tom points to above. If anything else arises that would help to guide things in Dask-land, please speak up.

Thanks all

Ugh…I was expecting a notification of the selected time the same way I got the initial email so sorry I missed the first one. I’ll be there for the next one. Thanks for posting the notes.

Reminder that we have another meeting at 1pm ET TODAY (so in 45 minutes’ time).

Today we had a great intro to Ramba from Todd and Babu, and an overview of cubed from Tom White - thanks everyone!

On the 31st October we’re going to have a presentation on Arkouda from Scott Bachman!

Here is the link I promised in the meeting about standardizing a partitioning API.

Reminder that this will happen again today at 1pm EST!

We’re on again today at 1pm EST, when we will have a talk by Frank Schlimbach on the partitioning API!

Today (at 1pm EST) we have Deepak talking about his dask algorithms for groupby problems using his package flox!

We don’t have a speaker today, so unless someone has something they would like to discuss with the group or present at the last minute, I think we should cancel today and pick it back up after Christmas! That would make the next meeting 9th January.

Sounds good to me.

Perhaps a good topic for next time would be “What’s needed for a full end-to-end workflow with cubed/Ramba” organized around your xarray+cubed PR?

What are the plans for this working group in the new year? @everyone

Hey @Todd_Anderson! Thanks for the reminder. I got completely distracted by conferences / workshops over the last few weeks.

“What’s needed for a full end-to-end workflow with cubed/Ramba” organized around your xarray+cubed PR?

I could talk about this if people are interested? It’s not finished, but it would be useful information and should start some useful discussion about API standards etc.

IMO this should be our goal (and a great SciPy talk :wink: )
