Dask Summit 2021

Hi All!

Dask is organizing a user summit (cheekily called the dask distributed summit due to its remote nature). We’d love to have Pangeans present work.

For context, Pangeo has historically run into many problems slightly before other domain science groups, and so talks from folks here tend to be highly informative for the other Dask user communities.

Additionally, this year we’re organizing workshops, which are curated sets of talks along a theme. I think that a Pangeo workshop at the Dask User Summit would have high value.

I wrote up some thoughts on this here: Dask User Summit 2021

The broader conference page is at https://summit.dask.org

Thanks for posting Matt! This looks like a lot of fun.

Would you be interested in a talk on Rechunker?

Sure! I think that that is a great example of what this is about. I’ve recommended that tool to other folks in other communities that had rechunking issues.

I would also like to encourage some Pangean to propose a workshop. This might be a good place to get pangeo folks together to talk about scale challenges with more of an earth science flair than might otherwise be typical at a dask conference.

The workshop sounds like a great idea! I personally don’t have the bandwidth to organize it. Any volunteers? Maybe @jbusecke, @paigem, or @andersy005?

I’m happy to (co)-chair a pangeo session if anyone else is interested and wants help.

Dask Summit sounds like a great event! I’d be interested in helping organize a Pangeo session, though I’m still pretty new to a lot of these tools so I’d be bringing in more of a new user’s perspective.

I think that a new user’s perspective would be welcome :slight_smile:

My guess is that Tom can help support with technical expertise if necessary.

Great! And I just noticed you give a special shout-out to new dask users in your blog post @mrocklin - thanks for the encouragement! :blush:

It sounds like our workshop will be organized around the theme of Pangeo, i.e. geoscience use cases of dask. Reading from the dask summit website, we also need to decide on a format for the workshop: series of talks, panel discussion, working session, etc. My first thought is that a series of talks might be best to highlight a few different dask use cases within the session.

Beyond the topic and format of the workshop, I’m not entirely clear what else we need to prepare for a proposal. @mrocklin I assume we should provide a brief summary and description of the workshop in our proposal, as required for talks and tutorials?

@TomAugspurger, @paigem … I’m happy to help organize this session. Please let me know how I can help. Meanwhile, I’ll start reaching out to some potential speakers :slight_smile:

Thanks Anderson. I’ll send you and Paige an email to get things started. If anyone else wants in on organizing then let me know.

Hi @mrocklin, I’m always interested in learning more about Dask. I’d like to get a little group together here (UTC+8), but I’m not sure which time zone you’ll be working in. Is there a proposed timetable?

Hi @NickMortimer !

It’s a great question and it has come up in other domains. We’re trying to organize groups of talks together in order to give subcommunities a space to engage. Scheduling is more about these groups of talks than it is about the conference as a whole. If you can find a small set of related talks that you think would make a good Pangeo-Oceania/APAC session then by all means, let’s have two. I think that it would be great to have full time-zone coverage.

Hi @NickMortimer - the good news is they’re planning to work out a timetable based on where everyone is, the bad news is you might need to organize a bunch of other people to make an Asia-Pacific cluster.

I’m in life sciences, not geosciences, but I did talk to a bunch of geoscience people interested in Dask at last year’s PyConAU. I’m happy to put you in touch if that isn’t already the case.

Also, if pangeo wants to go for full time-zone coverage, I think Matt’s suggestion of finding talks for an APAC session is close but probably not the right approach. Talks are great but more passive than interactive. So it might make more sense to have that content be available asynchronously, then spend synchronous time on questions and interactive discussion.

I’ve been thinking about how to run a remote-first workshop for the life sciences, and what we think we’ll do is:

  • Have pre-recorded talks, made available a week before the summit
  • Asynchronous text chat (ideally available both before and after the summit dates)
  • Synchronous interactive discussions, at times that are friendly for overlaps with USA/Oceania, Europe/Oceania, USA/Europe.

OK, I’ve had a quick in-house chat and will put my hand up to get something up and running in Australia. @GenevieveBuckley, I’m just watching your SciPy 2019 talk, "Microscopium: Interactive Exploration of Large Imaging Datasets" (YouTube). It would be great to work with you to get this going. I’m based in Perth. @mrocklin, what’s the next step to progress the project?

I’m very glad to hear it! Instructions to propose a workshop are at https://summit.dask.org/present/#guidelines . You would then in parallel start to assemble folks that you think would make a good workshop. The program committee is happy to help with this if necessary.

Also, it goes without saying, but it’s nice to see more than one effort in APAC timezones. Hopefully this helps to build a critical mass.

@NickMortimer I can’t figure out how to send you a DM (maybe I don’t have permissions for that).

Can you share your contact details with me here: Dask life science contact form

I’ll email you & we can go from there. Proposals for the Dask Summit close this Sunday or Monday (March 21st, I’ll check the timezone). I’m very happy to help you get a proposal together by then.

Hi everyone,

Are there talks from European users scheduled somewhere? I think that for French users, a session on Dask (and Pangeo) on HPC would be nice, but I’m not sure I’ve got the bandwidth to organize it. I’ve got several use cases in mind, though, that trigger some specific problems with tuning Dask.

Anybody interested? @fbriol @apatlpo @auraoupa @lesteve @jlesommer @PhenoloBoy ?

Hi @geynard and everyone,
With @auraoupa, we have some material on a performance analysis of Dask-based analyses on HPC systems. We are considering submitting an abstract for the summit. Having a session on Dask on HPC would be great indeed. Limited bandwidth for organising this here too, unfortunately…