[Pasted here by request when talking about kerchunk…]
I’d like to take this moment to explain the difference between “parallelism” and “concurrency” and why we are doing both. This distinction comes up a lot and is often misunderstood, but it is very satisfying once it clicks.
“Parallelism” (preemptive multitasking; in our stack, usually dask) is when multiple CPU cores are doing work at the same time. It can loosely also refer to multiple threads on a single core, but in that case only one thread is actually running at any instant (and in CPython, the GIL enforces this for pure-Python code). If you have 8 cores, you might get 8x the work IF CPU cycles are your limit, which is not the case for IO. However, if you have 8 whole machines, you may well get 8x the network throughput and so 8x the performance for data loading. Even on one machine with one network interface, IO is usually followed by processing, so parallelism is still critical.
“Concurrency” (cooperative multitasking; in our stack, usually asyncio) is when you have tasks that spend most of their time waiting on external events: in this case, waiting for the server to send bytes. You can have a huge number of tasks waiting at once. Since requesting data from a remote store has high latency (time to first byte), you can issue all the requests together and pay that latency cost only once, however many requests are in flight. This is critical for many small reads, where for each request the data-transfer time is a small fraction of the total wall time.
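Concurrency, by contrast, can be sketched with asyncio alone. Here `asyncio.sleep` stands in for network latency; a real loader would await an HTTP or object-store client instead, and the names are illustrative:

```python
# Concurrency sketch: 100 "requests" in flight, paying latency once.
import asyncio
import time

LATENCY = 0.1  # pretend time-to-first-byte, in seconds

async def fetch(i):
    await asyncio.sleep(LATENCY)  # waiting on the "server", not using CPU
    return f"chunk-{i}"

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(*(fetch(i) for i in range(100)))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
# Sequentially this would take ~100 * LATENCY = 10 s;
# concurrently, the wall time stays close to a single LATENCY.
print(f"{len(results)} fetches in {elapsed:.2f}s")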
So kerchunk/zarr under dask gives us independent, parallel tasks (perhaps distributed across a cluster), each loading many chunks concurrently.