Help with concurrent.futures

Can anyone provide a simple example (using map() or submit(), as here, with ThreadPoolExecutor) that actually processes data from objects and delivers a speedup when max_workers is set > 1?

I’m struggling to find a real example where the benefit is there. (The example in the documentation, with max_workers = 5, does indeed run faster: <1 s vs. ~3 s.) But I can’t find a case where this is actually beneficial with array data. I only need something really basic; I’m hoping to flush out something naive I’m doing. Thanks!

(I’ve searched here in this forum, the only instance of ‘ThreadPoolExecutor’ I can find points to a repo that does not actually use concurrent.futures (any longer).)

Thread-based parallelism in Python only helps if the tasks you are running release the GIL (global interpreter lock). This is not the case for most pure-Python code, but it is true for many operations in NumPy and other optimized libraries.
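Here is a minimal sketch of the kind of array workload where threads can help. The function names (`reduce_chunk`, `run`) and the array sizes are made up for illustration. It relies on NumPy releasing the GIL inside its compiled ufunc and reduction loops on large arrays. The actual speedup depends on your core count and memory bandwidth, so treat the timings as something to measure rather than a guarantee.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def reduce_chunk(chunk):
    # Element-wise ufuncs and reductions on large arrays run in compiled
    # loops; NumPy releases the GIL while those loops execute.
    return float(np.sqrt(np.sin(chunk) ** 2 + np.cos(chunk) ** 2).sum())


def run(chunks, max_workers):
    # map() returns results in the same order as the input chunks.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(reduce_chunk, chunks))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Illustrative sizes only: 8 chunks of 2 million float64 values each.
    chunks = [rng.random(2_000_000) for _ in range(8)]
    for n in (1, 4):
        start = time.perf_counter()
        run(chunks, n)
        print(f"max_workers={n}: {time.perf_counter() - start:.2f} s")
```

If `reduce_chunk` were a pure-Python loop over the elements instead, the threads would take turns holding the GIL and `max_workers=4` would not be faster than `max_workers=1`.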

If you can provide a more verbose and reproducible example of what you’re doing, you will probably receive better advice from this forum.

Perhaps the reason that ThreadPoolExecutor is not very popular here is that Zarr v3 uses async S3Fs under the hood, meaning that it is still a single-threaded operation (as in logical threads, not hardware CPU “hyper” threads).

Multithreaded operations work well with I/O tasks, such as downloading from S3, but I’m guessing the init penalty makes it a bad idea for Zarr. In one of my cases, ~10 threads for concurrent downloading from S3 works well.
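A rough sketch of the I/O-bound case, with `time.sleep` standing in for a blocking download. `fetch`, the `chunk/…` keys, and the 50 ms delay are hypothetical placeholders, not a real S3 call. Blocking waits like this release the GIL, which is why threads overlap well here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(key, delay=0.05):
    # Hypothetical stand-in for a blocking download. time.sleep releases
    # the GIL while waiting, as a blocking socket read would.
    time.sleep(delay)
    return key, len(key)


def fetch_all(keys, max_workers=10):
    # Up to max_workers "downloads" wait at the same time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, keys))


if __name__ == "__main__":
    keys = [f"chunk/{i}" for i in range(20)]  # made-up object keys
    for n in (1, 10):
        start = time.perf_counter()
        fetch_all(keys, max_workers=n)
        print(f"max_workers={n}: {time.perf_counter() - start:.2f} s")
```

With a real client you would replace the body of `fetch` with the blocking download call and keep the rest unchanged.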

Also, you mentioned “process data”… are you sure it’s ThreadPoolExecutor and not ProcessPoolExecutor you need? If your system is quad-core and your CPU utilization is stuck at ~25%, then it’s not using all the cores.
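For comparison, a minimal sketch of the ProcessPoolExecutor route for CPU-bound pure-Python work, which holds the GIL and so does not benefit from threads. `count_primes` and `count_in_processes` are illustrative names and the workload is arbitrary. The function passed to the pool must be defined at module level so it can be pickled, and the `if __name__ == "__main__":` guard is needed on platforms that start workers by spawning a fresh interpreter.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    # Deliberately naive pure-Python loop: it holds the GIL the whole time,
    # so threads would not speed it up.
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_in_processes(limits, max_workers=4):
    # Each worker process has its own interpreter and its own GIL.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    print(count_in_processes([50_000] * 4))
```

Arguments and results are pickled between processes, so for large arrays the copying cost can cancel out the gain.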

EDIT: strikethrough

Anywhere zarr-python 3.x uses asyncio.to_thread (anything working with codecs, for example) it’s moving work off the main thread, where the event loop is running, to a ThreadPoolExecutor attached to the asyncio event loop. Since this is typically using numcodecs, which typically releases the GIL, you’ll get the speedup.
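To illustrate the pattern (this is not zarr-python's actual code), here is a small sketch that uses `zlib.compress` from the standard library as a stand-in for a numcodecs codec. `encode` and `encode_all` are hypothetical names. `asyncio.to_thread` (Python 3.9+) runs the blocking call in the event loop's default thread pool, and zlib releases the GIL while compressing, so several chunks can be encoded at once.

```python
import asyncio
import zlib


def encode(chunk: bytes) -> bytes:
    # Blocking, CPU-bound call used here as a stand-in for a codec.
    return zlib.compress(chunk, level=6)


async def encode_all(chunks):
    # Each encode() call is handed to the loop's default thread pool,
    # leaving the event loop free; gather() preserves input order.
    return await asyncio.gather(*(asyncio.to_thread(encode, c) for c in chunks))


if __name__ == "__main__":
    chunks = [bytes([i]) * 1_000_000 for i in range(8)]  # illustrative data
    encoded = asyncio.run(encode_all(chunks))
    print([len(e) for e in encoded])
```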
