I’m working through the Pangeo SOSE example using the Jupyter notebook on GitHub, which I’ve copied into the Pangeo cloud environment.
Everything runs smoothly until I try to load the data in the ‘Validate Budget’ section, on the following line:
Each time I try this, the dask dashboard shows the work gradually slowing and then stopping. I’ve left it for over 10 minutes without it restarting, and I’ve also tried different server sizes.
Things are still changing in the Worker tab, and when I check the worker logs there are errors, including:
ERROR - Decompression failed: corrupt input or insufficient space in destination buffer. Error code: 12
ERROR - Invalid size: 0x3318063313
ERROR - failed during get data with tls://10.8.21.3:40055 → tls://10.8.23.4:40111
I’m no data scientist, so I’d be very grateful if anyone has some insight into what’s happening or how I could debug it! I also appreciate that this code hasn’t been updated for a while, so I’m not necessarily expecting to be able to fix the problem here. However, I’m running into the same issue when working on my own project with the SOSE data, so I think something more general is going on.
Any help would be much appreciated, and apologies if I’ve missed anything obvious or posted this in the wrong place; I’m still new to all of this!
I also moved to the new link @rabernat provided above. As you said, it is much faster for processing, but the whole process still failed to finish.
If I close the cluster, I can run this load() code in about 2 minutes. That suggests the data being loaded are not very large and are easy to process in a simple way. So I have this question: under what conditions should we use dask?
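Roughly what I mean, as a simplified sketch (the store URL and variable name here are just placeholders, not the notebook’s actual ones):

```python
import xarray as xr

# Placeholder store/variable; the real notebook opens the SOSE data differently.
ds = xr.open_zarr("gs://example-bucket/sose", consolidated=True)
subset = ds["THETA"].isel(time=slice(0, 12))

# Without a distributed cluster: dask's default local (threaded) scheduler
# runs inside the notebook process, and this finishes in a couple of minutes.
subset.load()

# With a distributed cluster attached, the same call is instead shipped to
# the remote workers, and that is where it fails to finish for me:
# from dask.distributed import Client
# client = Client(cluster)
# subset.load()
```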
I also have another question: where does the dask memory come from? For example, the maximum memory for a Pangeo user is about 60 GB, but I can give dask more than 60 GB (such as 40 × 2 = 80 GB) and it still works.
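To make the 40 × 2 = 80 GB arithmetic concrete, this is a sketch of the kind of setup I mean (dask-gateway as deployed on Pangeo; the `worker_memory` option name, and which factor I treat as worker count versus per-worker memory, are just how I’ve written the example):

```python
from dask_gateway import Gateway

gateway = Gateway()
options = gateway.cluster_options()
options.worker_memory = 40   # GiB requested per worker (option name on the Pangeo deployment)
cluster = gateway.new_cluster(options)
cluster.scale(2)             # 2 workers x 40 GiB ~= 80 GiB total across the cluster
client = cluster.get_client()
```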
I was using a higher worker memory (8) and cluster.adapt(). I got a bit further through the notebook, but hit errors again when computing the histograms later on.
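For reference, this is roughly how I set that up (a sketch; the `worker_memory` option name is the one exposed on the Pangeo deployment, and the adapt bounds are just example values):

```python
from dask_gateway import Gateway

gateway = Gateway()
options = gateway.cluster_options()
options.worker_memory = 8              # higher memory per worker than the default
cluster = gateway.new_cluster(options)
cluster.adapt(minimum=1, maximum=10)   # example bounds; worker count scales with the workload
```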
It’s interesting what you say about just loading the data without using the dask cluster; I tried this too and I’m finding it faster and more reliable.
Yes. The histograms fail too when not using dask, perhaps due to the large memory cost.
I just commented out the histograms section and the code below it runs fine.