I am trying to download (and regrid) CMIP6 climate data through Google Cloud.
I am roughly following the Pangeo CMIP6 tutorial.
One issue I have run into: when I download and save 2D ocean variables (e.g. sea surface temperature, “tos”, or sea surface salinity, “sos”), the download finishes very quickly (within a minute). However, if I instead download and save only the top level of a 3D ocean field such as “thetao” or “so”, the code hangs and never completes. A 2D ocean field for one CMIP6 historical ensemble member is around 980 MB, so a single level of a 3D field shouldn’t take very long on a decent internet connection. I am not sure why downloading from the 3D data is so much slower.
I tried chunking the vertical levels into chunks of size 1, but it didn’t help.
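One possible factor, offered as an assumption rather than a diagnosis: Zarr stores are read one whole chunk at a time, and rechunking on the dask side does not change how the data is chunked in the store. If the stored chunks of a 3D variable span many depth levels, selecting only the top level would still fetch every level. A rough back-of-the-envelope sketch in Python, with made-up grid and chunk sizes (the helper `bytes_fetched` and all numbers are hypothetical, not taken from any real CMIP6 store):

```python
import math


def bytes_fetched(disk_chunks, selection, itemsize=4):
    """Uncompressed bytes read if every touched chunk must be fetched whole.

    disk_chunks: on-disk chunk size per dimension, e.g. (time, lev, y, x)
    selection:   per-dimension (start, stop) index range being requested
    itemsize:    bytes per value (4 for float32)
    """
    n_chunks = 1
    for (start, stop), chunk in zip(selection, disk_chunks):
        first = start // chunk        # index of first chunk touched
        last = (stop - 1) // chunk    # index of last chunk touched
        n_chunks *= last - first + 1
    return n_chunks * itemsize * math.prod(disk_chunks)


# Hypothetical monthly 3D field: (time, lev, y, x)
shape = (1980, 50, 300, 360)

# Request all times and the full horizontal grid, but only the top level.
top_level = ((0, shape[0]), (0, 1), (0, shape[2]), (0, shape[3]))

# Two hypothetical on-disk layouts: chunks spanning all levels vs. one level.
chunks_all_levels = (6, 50, 300, 360)
chunks_one_level = (6, 1, 300, 360)

read_all = bytes_fetched(chunks_all_levels, top_level)
read_one = bytes_fetched(chunks_one_level, top_level)

print(f"chunks span all levels: {read_all / 1e9:.1f} GB fetched")
print(f"chunks span one level:  {read_one / 1e9:.1f} GB fetched")
print(f"ratio: {read_all // read_one}x")
```

Under these made-up numbers the top-level request reads as much data as the full 3D field whenever the stored chunks cover all levels. Whether that applies to “thetao” here depends on the actual chunk layout of the store, which this sketch does not check.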
Any advice would be much appreciated! Here is the code I am using, which works for “tos” but not for the top level of “thetao”: download_CMIP6_minimal_working.py (GitHub)