Status code: 404 on NOAA OISST data on Pangeo-Forge


I am trying to access NOAA’s OISST AVHRR data stored in the cloud. However, instead of the usual block of code that I can copy and paste into my notebook to load the dataset, I am seeing this error message:

An error occurred while fetching data from URL:
{"detail":"An error occurred while fetching the data from URL: Dataset not found."}

The dataset does not seem to live where it is supposed to; here is the link to the dataset on pangeo-forge’s catalog. I am having trouble tracking down the maintainer for this dataset, so I would appreciate anyone’s help in resolving this matter. Thanks!

Hi @stb2145, the dataset itself is fine (it is a kerchunk-ed dataset) and can be accessed via xarray using the following code snippet:

In [1]: import xarray as xr

In [2]: url = ""

In [3]: ds = xr.open_dataset("reference://", engine='zarr',
   ...:                      backend_kwargs={'consolidated': False,
   ...:                                      'storage_options': {'fo': url, 'remote_options': {'anon': True}, 'remote_protocol': 's3'}},
   ...:                      chunks={})

In [4]: ds
Out[4]:
<xarray.Dataset>
Dimensions:  (time: 15044, zlev: 1, lat: 720, lon: 1440)
Coordinates:
  * lat      (lat) float32 -89.88 -89.62 -89.38 -89.12 ... 89.38 89.62 89.88
  * lon      (lon) float32 0.125 0.375 0.625 0.875 ... 359.1 359.4 359.6 359.9
  * time     (time) datetime64[ns] 1981-09-01T12:00:00 ... 2022-11-08T12:00:00
  * zlev     (zlev) float32 0.0
Data variables:
    anom     (time, zlev, lat, lon) float32 dask.array<chunksize=(1, 1, 720, 1440), meta=np.ndarray>
    err      (time, zlev, lat, lon) float32 dask.array<chunksize=(1, 1, 720, 1440), meta=np.ndarray>
    ice      (time, zlev, lat, lon) float32 dask.array<chunksize=(1, 1, 720, 1440), meta=np.ndarray>
    sst      (time, zlev, lat, lon) float32 dask.array<chunksize=(1, 1, 720, 1440), meta=np.ndarray>
Attributes: (12/37)
    Conventions:                CF-1.6, ACDD-1.3
    cdm_data_type:              Grid
    comment:                    Data was converted from NetCDF-3 to NetCDF-4 ...
    date_created:               2020-05-08T19:05:13Z
    ...                         ...
    source:                     ICOADS, NCEP_GTS, GSFC_ICE, NCEP_ICE, Pathfin...
    standard_name_vocabulary:   CF Standard Name Table (v40, 25 January 2017)
    summary:                    NOAAs 1/4-degree Daily Optimum Interpolation ...
    time_coverage_end:          1981-09-01T23:59:59Z
    time_coverage_start:        1981-09-01T00:00:00Z
    title:                      NOAA/NCEI 1/4 Degree Daily Optimum Interpolat...

The issue on the catalog page has to do with some hardcoded assumptions in the codebase used to preview the dataset, and it is being tracked in this issue: Add kerchunk opener to `repr` route · Issue #200 · pangeo-forge/pangeo-forge-orchestrator (GitHub).
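
If you open kerchunk references like this often, the keyword arguments from the snippet above can be collected in a small helper. This is only a sketch: `kerchunk_open_kwargs` is a hypothetical name, not part of xarray or kerchunk, and its defaults (`s3`, anonymous access) simply mirror the options used above. The reference URL is left for you to supply.

```python
def kerchunk_open_kwargs(reference_url, remote_protocol="s3", anon=True):
    """Build keyword arguments for xr.open_dataset("reference://", ...).

    Hypothetical helper: it only assembles the same options used in the
    snippet above and does not touch the network.
    """
    if not reference_url:
        raise ValueError("reference_url must point at the kerchunk reference JSON")

    storage_options = {
        "fo": reference_url,                # location of the reference JSON
        "remote_protocol": remote_protocol, # where the underlying data files live
        "remote_options": {"anon": anon},   # anonymous access to the remote store
    }
    return {
        "engine": "zarr",
        "backend_kwargs": {
            "consolidated": False,
            "storage_options": storage_options,
        },
        "chunks": {},  # open lazily as dask arrays
    }


# Usage (requires xarray, fsspec, zarr and s3fs; `url` is your reference JSON):
#   import xarray as xr
#   ds = xr.open_dataset("reference://", **kerchunk_open_kwargs(url))
```

Keeping the options in one place makes it harder to drop a piece of the configuration, such as the path to the reference JSON, when copying the call between notebooks.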


Great, thank you so much for the quick help, Anderson! I guess I was just missing the /reference.json part…