Pystac_client cannot load STAC catalog

Previously, I’ve used the element84 S2 catalog thus:

from pystac_client import Client
catalog = Client.open("https://earth-search.aws.element84.com/v0")

I now want the Sentinel-5P data managed by MEEO. The stated STAC endpoint is a STAC Browser page, so I tried

Client.open('https://meeo-s5p.s3.amazonaws.com/index.html?t=catalogs')

This produces a JSONDecodeError: Expecting value: line 1 column 1 (char 0).
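
That message is what Python's json module raises when the body it is given does not start with a JSON value, which is what you would see if the URL returns an HTML page rather than a catalog document. A minimal standard-library sketch of that behaviour follows; the helper name describe_body and both sample strings are made up for illustration and are not the actual MEEO responses.

```python
import json


def describe_body(text):
    """Return 'json' if text parses as a JSON object, else a short reason."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as err:
        # HTML starts with '<', so parsing fails at the very first character.
        return f"not JSON: {err}"
    if not isinstance(parsed, dict):
        return "JSON, but not an object"
    return "json"


# Made-up bodies: an HTML page (such as a browser front end) and a tiny JSON document.
html_body = "<!DOCTYPE html><html><body>STAC Browser</body></html>"
json_body = '{"id": "example", "links": []}'

print(describe_body(html_body))
print(describe_body(json_body))
```

Running a check like this on the response text before passing a URL to Client.open() shows quickly whether the endpoint serves JSON at all.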

I’ve tried downloading one of the catalog.json files (with requests.get) and passing the parsed dict to Client.from_dict(), but that fails because there’s no ‘type’ key. So I manually patched the dict from the catalog.json file with

catjson['type'] = 'Catalog'
catalog = Client.from_dict(catjson)

which gets over that hurdle, but when I then try

[c for c in catalog.get_all_collections()]

I get AttributeError: 'NoneType' object has no attribute 'conforms_to'

So either I’m misunderstanding how to “get into” the STAC catalog for Sentinel-5P here, or else the STAC catalog is broken. This community is my first friendly port of call, but if someone could kindly sanity check me or tell me where I’ve gone wrong, I’ll happily approach the right people if the issue is with MEEO or pystac or elsewhere.

I know that Client.open() can work from json files, thus

tmp = Client.open('https://eoepca.github.io/open-science-catalog-metadata/catalog.json')
[c for c in tmp.get_all_collections()]

happily beavers away and prints a list of collections.

I think that you’re running into the difference between a STAC catalog and a STAC API. I believe that STAC Browser is just a static catalog.
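
As a rough way to tell the two apart once you have the root JSON document in hand: a STAC API landing page advertises a "conformsTo" list and usually a link with rel "search", whereas a static catalog only links to its children and items. The sketch below is a heuristic under that assumption; the function name looks_like_stac_api and both example dicts are hypothetical.

```python
def looks_like_stac_api(root):
    """Heuristic: does this root document look like a STAC API landing page?"""
    # API landing pages list the conformance classes they implement.
    has_conformance = bool(root.get("conformsTo"))
    # APIs that support item search also expose a link with rel="search".
    link_rels = {link.get("rel") for link in root.get("links", [])}
    return has_conformance or "search" in link_rels


# Made-up root documents for illustration only.
static_root = {
    "type": "Catalog",
    "id": "static-example",
    "links": [{"rel": "child", "href": "./collection.json"}],
}
api_root = {
    "type": "Catalog",
    "id": "api-example",
    "conformsTo": ["https://api.stacspec.org/v1.0.0/core"],
    "links": [{"rel": "search", "href": "https://example.com/search"}],
}

print(looks_like_stac_api(static_root))
print(looks_like_stac_api(api_root))
```

For a document that looks static, plain pystac is usually enough to walk the links; pystac-client adds value mainly when there is a search endpoint to talk to.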

I expected the following to work, using pystac (not pystac-client), but it raises an error:

In [1]: import pystac

In [2]: pystac.read_file("https://meeo-s5p.s3.amazonaws.com/catalog.json")
---------------------------------------------------------------------------
STACTypeError                             Traceback (most recent call last)
Cell In[2], line 1
----> 1 pystac.read_file("https://meeo-s5p.s3.amazonaws.com/catalog.json")

File ~/src/pc/pc-onboarding-pipelines/.direnv/python-3.10.6/lib/python3.10/site-packages/pystac/__init__.py:145, in read_file(href, stac_io)
    143 if stac_io is None:
    144     stac_io = StacIO.default()
--> 145 return stac_io.read_stac_object(href)

File ~/src/pc/pc-onboarding-pipelines/.direnv/python-3.10.6/lib/python3.10/site-packages/pystac/stac_io.py:228, in StacIO.read_stac_object(self, source, root, *args, **kwargs)
    208 """Read a STACObject from a JSON file at the given source.
    209
    210 See :func:`StacIO.read_text <pystac.StacIO.read_text>` for usage of
   (...)
    225     contained in the file at the given uri.
    226 """
    227 d = self.read_json(source, *args, **kwargs)
--> 228 return self.stac_object_from_dict(
    229     d, href=source, root=root, preserve_dict=False
    230 )

File ~/src/pc/pc-onboarding-pipelines/.direnv/python-3.10.6/lib/python3.10/site-packages/pystac/stac_io.py:159, in StacIO.stac_object_from_dict(self, d, href, root, preserve_dict)
    154     # Merge common properties in case this is an older STAC object.
    155     merge_common_properties(
    156         d, json_href=href_str, collection_cache=collection_cache
    157     )
--> 159 info = identify_stac_object(d)
    160 d = migrate_to_latest(d, info)
    162 if info.object_type == pystac.STACObjectType.CATALOG:

File ~/src/pc/pc-onboarding-pipelines/.direnv/python-3.10.6/lib/python3.10/site-packages/pystac/serialization/identify.py:252, in identify_stac_object(json_dict)
    249 object_type = identify_stac_object_type(json_dict)
    251 if object_type is None:
--> 252     raise pystac.STACTypeError("JSON does not represent a STAC object.")
    254 version_range = STACVersionRange()
    256 stac_version = json_dict.get("stac_version")

STACTypeError: JSON does not represent a STAC object.

It might be worth checking with the maintainers of that project to see whether their STAC catalogs are valid. Glancing at the JSON, I notice it’s missing a type field, which is required according to the Catalog spec (catalog-spec.md in the radiantearth/stac-spec repository on GitHub).
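
A quick way to see what else might be missing is to compare the downloaded dict against the fields the STAC 1.0 Catalog spec marks as required (type, stac_version, id, description, links). This is only a sketch: the helper name missing_catalog_fields is hypothetical, and the sample dict is made up rather than copied from the MEEO catalog.

```python
# Fields marked as required for a Catalog in the STAC 1.0 spec.
REQUIRED_CATALOG_FIELDS = ("type", "stac_version", "id", "description", "links")


def missing_catalog_fields(catalog_dict):
    """Return the required Catalog fields that are absent from the dict."""
    return [name for name in REQUIRED_CATALOG_FIELDS if name not in catalog_dict]


# Made-up catalog document that lacks the "type" field.
sample = {
    "stac_version": "1.0.0",
    "id": "example-catalog",
    "description": "Illustrative catalog, not a real one",
    "links": [],
}

print(missing_catalog_fields(sample))

# Patching the dict in memory, as in the original question, clears the check.
patched = {**sample, "type": "Catalog"}
print(missing_catalog_fields(patched))
```

A presence check like this does not replace full schema validation, but it gives the maintainers a concrete list to work from.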

Thanks Tom. I was wondering if there was a distinction between API and static. I’d noticed the missing type and added it to a dict, which fixed that problem but moved on to another. I’ll reach out to them.

Update on this. It took a few days to get an initial reply, but I heard back from someone at MEEO, who was then quite responsive. If you’re interested in using S3 or S5P on AWS, they are updating their STAC catalog. It looks like it will be served through their adamplatform domain. There’s currently a dev instance; they say the prod instance will likely have a different URL (or URLs), which is not yet defined. The dev instance API is due to be exposed over the coming weeks. So it seems MEEO are still very much in the game for S3 and S5P, and we’ll have happy days soon on AWS with COGs. Yay.
