Previously, I’ve used the element84 S2 catalog thus:
from pystac_client import Client
catalog = Client.open("https://earth-search.aws.element84.com/v0")
I’m now wanting Sentinel-5P data managed by MEEO. The stated STAC endpoint is STAC Browser. So I tried
Client.open('https://meeo-s5p.s3.amazonaws.com/index.html?t=catalogs')
This produces a JSONDecodeError: Expecting value: line 1 column 1 (char 0).
I’ve tried downloading one of the catalog.json files (with requests.get) and using the Client.from_dict() method, but that seems to fail because there’s no ‘type’ key. So I manually frig the dict from the catalog.json file with
catjson['type'] = 'Catalog'
catalog = Client.from_dict(catjson)
which gets over that hurdle, but when I then try
[c for c in catalog.get_all_collections()]
I get AttributeError: 'NoneType' object has no attribute 'conforms_to'
So either I’m misunderstanding how to “get into” the STAC catalog for Sentinel-5P here, or else the STAC catalog is broken. This community is my first friendly port of call, but if someone could kindly sanity check me or tell me where I’ve gone wrong, I’ll happily approach the right people if the issue is with MEEO or pystac or elsewhere.
I know that Client.open() can work from json files, thus
tmp = Client.open('https://eoepca.github.io/open-science-catalog-metadata/catalog.json')
[c for c in tmp.get_all_collections()]
happily beavers away and prints a list of collections.
I think that you’re running into the difference between a STAC catalog and a STAC API. I believe that STAC Browser is just a static catalog.
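As a rough way to tell the two apart before reaching for pystac-client, one heuristic is to look for a conformsTo list in the root document, since a STAC API landing page advertises its conformance classes there and a plain static catalog normally does not. The sketch below is only that heuristic; the function name and both example dicts are made up for illustration and are not taken from the MEEO catalog.

```python
def looks_like_stac_api(root: dict) -> bool:
    """Heuristic: treat a root document as a STAC API landing page
    if it advertises a non-empty list of conformance classes."""
    conforms_to = root.get("conformsTo")
    return isinstance(conforms_to, list) and len(conforms_to) > 0


# Hypothetical root documents, for illustration only.
static_root = {
    "type": "Catalog",
    "stac_version": "1.0.0",
    "id": "example-static",
    "description": "A static catalog",
    "links": [],
}
api_root = dict(static_root, conformsTo=["https://api.stacspec.org/v1.0.0/core"])

print(looks_like_stac_api(static_root))
print(looks_like_stac_api(api_root))
```

If the check comes back false, plain pystac is probably the better tool for walking the catalog, rather than pystac-client.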
I expected the following to work, using pystac (not pystac-client), but it raises an error:
In [1]: import pystac
In [2]: pystac.read_file("https://meeo-s5p.s3.amazonaws.com/catalog.json")
---------------------------------------------------------------------------
STACTypeError Traceback (most recent call last)
Cell In[2], line 1
----> 1 pystac.read_file("https://meeo-s5p.s3.amazonaws.com/catalog.json")
File ~/src/pc/pc-onboarding-pipelines/.direnv/python-3.10.6/lib/python3.10/site-packages/pystac/__init__.py:145, in read_file(href, stac_io)
143 if stac_io is None:
144 stac_io = StacIO.default()
--> 145 return stac_io.read_stac_object(href)
File ~/src/pc/pc-onboarding-pipelines/.direnv/python-3.10.6/lib/python3.10/site-packages/pystac/stac_io.py:228, in StacIO.read_stac_object(self, source, root, *args, **kwargs)
208 """Read a STACObject from a JSON file at the given source.
209
210 See :func:`StacIO.read_text <pystac.StacIO.read_text>` for usage of
(...)
225 contained in the file at the given uri.
226 """
227 d = self.read_json(source, *args, **kwargs)
--> 228 return self.stac_object_from_dict(
229 d, href=source, root=root, preserve_dict=False
230 )
File ~/src/pc/pc-onboarding-pipelines/.direnv/python-3.10.6/lib/python3.10/site-packages/pystac/stac_io.py:159, in StacIO.stac_object_from_dict(self, d, href, root, preserve_dict)
154 # Merge common properties in case this is an older STAC object.
155 merge_common_properties(
156 d, json_href=href_str, collection_cache=collection_cache
157 )
--> 159 info = identify_stac_object(d)
160 d = migrate_to_latest(d, info)
162 if info.object_type == pystac.STACObjectType.CATALOG:
File ~/src/pc/pc-onboarding-pipelines/.direnv/python-3.10.6/lib/python3.10/site-packages/pystac/serialization/identify.py:252, in identify_stac_object(json_dict)
249 object_type = identify_stac_object_type(json_dict)
251 if object_type is None:
--> 252 raise pystac.STACTypeError("JSON does not represent a STAC object.")
254 version_range = STACVersionRange()
256 stac_version = json_dict.get("stac_version")
STACTypeError: JSON does not represent a STAC object.
It might be worth checking with the maintainers of that project to see whether their STAC catalogs are valid. Glancing at the JSON, I notice it’s missing a type field, which is required according to the Catalog spec (catalog-spec.md in the radiantearth/stac-spec repository on GitHub).
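For a quick check before contacting the maintainers, something like the following could list which required fields a catalog dict lacks. It assumes the STAC 1.0.0 Catalog spec, which requires type, stac_version, id, description and links. The helper name and the sample dict are hypothetical, and the sample is not a copy of the MEEO catalog.json.

```python
# Fields the STAC 1.0.0 Catalog spec marks as required.
REQUIRED_CATALOG_FIELDS = ("type", "stac_version", "id", "description", "links")


def missing_catalog_fields(catalog: dict) -> list:
    """Return the required Catalog fields that are absent from the dict."""
    return [field for field in REQUIRED_CATALOG_FIELDS if field not in catalog]


# Hypothetical catalog dict with no "type" key, for illustration only.
sample = {
    "stac_version": "0.9.0",
    "id": "example",
    "description": "Example catalog",
    "links": [],
}

print(missing_catalog_fields(sample))
```

This only tests for the presence of keys; it does not validate their values or the linked child catalogs.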
Thanks Tom. I was wondering if there was a distinction between API and static. I’d noticed the missing type and added it to a dict, which fixed that problem but moved on to another. I’ll reach out to them.
Update on this. It took a few days to get an initial reply, but I heard back from someone at MEEO, who was then quite responsive. If you’re interested in using S3 or S5P on AWS, they are updating their STAC catalog. It looks like it will be through their adamplatform domain. There’s currently a dev instance. They say the prod instance will likely have a different URL (or URLs), not yet defined. The dev instance API is due to be exposed over the coming weeks. So it seems MEEO are still very much in the game for S3 and S5P and we’ll have happy days soon on AWS with COGs. Yay.