Hi all,
I need someone’s help to understand what is going wrong in my workflow!
I am trying to read Sentinel-2 data through the CDSE STAC API with the stackstac Python library, and to apply a workflow for snow classification as done here
S3_ENDPOINT = "eodata.dataspace.copernicus.eu"
ACCESS_KEY = ""
SECRET_KEY = ""

import os
os.environ["AWS_S3_ENDPOINT"] = S3_ENDPOINT
os.environ["AWS_ACCESS_KEY_ID"] = ACCESS_KEY
os.environ["AWS_SECRET_ACCESS_KEY"] = SECRET_KEY

from shapely.geometry import shape
from shapely.geometry.polygon import Polygon

geometry = {'type': 'Polygon',
            'coordinates': [[[-56.055536, -12.63809],
                             [-56.055536, -12.523493],
                             [-55.88178, -12.523493],
                             [-55.88178, -12.63809],
                             [-56.055536, -12.63809]]]}

bounds = shape(geometry).bounds

import pystac_client
CDSE_URL = "https://stac.dataspace.copernicus.eu/v1"
cat = pystac_client.Client.open(CDSE_URL)
cat.add_conforms_to("ITEM_SEARCH")

start_dt = "2025-07-01"
end_dt = "2025-07-30"

from shapely import to_geojson
import json

params = {
    "collections": ["sentinel-2-l1c"],
    "intersects": geometry,
    "datetime": f"{start_dt}T00:00:00Z/{end_dt}T23:59:59Z"
}

items = list(cat.search(**params).items_as_dicts())
print(f"Number of STAC items returned: {len(items)}")

import rioxarray
import stackstac

stack = stackstac.stack(
    items=items,
    resolution=(0.00025, 0.00025),
    bounds_latlon=bounds,
    epsg=4326,
    gdal_env=stackstac.DEFAULT_GDAL_ENV.updated(
        {
            "GDAL_NUM_THREADS": -1,
            "GDAL_HTTP_UNSAFESSL": "YES",
            "GDAL_HTTP_TCP_KEEPALIVE": "YES",
            "AWS_VIRTUAL_HOSTING": "FALSE",
            "AWS_HTTPS": "YES",
        }
    ),
)

stack.load()
The code works well, but when I apply it to a long time series (the idea is to run it for the whole Sentinel-2 era), I start getting errors like the one below when loading the lazy dataset into memory. A random example:
RuntimeError: Error reading Window(col_off=1024, row_off=1024, width=226, height=676) from 's3://eodata/Sentinel-2/MSI/L1C_N0500/2023/01/03/S2B_MSIL1C_20230103T143729_N0510_R096_T19HCB_20240811T114613.SAFE/GRANULE/L1C_T19HCB_A030438_20230103T144503/IMG_DATA/T19HCB_20230103T143729_B08.jp2': RasterioIOError('Read failed. See previous exception for details.')
See also the issue I opened on the CDSE forum: https://forum.dataspace.copernicus.eu/t/stac-api-rasterioioerror/4884
The error appears on random dates. I have added a loop to retry the data loading; after a few retries it sometimes works and sometimes does not, and I have the impression that the more dates I process, the more frequent the error becomes. So I suspect an issue linked to some access limitation.
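For context, a retry loop of the kind described could be sketched roughly as below. This is only an illustration, not the exact code that was run: the helper name `load_with_retry`, the number of attempts, and the exponential backoff delays are all assumptions. It catches `RuntimeError` because that is the exception type shown in the traceback above.

```python
import time


def load_with_retry(load_fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call load_fn() until it succeeds or max_attempts is exhausted.

    load_with_retry is a hypothetical helper; the attempt count and the
    delays are illustrative assumptions, not tuned values.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return load_fn()
        except RuntimeError as err:  # exception type seen in the traceback
            last_error = err
            if attempt == max_attempts:
                break
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({err}); retrying in {delay} s")
            sleep(delay)
    raise RuntimeError(
        f"Loading failed after {max_attempts} attempts"
    ) from last_error


# Possible usage with the lazy array built above (assumes `stack` exists):
# data = load_with_retry(lambda: stack.compute())
```

Waiting longer between attempts gives the remote storage some time to recover if the failures really are caused by throttling, but that cause is only a suspicion here.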
If someone could give me some hints, I would really appreciate it!
Thanks in advance
Valentina