Hey folks, I was excited to see the new Element84 Sentinel-2 STAC catalogue which now includes L1C data: https://www.element84.com/earth-search/. I was following this guide on loading data from the previous (v0) catalogue which works great, but when I upgrade this to try and work with the new v1 catalogue I’m getting Access Denied issues on compute. I’ve tried various AWS configs (configure_rio) but without joy so wondering if anyone else has taken a look yet!?
This is likely because all of the asset URLs are incorrect for the L1C Items – the bucket was accidentally set to the sentinel-s2-l2a bucket instead of the sentinel-s2-l1c bucket. For example, this asset href on the first result I got: “s3://sentinel-s2-l2a/tiles/33/X/WB/2023/6/8/0/B02.jp2” should be “s3://sentinel-s2-l1c/tiles/33/X/WB/2023/6/8/0/B02.jp2”. The workaround is to rewrite the URL to fix the bucket name if you can. We’re working on fixing this, but it’s likely several weeks out at least.
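The URL rewrite described above could look something like this. It is a minimal sketch that only touches hrefs starting with the wrong bucket prefix; the function name `fix_l1c_href` is hypothetical and not part of any library.

```python
# Bucket prefixes taken from the example hrefs quoted in this thread.
WRONG_PREFIX = "s3://sentinel-s2-l2a/"
RIGHT_PREFIX = "s3://sentinel-s2-l1c/"


def fix_l1c_href(href: str) -> str:
    """Point an L1C asset href at the L1C bucket (hypothetical helper).

    Only the leading bucket prefix is rewritten, so an "l2a" that appears
    elsewhere in the key is left alone. Other hrefs are returned unchanged.
    """
    if href.startswith(WRONG_PREFIX):
        return RIGHT_PREFIX + href[len(WRONG_PREFIX):]
    return href


print(fix_l1c_href("s3://sentinel-s2-l2a/tiles/33/X/WB/2023/6/8/0/B02.jp2"))
```

Matching on the full `s3://bucket/` prefix rather than the bare string "l2a" keeps the rewrite from altering anything other than the bucket name.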
Here’s an issue that is likely part of the OP’s problem: Collection sentinel-2-l2c items have asset hrefs that reference sentinel-s2-l2a bucket · Issue #3 · Element84/earth-search · GitHub
(note: @philvarner pointed me to this issue, so I just thought I’d link it in to this conversation as well)
Ah gotcha, yep that looks to be the likely culprit! Thanks for that. Interesting that the earthsearch:s3_path is still listed as l1c though?
[EDIT] I’m not in much of a rush on this so I could just wait for the change and maybe play with L2A instead for now, but I did try and access with a simple replace string approach but still getting access denied:
import rasterio
import rasterio.plot

href = S2_items[0]["assets"]["red"]["href"].replace("l2a", "l1c")
with rasterio.open(href) as dataset:
    rasterio.plot.show(dataset)
Hi @akpetty! Here’s some code that might help. Some of my team members were working on L1C data recently (you should know Lilly) and were facing the same issue, so you’re in luck!
First, import some libraries:
import os
import pystac_client
import rioxarray
import stackstac
Next, set up the STAC query and patch the href URLs of the STAC assets. This is an example using stackstac:
client = pystac_client.Client.open(url="https://earth-search.aws.element84.com/v1/")
search = client.search(
    collections="sentinel-2-l1c",
    bbox=[-20.7, 64.5, -19.5, 64.8],  # xmin, ymin, xmax, ymax
    datetime="2023-02-01/2023-02-28",
)
stac_items = search.items()
stac_item = next(stac_items)  # <Item id=S2B_27WWM_20230228_0_L1C>
for stac_asset in stac_item.assets.values():
    stac_asset.href = stac_asset.href.replace(
        "s3://sentinel-s2-l2a/", "s3://sentinel-s2-l1c/"
    )
At this point, you should be able to read the metadata (even without authentication).
dataarray = stackstac.stack(items=stac_item, dtype="float16", resolution=10)
print(dataarray)
produces
<xarray.DataArray 'stackstac-40b6166a40b446f90241b2283a6022d7' (time: 1,
                                                                band: 14,
                                                                y: 10980,
                                                                x: 10980)>
dask.array<fetch_raster_window, shape=(1, 14, 10980, 10980), dtype=float16, chunksize=(1, 1, 1024, 1024), chunktype=numpy.ndarray>
Coordinates: (12/39)
  * time                 (time) datetime64[ns] 2023-02-28T13:03:...
    id                   (time) <U24 'S2B_27WWM_20230228_0_L1C'
  * band                 (band) <U8 'blue' 'cirrus' ... 'visual'
  * x                    (x) float64 5e+05 5e+05 ... 6.098e+05
  * y                    (y) float64 7.2e+06 7.2e+06 ... 7.09e+06
    processing:software  object {'sentinel2-to-stac': '0.1.0'}
    ...                   ...
    raster:bands         (band) object [{'nodata': 0, 'data_type...
    gsd                  (band) object 10 60 60 10 ... 20 20 None
    common_name          (band) object 'blue' 'cirrus' ... None
    center_wavelength    (band) object 0.49 1.3735 ... 2.19 None
    full_width_half_max  (band) object 0.098 0.075 ... 0.242 None
    epsg                 int64 32627
Attributes:
    spec:        RasterSpec(epsg=32627, bounds=(499980, 7090200, 609780, 7200...
    crs:         epsg:32627
    transform:   | 10.00, 0.00, 499980.00|\n| 0.00,-10.00, 7200000.00|\n| 0.0...
    resolution:  10
Now this is where it gets tricky: you’ll need to set up AWS requester pays somehow. There are probably a couple of ways, but the way I have it set up is to edit the ~/.aws/credentials file so that it has three lines like this:
[default]
aws_access_key_id = ABCDEFGHIJKLMNOPQRST
aws_secret_access_key = MnOpQrStUvWxYz1a2B3c4D5e6f7G8h9IjKlMnOpQ
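Before moving on, it can be worth checking that the credentials file really contains both keys for the profile you intend to use. Here is a rough sketch using only the standard library's `configparser`; the function name `missing_credential_keys` is made up for illustration, and it only checks that the keys are present, not that AWS accepts them.

```python
import configparser

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_credential_keys(credentials_text: str, profile: str = "default") -> list[str]:
    """Return the required keys that are absent or empty for a profile.

    Hypothetical helper: it parses the INI-style text of a credentials
    file and does not contact AWS.
    """
    # Disable interpolation so a "%" in a value is never treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [
        key
        for key in REQUIRED_KEYS
        if not parser.get(profile, key, fallback="").strip()
    ]


# Example with a real file (uncomment to try it):
# from pathlib import Path
# text = (Path.home() / ".aws" / "credentials").read_text()
# print(missing_credential_keys(text, profile="default"))
```

An empty list means both keys were found for that profile.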
Now you can set some environment variables and plot the Sentinel L1C data:
os.environ["AWS_REQUEST_PAYER"] = "requester"
os.environ["AWS_PROFILE"] = "default"
da_rgb = dataarray.sel(band=["red", "green", "blue"]).squeeze()[:, :100, :100] # get subset
da_rgb.astype("int").plot.imshow(rgb="band", robust=True)
produces an RGB plot of the 100 × 100 pixel subset.
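If you would rather not leave AWS_REQUEST_PAYER set for the whole session (every read from a requester pays bucket is billed to your account), the two environment variables can be scoped to a block instead. This is a sketch with a hypothetical `temporary_env` context manager built on the standard library. It assumes the reads happen in the same process: because the stackstac array is lazy, the compute or plot call has to sit inside the `with` block, and separate dask worker processes would not see the change.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_env(**overrides: str):
    """Set environment variables for the duration of a with-block.

    Hypothetical helper: previous values are restored on exit, and
    variables that were unset beforehand are removed again.
    """
    previous = {name: os.environ.get(name) for name in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        for name, old_value in previous.items():
            if old_value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = old_value


# Usage with the names from the example above (not run here):
# with temporary_env(AWS_REQUEST_PAYER="requester", AWS_PROFILE="default"):
#     da_rgb.astype("int").plot.imshow(rgb="band", robust=True)
```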
Notes:
- I had this running on the CryoCloud Hub at AWS us-west-2, might need to pip install stackstac first if running there too.
- According to https://element84.com/blog/introducing-earth-search-v1-new-datasets-now-available, the sentinel-s2-l1c collection is now named sentinel-2-l1c, but I couldn’t get the new one to work for some reason, getting RuntimeError: Error opening 's3://sentinel-2-l1c/tiles/27/W/WM/2023/2/28/0/B04.jp2': RasterioIOError("'/vsis3/sentinel-2-l1c/tiles/27/W/WM/2023/2/28/0/B04.jp2' does not exist in the file system, and is not recognized as a supported dataset name."). Might be some delay in the renaming?
Thanks @weiji14, I had a feeling I should have reached out to you directly! I copied exactly your example and am also running on CryoCloud but unfortunately I am still getting Access Denied issues, so maybe there’s something else going on my end. It seems like my credentials are being used now at the very least (when I tried a fake ID I got a different error!) so I’ll keep exploring…
Yeah, we should catch up sometime. Anyway, on the Access Denied part, I think the CryoCloud Hub has set up requester pays for the USGS Landsat S3 bucket (see Requester pays fix needed · Issue #52 · CryoInTheCloud/hub-image · GitHub), but you may need a different access key (i.e. your personal or institutional one) for this particular Element84 bucket. The documentation around this isn’t particularly good, but I found Downloading objects in Requester Pays buckets - Amazon Simple Storage Service, which might be helpful if you want to test things out on the CLI first to make sure that your credentials work.
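For anyone who prefers to test credentials from Python rather than the CLI, a similar check can be sketched with boto3, whose `head_object` call accepts a `RequestPayer` argument. The helper names below are hypothetical, the example key is the corrected href quoted earlier in the thread, and this assumes boto3 is installed and the profile exists. Note that a successful request is billed to your account.

```python
from urllib.parse import urlparse


def requester_pays_request(s3_url: str) -> dict:
    """Split an s3:// URL into head_object arguments (hypothetical helper)."""
    parsed = urlparse(s3_url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {s3_url!r}")
    return {
        "Bucket": parsed.netloc,
        "Key": parsed.path.lstrip("/"),
        "RequestPayer": "requester",  # acknowledge that the requester pays
    }


def can_read(s3_url: str, profile_name: str = "default") -> bool:
    """Return True if the profile can fetch the object's metadata."""
    import boto3  # third-party; imported lazily so the helper above needs no extras
    from botocore.exceptions import ClientError

    s3 = boto3.Session(profile_name=profile_name).client("s3")
    try:
        s3.head_object(**requester_pays_request(s3_url))
    except ClientError as err:
        print(f"Request failed: {err}")
        return False
    return True


# Example (makes a real, billable request, so it is left commented out):
# can_read("s3://sentinel-s2-l1c/tiles/33/X/WB/2023/6/8/0/B02.jp2")
```

If this fails with one access key but succeeds with another, the problem is more likely the account's permissions than the STAC or rasterio setup.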
OK finally got it working hurrah, had to create and use a new personal access key instead of the access key created on my institutional account, so maybe our NASA admins have blocked requester_pay buckets (or I messed up something, I’ve pinged them an email). Anyway thanks so much for your help @weiji14, couldn’t have got there without that input. Excited to show you what I’m working on once I’ve polished it up.
Definitely agree people have created some amazing resources out there to explain the stac side of all this, but that final AWS link is pretty unclear.