It looks like September slipped by us rather quickly, so I’d propose we set aside an hour or so the first week of October to sketch out a roadmap and some issues.
Anyone who wants to participate is welcome!
Please fill out this Doodle poll and I’ll schedule a meeting next week:
Based on the responses to the above Doodle poll, I set up a road mapping discussion for this Friday, 12:00 Eastern. Everyone’s welcome! Feel free to ping me for the Zoom info.
Clearly I missed the boat on this (I’m having trouble keeping up with so many communications channels - any tips?). Has the conversation moved elsewhere? I’d definitely like to stay plugged in to this conversation.
@JessicaS11 the brief summary from the call (notes here) is we developed these rough milestones for edlfs:
v0.1.0
provide an HTTP fsspec backend that handles EDL outside of us-west-2 so we don’t have to deal with the S3 signed redirect yet
v0.2.0
extend the HTTP fsspec backend to handle the S3 signed url redirect inside of us-west-2
Ensure HTTP backend works distributed (e.g., Dask)
v0.3.0
S3 endpoints for all (some?) DAACs to give temp access keys
v0.4.0
Dask plugin to handle re-authing S3 credentials in distributed workloads
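To make the v0.3.0/v0.4.0 idea a bit more concrete, here is a minimal sketch of the "re-auth before the temporary keys expire" logic, using only the standard library. Everything in it is hypothetical and not taken from edlfs: the `TempCredentials` and `RefreshingCredentials` names, the field names, and the five-minute safety margin are assumptions for illustration. The `fetch` callable stands in for whatever would actually request temporary keys from a DAAC endpoint.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


@dataclass
class TempCredentials:
    # Hypothetical container; the field names are assumptions, not a DAAC schema.
    access_key_id: str
    secret_access_key: str
    session_token: str
    expiration: datetime  # timezone-aware, UTC


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


class RefreshingCredentials:
    """Cache temporary credentials and refetch them shortly before they expire."""

    def __init__(
        self,
        fetch: Callable[[], TempCredentials],
        margin: timedelta = timedelta(minutes=5),  # assumed safety margin
        now: Callable[[], datetime] = _utc_now,    # injectable clock for testing
    ) -> None:
        self._fetch = fetch
        self._margin = margin
        self._now = now
        self._cached: Optional[TempCredentials] = None

    def get(self) -> TempCredentials:
        # Refetch if nothing is cached yet, or if we are inside the margin
        # before the cached credentials expire.
        if (
            self._cached is None
            or self._now() >= self._cached.expiration - self._margin
        ):
            self._cached = self._fetch()
        return self._cached
```

In a distributed setting, each worker could hold one such object and call `get()` before opening a file, so a long-running job would pick up fresh keys without restarting. How that would be wired into Dask is left open here.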
We decided to use the #nasa-edl channel on OpenScapes slack to have chat-type conversations about edlfs. Since I see you’re already there, I added you to that channel. (Anyone else reading, let me know if you want an invite!)
I’ll turn those into GitHub milestones, and we agreed to start sketching out issues for anyone to pick up and work on. We’re all busy, so it might move slowly initially, but it’d be nice to be around v0.3.0 by Feb.
I’m having trouble keeping up with so many communications channels - any tips?
I haven’t figured this out either! Might be worth its own post/thread here tbh.
Can I get an invite to the OpenScapes Slack too please? Been working with Jessica on some Earthdata authentication to NASA hosted data on AWS S3 and would like to know more about what’s happening on that front!
P.S. Thanks @sharkinsspatial for pointing to this during the Pangeo meeting
@betolink, @briannapagan, and @yuvipanda did a bunch of work tracking down various issues, and generally, if the DAACs upgrade to the latest versions of distribution software (apache2-urs-module, TEA, Cumulus), and aiohttp releases a fix, everything should work as expected in the fsspec world.
So, overall, it’s not clear to me if we still need an edlfs package, or if things have been/will be resolved upstream. I’ve been meaning to sit down with @betolink and decide if we should abstract out the auth handling in earthaccess into edlfs, or if we should update the README and archive GitHub - NASA-Openscapes/edlfs: POC to test access patterns to NASA datasets.
Like @jhkennedy mentioned, thanks to @yuvipanda, @briannapagan and many more I think we have a handle on how to address the different access patterns, and NASA is moving towards a consistent way of authenticating across the different DAACs and AWS. I don’t think EDLFS would be a total waste of time but it’s not urgent, especially if all the DAACs adopt SSO with bearer tokens (seems that it’s going to happen soon). We still have the annoyance of the DAAC-specific temporary credentials and other user experience issues like EULAs and App approvals…mmm maybe we do need some abstraction for all these things.
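For anyone unfamiliar with the bearer token approach, here is a minimal sketch of what it looks like on the client side, using only the standard library. The helper name `bearer_request`, the example URL, and the token value are placeholders, not real endpoints or an existing API. The only point is that the token travels in an `Authorization: Bearer …` header instead of going through a username/password login flow.

```python
import urllib.request


def bearer_request(url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request that carries a bearer token."""
    # The token goes in the standard Authorization header.
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )


# Placeholder URL and token, for illustration only.
req = bearer_request("https://example.com/data/granule.nc", "my-token")
```

Passing `req` to `urllib.request.urlopen` would send it. The same header could presumably be handed to other HTTP clients as well, though that is not tested here.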