Hey all
I’m a researcher at Stockholm University and we want to teach our students climate data analysis with Python (xarray, intake, dask, etc.), but we always spend so much time on installation. Does anyone have a good tip about login-based solutions that will give you Jupyter with a Pangeo suite installed? I used Galaxy once; is that advisable for regular teaching? Our classes are usually 10-20 students at a time. We could in principle pay, but of course the budget isn’t huge for this kind of thing.
Any tips would be greatly appreciated!!
Would your university IT department be willing to host a JupyterHub with student login credentials, and maintain an appropriate Python kernel for your course? We’ve been using that type of solution for quite a few years at my university, and the students love the ease of access.
I often use Binder for this type of class. But it depends on the data volume you want to manipulate (only 2 GiB of RAM per user, I think), the number of students, and the duration of the course… And you can run into availability problems when using Binder.
For the last class I gave, I had access to a GCP account, so I deployed an unsecured, Dask-enabled JupyterHub (using pangeo-docker-images) for the 4-hour module. It’s pretty fast to do once you have done it once or twice (maybe 15 minutes). It can give access to bigger computing resources and doesn’t cost that much (about 60€ for 60 students during 3+ hours). I can give you simple recipes to build this.
You can also reach out to @tinaok or @annefou. I think we still have a Pangeo deployment on EGI/EOSC, and you might be able to apply to use it!
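To put that figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the cost scales linearly with students and hours and that the cluster is shut down between sessions; the 20-student, 10-session course is a made-up example, and storage or idle time is not included.

```python
def cost_per_student_hour(total_cost, n_students, hours):
    """Average cloud cost for one student using the hub for one hour."""
    return total_cost / (n_students * hours)


# Figures quoted above: about 60 EUR for 60 students over 3 hours.
rate = cost_per_student_hour(60.0, 60, 3)
print(f"{rate:.2f} EUR per student-hour")  # 0.33 EUR per student-hour

# Hypothetical course (an assumption): 20 students, 10 sessions of 2 hours.
estimate = rate * 20 * 10 * 2
print(f"about {estimate:.0f} EUR for the course")  # about 133 EUR for the course
```

A hub that stays up for weeks so students can work between sessions would likely cost more than this linear estimate suggests.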
Thanks to both for great input!
brian-rose, we do have a local server that we could give temporary logins to, but it would be a bit inconvenient (it depends on the one person in charge being available, etc.), and our IT unfortunately does not really offer these types of solutions. It would of course be the easiest!
geynard, the problem is that I would want the students to have access and be able to work on it throughout the course, and could it be that 2 GiB is a bit little? I basically want them to be able to analyse monthly data from CMIP historical simulations (but it would be good if it were flexible for other uses too). I could test Binder, but I feel like it’s not meant for working on a project over a couple of weeks?
I actually wanted to reach out to you, @annefou!! Long time no see! Do you have any tips?
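On the 2 GiB question, a quick estimate of the uncompressed size of one monthly variable can help. This is only a sketch: the 1-degree grid, float32 storage, and 19 pressure levels are assumptions, and real CMIP models differ in resolution.

```python
def field_size_gib(n_time, n_lat, n_lon, n_levels=1, bytes_per_value=4):
    """Uncompressed in-memory size of one gridded variable, in GiB."""
    n_values = n_time * n_levels * n_lat * n_lon
    return n_values * bytes_per_value / 2**30


# CMIP6 historical runs cover 1850-2014: 165 years of monthly means.
n_months = 165 * 12

# Assumed 1-degree grid (180 x 360) stored as float32.
surface = field_size_gib(n_months, 180, 360)
# Assumed 19 pressure levels for a 3D variable on the same grid.
profile = field_size_gib(n_months, 180, 360, n_levels=19)

print(f"2D monthly field: {surface:.2f} GiB")  # 2D monthly field: 0.48 GiB
print(f"3D monthly field: {profile:.2f} GiB")  # 3D monthly field: 9.08 GiB
```

Under these assumptions a single surface field fits in 2 GiB, but several variables, ensemble members, or any 3D field would not fit fully in memory, so students would need to subset or load the data lazily in chunks.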
I have taught classes of 10-30 students and managed a JupyterHub on https://www.digitalocean.com/. Depending on your compute needs, I’ve paid anywhere between $50 and $300 per month, even with students doing some fairly heavy work. Normally my university reimburses me for this, and I would say it has cost about $1000 per class on average. @yuvipanda probably has some good feedback here.
I have taught a course using Google Colab. It’s not ideal (I agree with the others that a self-hosted JHub is preferable), but the price is right.
Also, I think it’s the best option for access to GPUs.
Sara, we need more info on the dates, number of participants, and overall needs. It is hard to answer without these details.
Like @brian-rose, at University of Washington we have an IT-provided JupyterHub service for classes. In fact, the University doesn’t allow Colab because of student privacy concerns! See “GPUs and Machine Learning FAQ – Information Technology”.
Which CMIP dataset are you accessing? For not-for-credit classes like summer workshops, we’ve had great experience with 2i2c (see “Usecases and prices — Hub Service Guide”), as they can spin up cost-effective JupyterHubs in specific data centers (e.g. AWS us-west-2 for working with NASA data).
Also, for Xarray SciPy workshops with 50+ participants, GitHub Codespaces has been effective for a batteries-included Xarray environment (see Get Started). There are limited resources there: for example, you could use Dask LocalCluster but not DaskGateway like we used to have on the Pangeo Hubs. Depending on the course content, though, there could be enough compute and persistent disk space. Also, I imagine the monthly CPU limits could be annoying, so I’d be curious if anyone has used GitHub Classroom sponsored Codespaces for a course?
Using GitHub Codespaces with GitHub Classroom - GitHub Docs
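On the LocalCluster point, here is a minimal sketch of sizing a Dask LocalCluster for a small single-machine environment. The helper names, the 1 GiB reserve for the notebook process, and the 2-core, 8 GiB machine are illustrative assumptions, not documented Codespaces limits.

```python
def local_cluster_kwargs(n_cores, total_mem_gib, reserve_gib=1.0):
    """Pick LocalCluster settings that stay within a small machine's memory."""
    # Leave some memory for the notebook kernel and the scheduler.
    usable = max(total_mem_gib - reserve_gib, 0.5)
    n_workers = max(n_cores, 1)
    return {
        "n_workers": n_workers,
        "threads_per_worker": 1,
        "memory_limit": f"{usable / n_workers:.1f}GiB",  # limit per worker
    }


def start_local_cluster(n_cores, total_mem_gib):
    """Start a LocalCluster with the settings above and return a Client."""
    # Imported here so the sizing helper works even without dask installed.
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(**local_cluster_kwargs(n_cores, total_mem_gib))
    return Client(cluster)


# Hypothetical 2-core, 8 GiB machine.
print(local_cluster_kwargs(2, 8))
# {'n_workers': 2, 'threads_per_worker': 1, 'memory_limit': '3.5GiB'}
```

In a notebook, `client = start_local_cluster(2, 8)` would then run xarray computations on the local workers only; unlike DaskGateway, nothing here scales beyond the single machine.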
Finally, there seems to be an ever-changing list of free JupyterHubs, but it’s not always clear if these run in data centers next to the data you’re working with, if everyone can access them, if they are reliable enough for regular use, etc.