I have some credits from Amazon and was planning to set up a JupyterHub instance there to support a summer school I’m running next week, but I’m hitting my head against EKS.
I’m following the z2jupyterhub guide and have a k8s cluster which appears to be running, and thanks to @Theo_McCaie I can connect to it via kubectl proxy. My services look like so:
(base) aopposxlap45:~ watson-parris$ kubectl get svc
NAME           TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)                      AGE
hub            ClusterIP      10.100.46.131   <none>                                                                    8081/TCP                     5h45m
proxy-api      ClusterIP      10.100.59.44    <none>                                                                    8001/TCP                     5h45m
proxy-public   LoadBalancer   10.100.201.8    ac55a6fbb56774579b5752ecfdea35df-1113063612.us-east-2.elb.amazonaws.com   443:30443/TCP,80:32074/TCP   5h45m
And I have the following pods:
(base) aopposxlap45:~ watson-parris$ kubectl get pod
NAME                             READY   STATUS    RESTARTS   AGE
continuous-image-puller-44z8w    1/1     Running   0          6h32m
continuous-image-puller-74rjc    1/1     Running   0          6h32m
hub-5c44d4ff6f-rw7n4             1/1     Running   0          6h32m
proxy-79c6f8c965-kdj5b           1/1     Running   0          6h32m
user-scheduler-865b49c54-qcm94   1/1     Running   0          6h32m
user-scheduler-865b49c54-rzcgj   1/1     Running   0          6h32m
However, if I try to connect to the above external IP I just get a dropped connection. I believe the problem is to do with the load balancer that EKS automatically creates for the node group, but I can’t figure it out. I’ve checked and double-checked the security groups and subnets and they look fine.
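A "dropped connection" can cover several different failures (the name not resolving, the connection timing out, or the connection being refused), and it may help to tell them apart before changing anything on the AWS side. The following is a minimal Python sketch using only the standard library; check_port is a hypothetical helper name chosen for illustration, not part of any tool mentioned in this thread.

```python
import socket


def check_port(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt a plain TCP connection and classify what happened.

    Returns one of "open", "dns-failure", "timeout", "refused",
    or "error: <details>" for anything else.
    """
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        # The hostname could not be resolved at all.
        return "dns-failure"
    except socket.timeout:
        # No answer within the timeout; packets may be filtered on the way.
        return "timeout"
    except ConnectionRefusedError:
        # Something answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # Any other network-level failure (unreachable network, etc.).
        return f"error: {exc}"
```

Calling check_port with the EXTERNAL-IP hostname from the kubectl get svc output and ports 80 and 443 gives one result per port. As a rough guide only: a "timeout" may suggest traffic is being filtered before it reaches the load balancer (for example by a security group), whereas "open" followed by a dropped HTTP request may point more towards the load balancer having no healthy backends.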
Has anyone managed to get this working?
Many thanks in advance
I know that @salvis2 has had success with Terraform to prepare a JupyterHub deployment on AWS: https://medium.com/pangeo/terraform-jupyterhub-aws-34f2b725f4fd
Thanks Tom, that does look a useful tool.
As far as I can see though this would only get me the infrastructure (which I already have) and the deployment of JupyterHub would rely on the helm approach I already followed on z2jh, is that right?
Alternatively I could ignore EKS and use kops directly on EC2; is that easier?
You are correct that Terraform only gets you the infrastructure and you would need Helm to deploy the JupyterHub itself. The tool that we’ve used for deploying the hub itself is hubploy. There’s a guide here for how to set up a repo to use it.
What do you have as your JupyterHub config? Your kubectl commands suggest the cluster is fine, but you are still having issues. I know that sometimes, for DNS things, I have to wait a little bit before the connection is established and I can visit the site.
Interesting, thanks. I’ve used this helm chart based on the z2jh documentation. I haven’t tweaked it at all.
Yes, I suspect it’s a networking issue, but it’s not a DNS cache problem; it’s very persistent. Do you use an EKS LoadBalancer in your Terraform config?
I don’t really mind how I set it up as I’m only doing it once (hopefully), I just want it up!
Hi @duncanwp - sorry I wasn’t able to help more.
This would be my suggested next steps.
I would try to disentangle the Pangeo/Jupyter and EKS issues, so:
Create a ‘hello world’ pod and service. Something super simple, maybe from here https://kubernetes.io/docs/tutorials/hello-minikube/#create-a-service (this is for minikube but the idea should be the same)
Use a new namespace for the above. Try to get that working first. I suggest you might want a new load balancer.
Also, try following this page https://aws.amazon.com/premiumsupport/knowledge-center/eks-kubernetes-services-cluster/
Start with trying to get some sort of hello world working and the rest (I hope) will fall into place!
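As one possible way to follow the steps above, here is a minimal Python sketch that builds a throwaway Namespace, Deployment and LoadBalancer Service as a single JSON document. Everything named in it is an assumption made for illustration: the helper hello_world_manifests, the names hello-world and lb-test, and the choice of the public nginx image on port 80. None of it comes from the z2jh chart.

```python
import json


def hello_world_manifests(
    name: str = "hello-world",   # hypothetical app name
    namespace: str = "lb-test",  # hypothetical fresh namespace
    image: str = "nginx",        # assumed image; any simple web server would do
    port: int = 80,              # port the container listens on
) -> dict:
    """Build a Kubernetes List holding a Namespace, Deployment and Service."""
    labels = {"app": name}
    ns = {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": namespace}}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {"name": name, "image": image,
                         "ports": [{"containerPort": port}]}
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # LoadBalancer asks the cloud provider for a new external load balancer.
            "type": "LoadBalancer",
            "selector": labels,
            "ports": [{"port": 80, "targetPort": port, "protocol": "TCP"}],
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [ns, deployment, service]}


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML, so the output can be piped to it.
    print(json.dumps(hello_world_manifests(), indent=2))
```

If this is saved as, say, hello.py, then something like python hello.py | kubectl apply -f - should create the test resources, and kubectl get svc -n lb-test should eventually show a new external hostname. If that hostname is also unreachable, the problem is probably in the cluster networking rather than in JupyterHub.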
Also @duncanwp, as upsetting as it is, sometimes ‘turning it off and on again’ does seem to work. By which I mean delete EVERYTHING and just start again, not necessarily doing anything different. If you use a new cluster you can do this in parallel without deleting your original one.
Very unsatisfying but sometimes works…
Hi @duncanwp, I recently set up a simple EKS cluster with Amazon research credits. One of the first things I ran into was health issues with the node group. I just had to edit the health check to use the correct port, as shown in the kubectl get svc output. Not sure if that helps you, but that’s what I did to resolve my very first issues.
Hi @aradhakrishnanGFDL, that sounds very promising. I added ports 8081 and 8001 to the load balancer and it didn’t seem to help though. I’ll try pulling it all down and starting again tomorrow, with that step too, as Theo suggested.
Did you follow the same zerotojupyterhub guide that I linked to above? Would you have any time first thing tomorrow your time (assuming you’re at GFDL!) to chat about your setup?
My ports show up after the EXTERNAL-IP column when I run kubectl get svc, so I edited my health check to use TCP 31744 (yours might be 32074).
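To make the port reading concrete: in the PORT(S) column, an entry such as 80:32074/TCP pairs the service port (80) with the node port (32074) that the load balancer forwards to and health-checks. A minimal Python sketch for pulling those apart follows; parse_ports and PortMapping are hypothetical names used only for illustration.

```python
from typing import List, NamedTuple, Optional


class PortMapping(NamedTuple):
    port: int                 # port exposed by the service
    node_port: Optional[int]  # port opened on each node, if any
    protocol: str


def parse_ports(column: str) -> List[PortMapping]:
    """Parse a PORT(S) value such as '443:30443/TCP,80:32074/TCP'."""
    mappings = []
    for entry in column.split(","):
        # Split '80:32074/TCP' into '80:32074' and 'TCP'.
        ports, _, protocol = entry.strip().partition("/")
        # Split '80:32074' into '80' and '32074'; ClusterIP entries have no ':'.
        port, _, node_port = ports.partition(":")
        mappings.append(
            PortMapping(
                port=int(port),
                node_port=int(node_port) if node_port else None,
                protocol=protocol or "TCP",
            )
        )
    return mappings
```

For the proxy-public service in this thread, parse_ports("443:30443/TCP,80:32074/TCP") gives node ports 30443 and 32074, which are the candidates for the health check port. The hub and proxy-api services are ClusterIP, so they have no node port at all.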
Yes, I used the guide you linked as reference, though it was not sufficient.
I am available by email tomorrow and most of next week, but will be available for a chat anytime the week after.
aparna.radhakrishnan at noaa.gov
Many thanks for your kind offer to help @aradhakrishnanGFDL. To be honest I’ve given up with the Kubernetes approach for now and just created one big machine with a ‘Littlest JupyterHub’ installation and that seems to be working for now!
It would be nice to have a scalable Kubernetes cluster one day but at least I have something for the workshop I’m running next week.
I’m sorry it hasn’t been the smoothest experience.
You’re most likely running into https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/1758. But I think for a lot of cases, TLJH is the way to go! So hope it works out for you.