An IaC public cloud deployment of JupyterHub for Kubernetes with SAML SSO
At the start of February, the eResearch office received a support request to help implement and manage a JupyterHub deployment. The request came from the QUT School of Information Systems, which wanted to use it for post-graduate teaching.
The requirements of this deployment included:
- A public facing JupyterHub site
- Simple and secure authentication for students
- Easy-to-configure Jupyter notebook flavours
- Responsive, performance-focused Jupyter notebooks
- The ability to scale to ~250 concurrent student users
First things first, what is a Jupyter Notebook?
The Jupyter notebook extends the console-based approach to interactive computing in a qualitatively new direction, providing a web-based application suitable for capturing the whole computation process: developing, documenting, and executing code, as well as communicating the results. The Jupyter notebook combines two components:
A web application: a browser-based tool for interactive authoring of documents which combine explanatory text, mathematics, computations and their rich media output.
Notebook documents: a representation of all content visible in the web application, including inputs and outputs of the computations, explanatory text, mathematics, images, and rich media representations of objects. [1]
And the JupyterHub?
JupyterHub is the best way to serve Jupyter notebook for multiple users. It can be used in a class of students, a corporate data science group or scientific research group. It is a multi-user Hub that spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server. [2]
Implementation overview
Our implementation of JupyterHub runs in AWS. Using a cloud provider makes it simpler to meet our security and scaling requirements.
Overview of setup:
- Create a VPC and supporting resources using CloudFormation
- Use eksctl to create an autoscaling Kubernetes cluster across multiple availability zones
- Configure JupyterHub for Kubernetes and deploy using Helm
- Integrate JupyterHub with our internal authentication service so enrolled students can log in with their university credentials
- Customise our JupyterHub Dockerfile to include our required plugins and tools
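To make the scaling and Helm configuration steps above more concrete, here is a minimal Python sketch under stated assumptions. It estimates how many worker nodes ~250 concurrent users might need if memory is the limiting resource, and builds a small values mapping for the Zero to JupyterHub Helm chart. The per-user memory guarantee, the node size, the image name and tag, and the helper names `nodes_needed` and `build_values` are all hypothetical placeholders, not the settings used in this deployment.

```python
import json
import math

# All figures below are illustrative assumptions, not this deployment's real settings.
CONCURRENT_USERS = 250        # from the stated requirement (~250 concurrent students)
MEM_GUARANTEE_GIB = 1.0       # assumed memory guaranteed to each notebook server
NODE_ALLOCATABLE_GIB = 28.0   # assumed usable memory on each worker node


def nodes_needed(users: int, mem_per_user_gib: float, node_mem_gib: float) -> int:
    """Estimate the worker nodes required if memory is the limiting resource."""
    users_per_node = math.floor(node_mem_gib / mem_per_user_gib)
    if users_per_node < 1:
        raise ValueError("a node cannot fit even one user")
    return math.ceil(users / users_per_node)


def build_values(image_name: str, image_tag: str, mem_gib: float) -> dict:
    """Build a values mapping using keys from the Zero to JupyterHub Helm chart."""
    return {
        "singleuser": {
            # Custom notebook image built from our own Dockerfile (placeholder name).
            "image": {"name": image_name, "tag": image_tag},
            # Guarantee drives scheduling; the limit here is assumed to be double.
            "memory": {"guarantee": f"{mem_gib:g}G", "limit": f"{mem_gib * 2:g}G"},
        },
        # Pack user pods onto fewer nodes so idle nodes can be scaled down.
        "scheduling": {"userScheduler": {"enabled": True}},
    }


if __name__ == "__main__":
    values = build_values("example.org/our-notebook", "0.1.0", MEM_GUARANTEE_GIB)
    print(json.dumps(values, indent=2))
    print("estimated nodes:",
          nodes_needed(CONCURRENT_USERS, MEM_GUARANTEE_GIB, NODE_ALLOCATABLE_GIB))
```

Because JSON is also valid YAML, the printed mapping should be usable as a Helm values file, although in practice the configuration would more likely be written in YAML directly. The node estimate is only a starting point for choosing autoscaling bounds; CPU, system pods and the hub itself also consume capacity.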