First published 8 May 2020

An IaC public cloud deployment of JupyterHub for Kubernetes with SAML SSO

At the start of February, the eResearch office received a support request to help implement and manage a JupyterHub deployment. The request came from the QUT School of Information Systems, which wanted to use JupyterHub for postgraduate teaching.

The requirements of this deployment included:

  • A public-facing JupyterHub site
  • Simple and secure authentication for students
  • Easy-to-configure Jupyter notebook flavours
  • Responsive, performance-focused Jupyter notebooks
  • The ability to scale to ~250 concurrent student users

First things first, what is a Jupyter Notebook?

The Jupyter notebook extends the console-based approach to interactive computing in a qualitatively new direction, providing a web-based application suitable for capturing the whole computation process: developing, documenting, and executing code, as well as communicating the results. The Jupyter notebook combines two components:

A web application: a browser-based tool for interactive authoring of documents which combine explanatory text, mathematics, computations and their rich media output.

Notebook documents: a representation of all content visible in the web application, including inputs and outputs of the computations, explanatory text, mathematics, images, and rich media representations of objects. [1]
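
To make the "notebook document" idea concrete, here is a minimal sketch in Python that builds a two-cell notebook by hand and prints it as JSON, which is the format .ipynb files are stored in. The cell contents are illustrative assumptions; in practice the nbformat library would normally be used instead of a hand-written dictionary.

```python
import json


def build_notebook():
    """Return a minimal nbformat 4 notebook as a plain dictionary."""
    markdown_cell = {
        "cell_type": "markdown",
        "metadata": {},
        "source": "# Week 1\nExplanatory text lives in markdown cells.",
    }
    code_cell = {
        "cell_type": "code",
        "execution_count": 1,
        "metadata": {},
        "source": "print(1 + 1)",
        # Outputs are stored alongside the input that produced them.
        "outputs": [
            {"output_type": "stream", "name": "stdout", "text": "2\n"}
        ],
    }
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [markdown_cell, code_cell],
    }


if __name__ == "__main__":
    # Saving this output to a file ending in .ipynb gives a notebook document.
    print(json.dumps(build_notebook(), indent=1))
```

The point of the example is that inputs, outputs and explanatory text all live in one document, which is what the web application reads and renders.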

And JupyterHub?

JupyterHub is the best way to serve Jupyter notebook for multiple users. It can be used in a class of students, a corporate data science group, or a scientific research group. It is a multi-user Hub that spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server. [2]

Implementation overview

Our implementation of JupyterHub runs on AWS. Using a cloud provider simplifies meeting our security and scaling requirements.
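
As a rough illustration of the scaling requirement, the sketch below estimates how many worker nodes ~250 concurrent users would need. The per-user memory and CPU guarantees, the node size, and the fraction reserved for system pods are hypothetical figures chosen for the example, not the values used in this deployment.

```python
import math


def nodes_needed(users, mem_per_user_gib, cpu_per_user,
                 node_mem_gib, node_cpu, reserved_fraction=0.1):
    """Estimate the worker nodes needed to host `users` notebook servers.

    reserved_fraction is an assumed share of each node kept back for
    Kubernetes system pods and the operating system.
    """
    usable_mem = node_mem_gib * (1 - reserved_fraction)
    usable_cpu = node_cpu * (1 - reserved_fraction)
    # A node is full as soon as either memory or CPU runs out.
    users_per_node = min(int(usable_mem // mem_per_user_gib),
                         int(usable_cpu // cpu_per_user))
    if users_per_node < 1:
        raise ValueError("a single user does not fit on one node")
    return math.ceil(users / users_per_node)


if __name__ == "__main__":
    # Hypothetical sizing: 1 GiB and 0.25 vCPU guaranteed per user,
    # on nodes with 32 GiB of memory and 8 vCPUs.
    print(nodes_needed(250, 1, 0.25, 32, 8))  # prints 9
```

With these assumed figures each node holds 28 users, so 250 users need 9 nodes. An autoscaling cluster lets that number follow actual demand instead of being provisioned up front.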

Deployment

Overview of setup:

  1. Create a VPC and supporting resources using CloudFormation
  2. Use eksctl to create an autoscaling Kubernetes cluster across multiple availability zones
  3. Configure JupyterHub for Kubernetes and deploy using Helm
  4. Integrate JupyterHub with our internal authentication service so enrolled students can log in with their university credentials
  5. Customise our JupyterHub Dockerfile to include our required plugins and tools
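
Steps 3 to 5 largely come down to the values passed to the Zero to JupyterHub Helm chart. The sketch below builds such a values structure in Python and prints it as JSON, which Helm accepts because JSON is valid YAML. Everything specific here is an assumption for illustration: the image name and tag, the metadata URL, and the resource figures are hypothetical, and the hub.config layout follows newer chart versions, so the keys should be checked against the chart version in use. It also assumes the SAML authenticator package is installed in the hub image.

```python
import json


def build_helm_values(image_name, image_tag, saml_metadata_url):
    """Return an illustrative values structure for the JupyterHub chart."""
    return {
        "hub": {
            "config": {
                # Authenticate against a SAML identity provider
                # (assumes the samlauthenticator package is available).
                "JupyterHub": {
                    "authenticator_class": "samlauthenticator.SAMLAuthenticator",
                },
                "SAMLAuthenticator": {
                    "metadata_url": saml_metadata_url,
                },
            },
        },
        "singleuser": {
            # Custom notebook image with the required plugins and tools.
            "image": {"name": image_name, "tag": image_tag},
            # Hypothetical per-user resource guarantees and limits.
            "memory": {"guarantee": "1G", "limit": "2G"},
            "cpu": {"guarantee": 0.25, "limit": 1},
        },
        "scheduling": {
            # Pack users onto fewer nodes so idle nodes can be scaled down.
            "userScheduler": {"enabled": True},
        },
    }


if __name__ == "__main__":
    values = build_helm_values(
        "registry.example.edu/teaching/notebook",  # hypothetical image
        "2020.1",                                  # hypothetical tag
        "https://idp.example.edu/saml/metadata",   # hypothetical IdP URL
    )
    print(json.dumps(values, indent=2))
```

The printed output could be saved to a file and supplied to Helm with its --values option when installing or upgrading the chart.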

Links and Resources

  1. Jupyter Notebook
  2. JupyterHub
  3. Zero-to-JupyterHub-with-Kubernetes
  4. JupyterHub SAML Authenticator