Notebooks service

The notebooks service provides an interactive computing environment for every commit in a project’s history to each user that has sufficient access rights.

Amalthea integration

Amalthea is a Kubernetes operator for spawning interactive Jupyter notebooks. Renku uses Amalthea to manage sessions, and the notebooks service extends the standard Amalthea functionality with tight GitLab integration.

The notebooks are provided by the Jupyter Server. A new “named” server is spawned for every unique request. A notebook server launch is initiated by posting a request to the <PLATFORM_URL>/api/notebooks/servers URL. The request must specify the project, the commit, the Docker image, and the resources required by that server. Servers are not shared between users: if two or more users collaborating on a project post the same data to the same URL, each receives their own notebook server.
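As a rough illustration, a launch request could be assembled as shown here. This is only a sketch: the JSON field names (namespace, project, commit_sha, image, resources), the helper build_launch_request, and the example host are assumptions made for illustration, not the service's documented schema. Authentication is omitted, and the request is built but not sent.

```python
import json
import urllib.request


def build_launch_request(platform_url, namespace, project, commit_sha, image, resources):
    """Build (but do not send) a POST request for a new notebook server.

    The payload field names are hypothetical; consult the deployed API
    for the real request schema.
    """
    payload = {
        "namespace": namespace,
        "project": project,
        "commit_sha": commit_sha,
        "image": image,
        "resources": resources,
    }
    return urllib.request.Request(
        url=f"{platform_url.rstrip('/')}/api/notebooks/servers",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_launch_request(
    "https://renku.example.org",  # placeholder for <PLATFORM_URL>
    namespace="jane",
    project="my-project",
    commit_sha="a1b2c3d",
    image="registry.example.org/jane/my-project:a1b2c3d",
    resources={"cpu": 1, "memory": "2G"},  # illustrative resource keys
)
print(request.get_method(), request.full_url)
```

Sending the request would additionally require the user's credentials, which is why the sketch stops at constructing it.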

By default, a Renku project includes a .gitlab-ci.yml file with an image_build stage that creates an image for every push (see the Image builds section below). The notebook spawner looks for this GitLab CI job and, if it exists, waits for the job to complete and then launches a notebook server using that image. If the job does not exist or the image build fails, a notebook server is launched with the default notebook image specified in the platform configuration options.
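The image selection described above can be sketched as a small polling loop. This is not the spawner's actual code: get_job_status is a hypothetical callable standing in for the GitLab CI lookup, the status strings are assumed, and the timeout with fallback to the default image is an added assumption rather than documented behaviour.

```python
import time


def resolve_image(get_job_status, commit_image, default_image,
                  poll_seconds=5.0, timeout_seconds=600.0):
    """Choose the image for a new notebook server.

    get_job_status() is assumed to return None when no image_build job
    exists, "pending" or "running" while the job is in progress,
    "success" when the image was built, and anything else on failure.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_job_status()
        if status == "success":
            return commit_image  # the build finished, use the commit's image
        if status not in ("pending", "running"):
            return default_image  # no job, or the build failed
        if time.monotonic() >= deadline:
            return default_image  # assumption: give up after the timeout
        time.sleep(poll_seconds)


# Simulated job that is still running on the first poll.
statuses = iter(["running", "success"])
chosen = resolve_image(lambda: next(statuses), "project:a1b2c3d",
                       "default-notebook:latest", poll_seconds=0)
print(chosen)
```

Passing the status lookup in as a callable keeps the decision logic separate from any particular GitLab client.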

The architecture of this setup is presented in the figure below. Blue ovals represent off-the-shelf services and yellow ovals show heavily customized or custom-built components.

strict digraph architecture {
  compound=true;
  newrank=true;
  graph [fontname="Raleway", nodesep="0.8"];
  node [shape="rect", style="filled,rounded", fontname="Raleway"];
  edge [fontname="Raleway"]
  GitLab [fillcolor="lightblue"]
  Amalthea [fillcolor="lightblue"]
  "notebook-service" [fillcolor="#f4d142" label="Notebook service"]
  notebook [label="Jupyter\nserver", shape="rect", fillcolor="#f4d142"]
  oauth2proxy [label="Oauth2\nProxy", shape="rect", fillcolor="#f4d142"]
  gitproxy [label="Git\nProxy", shape="rect", fillcolor="#f4d142"]
  "init-container" [shape="rect", fillcolor="#f4d142"]
  "notebook-service" -> Amalthea [label=" API"]
  subgraph cluster_notebook {
    label="JupyterServer custom resource"
    style="dashed";
    notebook
    oauth2proxy
    "init-container"
    {rank=same; "init-container", notebook, oauth2proxy, gitproxy}
  }
  "notebook-service" -> GitLab [label=" repo permissions\n image build status"]
  Amalthea -> notebook [label=" spawn", lhead=cluster_notebook]
  gitproxy -> GitLab [label=" git pull/push"]
  "init-container" -> "GitLab" [label=" git clone"]
}

The diagram below illustrates the sequence of events that take place when a new notebook is launched through the notebooks service:

Sequence diagram of notebook launch from the UI via the Notebooks Service.

Image builds

If a Renku repository contains a .gitlab-ci.yml file, the GitLab instance that the repository is pushed to will try to execute the commands in that file. By default, when a Renku project is initialized, .gitlab-ci.yml, Dockerfile, and requirements.txt are added to the project. On each push to the server, the GitLab runner (if one is configured) then builds an image named <gitlab-registry>/<namespace>/<project-name>:<commit-sha>. This way, an image is available for every pushed point in the project’s history. In future iterations of these services, the build process will be optimized to avoid superfluous builds and to reduce launch latency.
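The naming scheme for built images can be expressed as a one-line helper. The function name image_name and the example values are illustrative assumptions; only the <gitlab-registry>/<namespace>/<project-name>:<commit-sha> pattern comes from the description above.

```python
def image_name(registry, namespace, project_name, commit_sha):
    """Compose the image reference for a given commit of a project."""
    return f"{registry}/{namespace}/{project_name}:{commit_sha}"


# Placeholder values, not taken from a real deployment.
name = image_name("registry.example.org", "jane", "my-project", "a1b2c3d")
print(name)
```

Because the tag is the commit SHA, the image for any pushed commit can be located without querying the CI system first.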

The image building component interactions are visualized below.

strict digraph architecture {
  compound=true;
  newrank=true;
  graph [fontname="Raleway", nodesep="0.8"];
  node [shape="rect", style="filled,rounded", fontname="Raleway"];
  edge [fontname="Raleway"]
  Client [fillcolor="#f4d142"]
  Storage [fillcolor="lightblue", label="Object store", shape="cylinder"]
  subgraph cluster_gitlab {
    label="GitLab components"
    style="dashed";
    Repository [fillcolor="lightblue"]
    Runner [fillcolor="lightblue"]
    Registry [fillcolor="lightblue"]
    Repository -> Runner [label=" start CI job"]
    Runner -> Registry [label=" push image"]
    {rank=same; Runner, Registry};
  }
  Client -> Repository [label=" git push"]
  Registry -> Storage [label=" push/fetch image"]
}