Merge pull request #557 from jupyter-on-openshift/source-to-image
Sample builder scripts for building custom notebook images using Source-to-Image.
minrk authored Feb 20, 2018
2 parents 65f1c4b + 9850b57 commit 2e3a596
Showing 5 changed files with 588 additions and 0 deletions.
165 changes: 165 additions & 0 deletions examples/source-to-image/README.md
@@ -0,0 +1,165 @@
This example provides scripts for building custom Jupyter Notebook images containing notebooks and data files, with the Python packages required by the notebooks already installed. The scripts work with the Source-to-Image tool, so you can create the images from the command line on your own computer. Templates are also provided for running the builds in OpenShift, as well as for deploying the resulting image to OpenShift to make it available.

The build scripts, when used with the Source-to-Image tool, provide similar capabilities to ``repo2docker``. When builds are run under OpenShift with the supplied templates, the result is similar to ``mybinder.org``, except that notebook instances are deployed in your existing OpenShift project and JupyterHub is not required.

For separate examples of using JupyterHub with OpenShift, see the project:

* https://github.com/jupyter-on-openshift/jupyterhub-quickstart

Source-to-Image Project
-----------------------

Source-to-Image (S2I) is an open source project which provides a tool for creating container images. It works by taking a base image, injecting additional source code or files into a running container created from the base image, and running a builder script in the container to process the source code or files to prepare the new image.

Details on the S2I tool, and executable binaries for Linux, macOS and Windows, can be found on GitHub at:

* https://github.com/openshift/source-to-image

The tool is standalone, and can be used on any system which provides a docker daemon for running containers. To provide an end-to-end capability to build and deploy applications in containers, support for S2I is also integrated into container platforms such as OpenShift.
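
As a quick sanity check (not part of the build itself), once the ``s2i`` binary has been downloaded and placed on your ``PATH`` you can confirm it is available:

```
# Confirm the s2i binary is installed and on your PATH.
s2i version

# Show the options accepted by the build sub-command.
s2i build --help
```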

Getting Started with S2I
------------------------

As an example of how S2I can be used to create a custom image with a bundled set of notebooks, run:

```
s2i build \
  --scripts-url https://raw.githubusercontent.com/jupyter/docker-stacks/master/examples/source-to-image \
  --context-dir docs/source/examples/Notebook \
  https://github.com/jupyter/notebook \
  jupyter/minimal-notebook:latest \
  notebook-examples
```

This example command will pull down the Git repository ``https://github.com/jupyter/notebook`` and build the image ``notebook-examples`` using the files contained in the ``docs/source/examples/Notebook`` directory of that Git repository. The base image which the files will be combined with is ``jupyter/minimal-notebook:latest``, but you can specify any of the Jupyter Project ``docker-stacks`` images as the base image.

The resulting image from running the command can be seen by running ``docker images``.

```
REPOSITORY          TAG      IMAGE ID       CREATED         SIZE
notebook-examples   latest   f5899ed1241d   2 minutes ago   2.59GB
```

You can now run the image.

```
$ docker run --rm -p 8888:8888 notebook-examples
Executing the command: jupyter notebook
[I 01:14:50.532 NotebookApp] Writing notebook server cookie secret to /home/jovyan/.local/share/jupyter/runtime/notebook_cookie_secret
[W 01:14:50.724 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[I 01:14:50.747 NotebookApp] JupyterLab beta preview extension loaded from /opt/conda/lib/python3.6/site-packages/jupyterlab
[I 01:14:50.747 NotebookApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 01:14:50.754 NotebookApp] Serving notebooks from local directory: /home/jovyan
[I 01:14:50.754 NotebookApp] 0 active kernels
[I 01:14:50.754 NotebookApp] The Jupyter Notebook is running at:
[I 01:14:50.754 NotebookApp] http://[all ip addresses on your system]:8888/?token=04646d5c5e928da75842cd318d4a3c5aa1f942fc5964323a
[I 01:14:50.754 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 01:14:50.755 NotebookApp]
Copy/paste this URL into your browser when you connect for the first time,
to login with a token:
http://localhost:8888/?token=04646d5c5e928da75842cd318d4a3c5aa1f942fc5964323a
```

Open the URL displayed in your browser, and you will find the notebooks from the Git repository and can work with them.

The S2I Builder Scripts
-----------------------

Normally when using S2I, the base image would be S2I enabled and contain the builder scripts needed to prepare the image and define how the application in the image should be run. As the Jupyter Project ``docker-stacks`` images are not S2I enabled (although they could be), in the above example the ``--scripts-url`` option has been used to specify that the example builder scripts contained in this directory of this Git repository should be used.

Using the ``--scripts-url`` option, the builder scripts can be hosted on any HTTP server, or you can use builder scripts local to your own computer by passing an appropriate ``file://`` URI as the argument to ``--scripts-url``.
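
For example, if you have a local checkout of this repository, a build using the local copies of the builder scripts might look like the following (the filesystem path shown is illustrative):

```
s2i build \
  --scripts-url file:///home/me/docker-stacks/examples/source-to-image \
  --context-dir docs/source/examples/Notebook \
  https://github.com/jupyter/notebook \
  jupyter/minimal-notebook:latest \
  notebook-examples
```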

The builder scripts in this directory of this repository are ``assemble`` and ``run`` and are provided as examples of what can be done. You can use the scripts as is, or create your own.

The supplied ``assemble`` script performs a few key steps.

The first steps copy the files from the directory where the ``s2i`` command initially places them into the location they need to be in when the image is run.

```
cp -Rf /tmp/src/. /home/$NB_USER
rm -rf /tmp/src
```

The next steps are:

```
if [ -f /home/$NB_USER/environment.yml ]; then
    conda env update --name root --file /home/$NB_USER/environment.yml
    conda clean -tipsy
else
    if [ -f /home/$NB_USER/requirements.txt ]; then
        pip --no-cache-dir install -r /home/$NB_USER/requirements.txt
    fi
fi
```

This determines whether an ``environment.yml`` or ``requirements.txt`` file was included with the files and, if so, runs the appropriate package management tool to install the Python packages listed in it.

This means that, so long as a set of notebook files provides one of these files listing the Python packages they need, those packages will be automatically installed into the image and so will be available when the image is run.
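
For example, a repository of notebooks might include a ``requirements.txt`` alongside the notebooks (in the directory passed as ``--context-dir``), and the ``assemble`` script would then install those packages with ``pip`` during the build. The package list below is purely illustrative:

```
numpy
pandas
matplotlib
```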

A final step is:

```
fix-permissions $CONDA_DIR
fix-permissions /home/$NB_USER
```

This fixes up permissions on any new files created by the build. This is necessary to ensure that, when the image is run, you can still install additional packages and create files. That matters when an image is run in ``sudo`` mode, or when it is hosted on a more secure container platform such as Kubernetes/OpenShift where it will run as an assigned user ID that isn't known in advance.

As long as you preserve the first and last sets of steps, you can do whatever you want in the ``assemble`` script to install packages, create files, and so on. Do be aware, though, that S2I builds do not run as ``root``, so you cannot install additional system packages. If you need extra system packages, use a ``Dockerfile`` and a normal ``docker build`` to first create a new custom base image from one of the Jupyter Project ``docker-stacks`` images with the extra system packages added, and then use that image as the base image for the S2I build which combines your notebooks and installs the Python packages.
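
A minimal sketch of that approach might look like the following ``Dockerfile``, where the ``graphviz`` package and the ``my-notebook-base`` image name are only examples:

```
FROM jupyter/minimal-notebook:latest

# System packages have to be installed as root.
USER root
RUN apt-get update && \
    apt-get install -y --no-install-recommends graphviz && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Switch back to the unprivileged notebook user.
USER $NB_UID
```

After building it with ``docker build -t my-notebook-base .``, pass ``my-notebook-base`` in place of ``jupyter/minimal-notebook:latest`` in the ``s2i build`` command.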

The ``run`` script in this directory is very simple and just runs the notebook application.

```
exec start-notebook.sh "$@"
```

Integration with OpenShift
--------------------------

The OpenShift platform provides integrated support for S2I type builds. Templates are provided for using the S2I build mechanism with the scripts in this directory. To load the templates run:

```
oc create -f https://raw.githubusercontent.com/jupyter/docker-stacks/master/examples/source-to-image/templates.json
```

This will create the templates:

```
jupyter-notebook-builder
jupyter-notebook-quickstart
```

The templates can be used from the OpenShift web console or command line. This ``README`` is only going to explain deploying from the command line.

To use the OpenShift command line to build the set of notebooks used above into an image and deploy it, run:

```
oc new-app --template jupyter-notebook-quickstart \
  --param APPLICATION_NAME=notebook-examples \
  --param GIT_REPOSITORY_URL=https://github.com/jupyter/notebook \
  --param CONTEXT_DIR=docs/source/examples/Notebook \
  --param BUILDER_IMAGE=jupyter/minimal-notebook:latest \
  --param NOTEBOOK_PASSWORD=mypassword
```

You can provide a password using the ``NOTEBOOK_PASSWORD`` parameter. If you don't set that parameter, a password will be generated and displayed in the output of the ``oc new-app`` command.

Once the image has been built, it will be deployed. To see the hostname for accessing the notebook, run ``oc get routes``.

```
NAME                HOST/PORT                                                         PATH   SERVICES            PORT       TERMINATION     WILDCARD
notebook-examples   notebook-examples-jupyter.abcd.pro-us-east-1.openshiftapps.com           notebook-examples   8888-tcp   edge/Redirect   None
```

As the deployment will use a secure connection, the URL for accessing the notebook in this case would be:

```
https://notebook-examples-jupyter.abcd.pro-us-east-1.openshiftapps.com
```

If you only want to build an image but not deploy it, you can use the ``jupyter-notebook-builder`` template. You can then deploy it using the ``jupyter-notebook`` template provided with the [openshift](../openshift) examples directory.
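
For example, a build-only variant of the earlier command might look like the following; this assumes the builder template exposes the same build-related parameters as the quickstart template:

```
oc new-app --template jupyter-notebook-builder \
  --param APPLICATION_NAME=notebook-examples \
  --param GIT_REPOSITORY_URL=https://github.com/jupyter/notebook \
  --param CONTEXT_DIR=docs/source/examples/Notebook \
  --param BUILDER_IMAGE=jupyter/minimal-notebook:latest
```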

See the ``openshift`` examples directory for further information on customizing configuration for a Jupyter Notebook deployment and deleting a deployment.
40 changes: 40 additions & 0 deletions examples/source-to-image/assemble
@@ -0,0 +1,40 @@
#!/bin/bash

set -x

set -eo pipefail

# Remove any 'environment.yml' or 'requirements.txt' files which may
# have been carried over from the base image so we don't reinstall
# packages which have already been installed. This could occur where
# an S2I build was used to create a new base image with pre-installed
# Python packages, with the new image then subsequently being used as a
# S2I builder base image.

rm -f /home/$NB_USER/environment.yml
rm -f /home/$NB_USER/requirements.txt

# Copy injected files to target directory.

cp -Rf /tmp/src/. /home/$NB_USER

rm -rf /tmp/src

# Install any Python modules. If we find an 'environment.yml' file we
# assume we should use 'conda' to install packages. If 'requirements.txt'
# use 'pip' instead.

if [ -f /home/$NB_USER/environment.yml ]; then
    conda env update --name root --file /home/$NB_USER/environment.yml
    conda clean -tipsy
else
    if [ -f /home/$NB_USER/requirements.txt ]; then
        pip --no-cache-dir install -r /home/$NB_USER/requirements.txt
    fi
fi

# Fix up permissions on home directory and Python installation so that
# everything is still writable by 'users' group.

fix-permissions $CONDA_DIR
fix-permissions /home/$NB_USER
5 changes: 5 additions & 0 deletions examples/source-to-image/run
@@ -0,0 +1,5 @@
#!/bin/bash

# Start up the notebook instance.

exec start-notebook.sh "$@"
3 changes: 3 additions & 0 deletions examples/source-to-image/save-artifacts
@@ -0,0 +1,3 @@
#!/bin/bash

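# Output an empty tar archive so that incremental builds have no saved artifacts to restore.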
tar cf - --files-from /dev/null