Tuesday, August 7, 2018

JupyterHub Setup (CentOS7)


We are going to use Anaconda to manage Python & jupyterhub packages.

Install Anaconda

wget https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh
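sudo bash Anaconda3-5.0.1-Linux-x86_64.sh
When the installer prompts for the install location, enter /opt/conda; the rest of this guide assumes that prefix.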

Install Jupyterhub Prerequisites

sudo yum install epel-release.noarch
sudo yum install nodejs npm
sudo npm install -g configurable-http-proxy
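Before moving on, it can help to confirm that the prerequisites are actually on the PATH. The following is a minimal sketch; missing_cmds is a hypothetical helper, and the binary names in the usage comment (node, npm, configurable-http-proxy) are assumptions about what the packages above install.

```shell
# Print every command from the argument list that cannot be found on PATH.
# missing_cmds is a hypothetical helper, not part of JupyterHub.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Usage (assumed binary names); no output means everything was found:
# missing_cmds node npm configurable-http-proxy
```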
We are going to install the JupyterHub OAuth package (oauthenticator) because we will integrate authentication with GitHub OAuth.

Install Jupyterhub

/opt/conda/bin/pip install oauthenticator
/opt/conda/bin/pip install --upgrade notebook
Generate a sample configuration file, then append the authenticator class, the authorized users and the admin user:
mkdir -p /opt/jupyterhub
/opt/conda/bin/jupyterhub --generate-config -f /opt/jupyterhub/jupyterhub_config.py
cat >> /opt/jupyterhub/jupyterhub_config.py << EOF
from oauthenticator.github import GitHubOAuthenticator
c.JupyterHub.authenticator_class = GitHubOAuthenticator
c.Authenticator.whitelist = {'wuf', 'fwu'}
c.Authenticator.admin_users = {'wuf'}
EOF
Create a startup script that includes all the custom configuration required for our jupyterhub. Note that the OAUTH & GITHUB settings are required to integrate with our Enterprise GitHub. The heredoc delimiter is quoted so that $PATH is expanded when the script runs, not when it is written.
mkdir -p /opt/jupyterhub
cat > /opt/jupyterhub/jupyterhub.sh << 'EOF'
#!/bin/bash
export PATH="/opt/conda/bin:$PATH"
export OAUTH_CLIENT_ID=********************
export OAUTH_CLIENT_SECRET=*****************************************
export OAUTH_CALLBACK_URL=http://hostname:8000/hub/oauth_callback
export GITHUB_HOST=github.internal.server
export GITHUB_HTTP=true
/opt/conda/bin/jupyterhub --ip hostname -f /opt/jupyterhub/jupyterhub_config.py
EOF
chmod +x /opt/jupyterhub/jupyterhub.sh
GitHub OAUTH is configured by your internal GitHub site admin
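The startup script relies on several exported variables, and a missing one may only show up later as a failed login. A minimal sketch of a guard that could be placed in jupyterhub.sh before the final jupyterhub line; require_env is a hypothetical helper name, not part of JupyterHub or oauthenticator.

```shell
# Return non-zero and name the first variable that is unset or empty.
# require_env is a hypothetical helper; pass only trusted variable names,
# because the name is expanded through eval.
require_env() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required variable: $name" >&2
            return 1
        fi
    done
}

# Usage inside the startup script:
# require_env OAUTH_CLIENT_ID OAUTH_CLIENT_SECRET OAUTH_CALLBACK_URL GITHUB_HOST || exit 1
```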
Users need to be created on the local system:
groupadd -g 500 users
useradd -m -s /bin/bash -u 1000 -g 500 wuf
useradd -m -s /bin/bash -u 1001 -g 500 fwu
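For a longer list of users, the useradd lines above can be generated instead of typed. This is only a sketch: print_useradd_cmds is a hypothetical helper that prints the commands with sequential UIDs and does not run them, so the output can be reviewed first.

```shell
# print_useradd_cmds GID FIRST_UID USER...
# Hypothetical helper: prints one useradd command per user, assigning
# UIDs sequentially from FIRST_UID. Nothing is executed.
print_useradd_cmds() {
    gid=$1
    uid=$2
    shift 2
    for user in "$@"; do
        echo "useradd -m -s /bin/bash -u $uid -g $gid $user"
        uid=$((uid + 1))
    done
}

# Usage:
# print_useradd_cmds 500 1000 wuf fwu
```

After review, the printed commands can be run as root, for example by piping the output to sudo sh.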
The available jupyterhub kernels can be listed and validated with the following command. By default the kernels are loaded from /opt/conda/share/jupyter/kernels and /usr/local/share/jupyter/kernels:
jupyter kernelspec list
We will place all our custom kernels here:
mkdir -p /usr/local/share/jupyter/kernels
Conda/Anaconda comes with python 3; we can install python 2.7 if needed.

Install python 2.7 environment

/opt/conda/bin/conda create -n py27 python=2.7 anaconda

Activate the python 2.7 environment

source activate /opt/conda/envs/py27
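To make the py27 environment selectable from notebooks, a kernel spec can be placed in the custom kernel directory. The sketch below writes a kernel.json by hand; write_kernelspec is a hypothetical helper, and it assumes ipykernel is available in the py27 environment (the anaconda metapackage normally includes it).

```shell
# write_kernelspec SPEC_DIR PYTHON_BIN DISPLAY_NAME
# Hypothetical helper: creates SPEC_DIR and writes a kernel.json that
# launches ipykernel with the given interpreter.
write_kernelspec() {
    spec_dir=$1
    python_bin=$2
    display_name=$3
    mkdir -p "$spec_dir" || return 1
    cat > "$spec_dir/kernel.json" << EOF
{
  "argv": ["$python_bin", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "display_name": "$display_name",
  "language": "python"
}
EOF
}

# Usage (run as root so /usr/local/share is writable):
# write_kernelspec /usr/local/share/jupyter/kernels/py27 /opt/conda/envs/py27/bin/python "Python 2.7"
```

Afterwards, jupyter kernelspec list should show the new entry. An alternative is to let ipykernel write the spec itself with /opt/conda/envs/py27/bin/python -m ipykernel install --prefix=/usr/local --name py27.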

Install additional python packages

# install packages in python3
# using conda
/opt/conda/bin/conda install pandas numpy matplotlib seaborn requests tabulate six future xgboost
# using pip
/opt/conda/bin/pip install hyperopt

# install packages in python2.7
source activate /opt/conda/envs/py27
# using conda
/opt/conda/bin/conda install -n py27 pandas numpy matplotlib seaborn requests tabulate six future xgboost
# using pip
/opt/conda/envs/py27/bin/pip install hyperopt

Install jupyterhub notebook extension

/opt/conda/bin/conda install -c conda-forge jupyter_contrib_nbextensions
/opt/conda/bin/jupyter contrib nbextension install --system

/opt/conda/bin/conda clean --all

Install supervisor to manage jupyterhub start/stop

The PATH export below is required so that the supervisor packages are installed by the system-managed python instead of the python managed by conda:
sudo yum install supervisor
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
sudo easy_install superlance
exec bash
cat > /etc/supervisord.d/jupyterhub.ini << EOF
[program:jupyterhub]
command=/opt/jupyterhub/jupyterhub.sh               ; the program (relative uses PATH, can take args)
process_name=%(program_name)s  ; process_name expr (default %(program_name)s)
numprocs=1                     ; number of processes copies to start (def 1)
directory=/opt/jupyterhub        ; directory to cwd to before exec (def no cwd)
priority=1                     ; the relative start priority (default 999)
startsecs=30                    ; number of secs prog must stay running (def. 1)
redirect_stderr=true          ; redirect proc stderr to stdout (default false)
stdout_logfile=/var/log/%(program_name)s-stdout.log        ; stdout log path, NONE for none; default AUTO

[eventlistener:crashmailbatch]
command=crashmailbatch -t alert@domain.com -f supervisord@%(host_node_name)s -s "jupyterhub crashed on %(host_node_name)s"
events=PROCESS_STATE,TICK_60
EOF

JupyterHub start/stop/restart

supervisorctl start jupyterhub
supervisorctl stop jupyterhub
supervisorctl restart jupyterhub
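supervisord only picks up /etc/supervisord.d/jupyterhub.ini once it is told to reload its configuration, so the new program may be unknown to supervisorctl start until then. A sketch assuming supervisord is already running; reload_jupyterhub is a hypothetical wrapper around the standard reread, update and status subcommands.

```shell
# Hypothetical wrapper: re-read the config files, apply any changes,
# then report the state of the jupyterhub program. Stops at the first
# failing step.
reload_jupyterhub() {
    supervisorctl reread &&
    supervisorctl update &&
    supervisorctl status jupyterhub
}

# Usage (as root):
# reload_jupyterhub
```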
