Tuesday, January 16, 2018

Docker Registry on Cloudian S3

This guide shows how to create a local Docker registry backed by internal Cloudian S3 storage.

Create an s3-registry-config.yaml config file for the Docker registry, such as the sample below.

  • region, regionendpoint, v4auth, and secure are critical values; set them exactly as in the sample for the registry to work with Element's internal Cloudian S3.
  • Setting rootdirectory to / stores the data under the following path in the S3 bucket: $bucket/docker/registry/v2/
    version: 0.1
    log:
      fields:
        service: registry
    storage:
      s3:
        accesskey: **************************
        secretkey: **************************************************************
        region: region1
        regionendpoint: http://s3internal.servername
        bucket: dockerreg
        rootdirectory: /
        v4auth: true
        secure: false
      cache:
        blobdescriptor: inmemory
    http:
      addr: :5000
      headers:
        X-Content-Type-Options: [nosniff]
    health:
      storagedriver:
        enabled: true
        interval: 10s
        threshold: 3
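
As an alternative to mounting a config file, the registry image also reads configuration from environment variables, named REGISTRY_ followed by the config path upper-cased and joined with underscores. A sketch using the same sample values as the YAML above:

```shell
# Same S3 settings as the YAML config, expressed as registry env-var
# overrides (REGISTRY_ + config path, upper-cased, underscore-separated).
export REGISTRY_STORAGE_S3_REGION=region1
export REGISTRY_STORAGE_S3_REGIONENDPOINT=http://s3internal.servername
export REGISTRY_STORAGE_S3_BUCKET=dockerreg
export REGISTRY_STORAGE_S3_ROOTDIRECTORY=/
export REGISTRY_STORAGE_S3_V4AUTH=true
export REGISTRY_STORAGE_S3_SECURE=false
```

These would be passed to `docker run` with `-e` flags instead of the `-v` config-file mount shown below.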
    

    Start a registry container. Note: run the following command from the directory where the above configuration file is located.

    docker run -d -p 5000:5000 --restart=always --name registry -v `pwd`/s3-registry-config.yaml:/etc/docker/registry/config.yml registry:2
    

    test the local docker registry

    # Pull the ubuntu:16.04 image from Docker Hub.
    docker pull ubuntu:16.04
    
    # Tag the image as localhost:5000/my-ubuntu. This creates an additional tag for the existing image. When the 
    # first part of the tag is a hostname and port, Docker interprets this as the location of a registry, when 
    # pushing.
    docker tag ubuntu:16.04 localhost:5000/my-ubuntu
    
    # Push the image to the local registry running at localhost:5000:
    docker push localhost:5000/my-ubuntu
    
    # Remove the locally-cached ubuntu:16.04 and localhost:5000/my-ubuntu images, so that you can test pulling
    # the image from your registry. This does not remove the localhost:5000/my-ubuntu image from your registry.
    docker image remove ubuntu:16.04
    docker image remove localhost:5000/my-ubuntu
    
    # Pull the localhost:5000/my-ubuntu image from your local registry.
    docker pull localhost:5000/my-ubuntu
    
    # Validate that the image data is now in the S3 bucket backing the registry
    aws s3 ls --recursive s3://dockerreg/ --endpoint-url http://s3internal.servername
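
With rootdirectory set to /, the registry lays repositories out under a fixed prefix, so the aws s3 ls above should show keys under a predictable path; a quick sketch of that prefix for the pushed image:

```shell
# Build the expected S3 key prefix for the pushed repo, given the
# bucket name and rootdirectory "/" from the sample config.
bucket=dockerreg
repo=my-ubuntu
prefix="s3://${bucket}/docker/registry/v2/repositories/${repo}/"
echo "$prefix"
# → s3://dockerreg/docker/registry/v2/repositories/my-ubuntu/
```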

Sunday, January 14, 2018

Hadoop HDFS cluster

start master

as cassandra user:
/opt/hadoop/sbin/start-all.sh
as root user:
su - cassandra -c "/opt/hadoop/sbin/start-all.sh"

stop master

as cassandra user:
/opt/hadoop/sbin/stop-all.sh

start data node

as cassandra user:
/opt/hadoop/sbin/hadoop-daemon.sh start datanode
/opt/hadoop/sbin/start-balancer.sh
as root user:
su - cassandra -c "/opt/hadoop/sbin/hadoop-daemon.sh start datanode"
su - cassandra -c "/opt/hadoop/sbin/start-balancer.sh"

stop data node

as cassandra user:
/opt/hadoop/sbin/hadoop-daemon.sh stop datanode
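
The datanode start/stop commands above could also be managed by systemd instead of being run by hand; a minimal unit sketch, assuming the /opt/hadoop paths and the cassandra user from this post (the unit name is ours):

```ini
[Unit]
Description=Hadoop HDFS DataNode
After=network.target

[Service]
Type=forking
User=cassandra
ExecStart=/opt/hadoop/sbin/hadoop-daemon.sh start datanode
ExecStop=/opt/hadoop/sbin/hadoop-daemon.sh stop datanode

[Install]
WantedBy=multi-user.target
```

Saved as, say, /etc/systemd/system/hdfs-datanode.service, it would be enabled with systemctl daemon-reload && systemctl enable --now hdfs-datanode.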

Tuesday, December 12, 2017

Derby Network Server & supervisord integration

Download & Install Derby

Download Derby for Java 8 and higher, such as derby-10.14.1.0, from https://db.apache.org/derby/derby_downloads.html and untar the distribution under /opt/derby.

Integrate with supervisord

cat >> /etc/supervisord.d/derby.ini << EOF
[program:derby]
command=/opt/derby/bin/startNetworkServer -h 0.0.0.0 ; the program (relative uses PATH, can take args)
process_name=%(program_name)s  ; process_name expr (default %(program_name)s)
numprocs=1                     ; number of processes copies to start (def 1)
directory=/opt/derby           ; directory to cwd to before exec (def no cwd)
priority=1                     ; the relative start priority (default 999)
startsecs=10                    ; number of secs prog must stay running (def. 1)
user=spark                     ; setuid to this UNIX account to run the program
redirect_stderr=true          ; redirect proc stderr to stdout (default false)
stdout_logfile=/var/log/%(program_name)s-stdout.log        ; stdout log path, NONE for none; default AUTO
EOF
supervisorctl reread
supervisorctl update

Friday, December 8, 2017

Konga Installation

Konga

Installation

Install npm and node.js.

sudo yum install nodejs npm

Install the bower, gulp and sails packages.

npm install bower gulp sails -g
mkdir -p /opt/konga
cd /opt
git clone https://github.com/pantsel/konga.git
cd konga
npm install
npm run bower-deps
cd config
cp -p local_example.js local.js

If Kong is running on a remote server and/or on a non-default port, the following needs to be changed in local.js:

kong_admin_url : process.env.KONG_ADMIN_URL || 'http://127.0.0.1:8001'
To test that the Kong admin URL is up and accessible: curl http://127.0.0.1:8001
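
If Kong takes a moment to come up, a small retry helper can poll the admin URL before Konga is started; a sketch (the function name is ours, not part of Kong):

```shell
# wait_for: retry a command until it succeeds or attempts run out,
# e.g.  wait_for curl -fsS http://127.0.0.1:8001
wait_for() {
  tries=10
  while ! "$@" >/dev/null 2>&1; do
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] || return 1
    sleep 1
  done
}
```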

Running Konga

Development

cd /opt/konga
nohup npm start &
Konga GUI will be available at http://hostname:1338

Production

cd /opt/konga
nohup npm run production &
Konga GUI will be available at http://hostname:1338

Kong Installation

The following instructions are for CentOS 7 only.

Install Kong

rpm -ivh https://bintray.com/kong/kong-community-edition-rpm/download_file?file_path=dists/kong-community-edition-0.11.1.el7.noarch.rpm
yum install epel-release
yum install kong-community-edition

Install & prepare Database

Kong needs a database and supports both PostgreSQL 9.4+ and Cassandra 3.x as its datastore. In our POC we have tested it with Postgres 9.6.
yum install https://download.postgresql.org/pub/repos/yum/9.6/redhat/rhel-7-x86_64/pgdg-centos96-9.6-3.noarch.rpm
yum install postgresql96-server.x86_64
yum install postgresql96
/usr/pgsql-9.6/bin/postgresql96-setup initdb
systemctl enable postgresql-9.6
systemctl start postgresql-9.6

su - postgres
psql
CREATE USER kong; CREATE DATABASE kong OWNER kong;
Now, run the Kong migrations:
sudo kong migrations up [-c /path/to/kong.conf]

Run Kong on port 80 for HTTP

Edit /etc/kong/kong.conf, changing proxy_listen = 0.0.0.0:8000 to proxy_listen = 0.0.0.0:80:
sed -i "s/proxy_listen = 0.0.0.0:8000/proxy_listen = 0.0.0.0:80/" /etc/kong/kong.conf
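
To rehearse the substitution before touching the real /etc/kong/kong.conf, the same sed can be run against a scratch copy first:

```shell
# Try the proxy_listen edit on a temporary file before editing the real config
conf=$(mktemp)
printf 'proxy_listen = 0.0.0.0:8000\n' > "$conf"
sed -i 's/proxy_listen = 0.0.0.0:8000/proxy_listen = 0.0.0.0:80/' "$conf"
result=$(cat "$conf")
echo "$result"   # → proxy_listen = 0.0.0.0:80
rm -f "$conf"
```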

Start Kong

kong start

Increase nofile

The default file descriptor limit in CentOS 7 is 1024; to increase this value to allow more connections to Kong:
cat >> /etc/security/limits.conf << EOF
*       soft    nofile  32767
*       hard    nofile  32767
EOF
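
limits.conf changes only apply to new login sessions; after re-logging in, the effective limits can be confirmed with ulimit:

```shell
# Show the current soft and hard open-file limits for this shell;
# a fresh login session is needed before limits.conf changes take effect.
soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
echo "soft=$soft hard=$hard"
```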

Friday, November 10, 2017

JupyterHub & Spark Setup

Install Java

Java is required to run pyspark

Docker

The Docker container is based on Debian jessie; jessie-backports needs to be added to install OpenJDK 8:
echo "deb http://cdn-fastly.deb.debian.org/debian jessie-backports main" >> /etc/apt/sources.list
apt-get update -y
Please note the -t jessie-backports option
apt-get install -t jessie-backports openjdk-8-jdk -y

Download & Install Spark

download spark 2.2.0 pre-built for hadoop 2.7 [https://www.apache.org/dyn/closer.lua/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz] untar spark distribution under /opt/spark

Configure Spark

Configure the following environment variables in /opt/spark/conf/spark-env.sh. For example:
SPARK_LOCAL_IP=192.168.1.11
SPARK_WORKER_MEMORY=4g
SPARK_WORKER_CORES=2
Make the default Spark RDD directory group-writable:
mkdir -p /var/lib/spark/rdd
chown -R spark:spark /var/lib/spark/rdd
chmod -R g+w /var/lib/spark/rdd

chmod group read/writable on a deep sub-directory under home

In rare cases you may need to grant group read and write access to a multi-level sub-directory under your home directory; note that this is not recommended from a security perspective. Also note that making your home directory group-readable will break passwordless ssh.

cd ; p="/home/runwuf/project/test/p1/runwuf" ; while [ "$p" != "/home" ] ; do chmod g+rw "$p" ; p=`echo "$p" | rev | cut -f2- -d"/" | rev` ; done
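
The same upward walk can be written more readably with dirname; a sketch that echoes each path instead of chmod-ing it, so it is safe to try:

```shell
# Walk from a deep directory up to (but not including) /home,
# printing each path; swap echo for `chmod g+rw` for the real run.
p="/home/runwuf/project/test/p1/runwuf"   # example path from the post
while [ "$p" != "/home" ]; do
  echo "$p"
  p=$(dirname "$p")
done
```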