Monday, June 25, 2018

Jupyter Kernel & DataStax Enterprise Spark Integration

Install DSE from tarball
tar xpvf dse-4.8.15-bin.tar.gz
These Hadoop config files need to be copied from the target DSE environment into
$DSE_HOME/resources/hadoop/conf
dse-mapred-default.xml
dse-core-default.xml
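A minimal sketch of staging those two files into place. The scratch directories stand in for the real locations (in production DSE_HOME is /opt/dse4 and the source files would come from the target DSE node, e.g. via scp); the paths here are illustrative only:

```shell
# Illustrative run against scratch directories; in production DSE_HOME=/opt/dse4
# and SRC_CONF is the conf dir fetched from the target DSE node (e.g. via scp).
DSE_HOME=$(mktemp -d)/dse4
DEST_CONF="$DSE_HOME/resources/hadoop/conf"
SRC_CONF=$(mktemp -d)            # stand-in for the target node's conf dir
touch "$SRC_CONF/dse-mapred-default.xml" "$SRC_CONF/dse-core-default.xml"

mkdir -p "$DEST_CONF"
for f in dse-mapred-default.xml dse-core-default.xml; do
  cp "$SRC_CONF/$f" "$DEST_CONF/"
done
ls "$DEST_CONF"
```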
Create the following temp work directories for Spark.
Note: all users who use the notebook must have write permission to var/lib/spark; group-writable permissions are recommended.
mkdir -p /opt/dse4/var/lib/spark/worker
mkdir -p /opt/dse4/var/lib/spark/rdd
chown -R root:fleet /opt/dse4/var/lib/spark
chmod g+w -R /opt/dse4/var/lib/spark
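The group-write requirement above is easy to get wrong, so it is worth verifying after the fact. A sketch against a scratch prefix (in production the prefix is /opt/dse4/var/lib/spark and ownership is set with the chown -R root:fleet shown above):

```shell
# Sketch of the directory setup against a scratch prefix; in production the
# prefix is /opt/dse4/var/lib/spark (the chown to root:fleet is skipped here
# because it needs root and the fleet group).
PREFIX=$(mktemp -d)/var/lib/spark

mkdir -p "$PREFIX/worker" "$PREFIX/rdd"
chmod -R g+w "$PREFIX"

# Verify: list anything under the tree that is NOT group-writable.
# Empty output means every notebook user in the group can write.
find "$PREFIX" ! -perm -g+w
```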
The following environment variables need to be set in the Spark config under the DSE home:
/opt/dse4/resources/spark/conf/spark-env.sh
export SPARK_WORKER_DIR="/opt/dse4/var/lib/spark/worker"
export SPARK_LOCAL_DIRS="/opt/dse4/var/lib/spark/rdd"
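With DSE configured, the Jupyter side needs a kernel spec that points PySpark at DSE's bundled Spark. The following is a sketch only: the kernel name dse-pyspark, the master URL (dse-node1 is a placeholder hostname), and the py4j zip name are all assumptions — check $DSE_HOME/resources/spark/python/lib for the actual py4j archive shipped with your DSE version:

```shell
# Sketch of a Jupyter kernel spec for PySpark against DSE's bundled Spark.
# The paths, master URL, and py4j zip name below are placeholders; adjust
# them to match your DSE install before using this.
KERNEL_DIR="$HOME/.local/share/jupyter/kernels/dse-pyspark"
mkdir -p "$KERNEL_DIR"

cat > "$KERNEL_DIR/kernel.json" <<'EOF'
{
  "display_name": "DSE PySpark",
  "language": "python",
  "argv": ["python", "-m", "ipykernel", "-f", "{connection_file}"],
  "env": {
    "SPARK_HOME": "/opt/dse4/resources/spark",
    "PYTHONPATH": "/opt/dse4/resources/spark/python:/opt/dse4/resources/spark/python/lib/py4j-src.zip",
    "PYSPARK_SUBMIT_ARGS": "--master spark://dse-node1:7077 pyspark-shell"
  }
}
EOF
```

Once the spec is in place, the kernel shows up as "DSE PySpark" in the notebook's kernel picker.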
