Prediction (5) Cluster Troubleshooting
I ran into some issues on a local Zeppelin setup with Spark 1.5.0, Zeppelin 0.6.0, and Hadoop 2.7.1.
It may be a memory issue, but configuring Zeppelin as follows did not help:
export MASTER="yarn-client"
export HADOOP_CONF_DIR="/opt/hadoop/etc/hadoop/"
export SPARK_HOME="/opt/spark"
. ${SPARK_HOME}/conf/spark-env.sh
export ZEPPELIN_CLASSPATH="${SPARK_CLASSPATH}"
export ZEPPELIN_JAVA_OPTS="-Dspark.yarn.driver.memoryOverhead=512 -Dspark.yarn.executor.memoryOverhead=512 -Dspark.akka.frameSize=100 -Dspark.executor.instances=2 -Dspark.driver.memory=3g -Dspark.storage.memoryFraction=0.7 -Dspark.core.connection.ack.wait.timeout=800 -Dspark.rdd.compress=true -Dspark.default.parallelism=18 -Dspark.executor.memory=3g"
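Since memory is the suspect, it can help to estimate how much YARN memory these settings ask for. The sketch below is only a rough estimate. `container_mb` is a hypothetical helper, not part of Spark or YARN. It assumes the scheduler rounds each request up to a multiple of `yarn.scheduler.minimum-allocation-mb`, and that this is left at the Hadoop default of 1024 MB; check `yarn-site.xml` for the real value.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark or YARN): estimate the size of the
# YARN container requested for one executor.
#   $1 = spark.executor.memory in MB
#   $2 = spark.yarn.executor.memoryOverhead in MB
#   $3 = yarn.scheduler.minimum-allocation-mb (assumed value)
container_mb() {
  requested=$(( $1 + $2 ))
  # Round up to the next multiple of the minimum allocation (assumed behaviour).
  echo $(( (requested + $3 - 1) / $3 * $3 ))
}

# Values taken from ZEPPELIN_JAVA_OPTS: 3g executors, 512 MB overhead, 2 executors.
per_executor=$(container_mb 3072 512 1024)
total=$(( per_executor * 2 ))
echo "per executor: ${per_executor} MB, all executors: ${total} MB"
```

Under these assumptions each executor needs a 4096 MB container, so the two executors together need 8192 MB of NodeManager memory. If the nodes offer less than that, the executors will not all start.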
Spark is configured as follows:
export SPARK_DAEMON_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70"
export HADOOP_CONF_DIR="/opt/hadoop/etc/hadoop"
#export SPARK_WORKER_MEMORY=1024m
#export SPARK_JAVA_OPTS="-Dbuild.env=lmm.sparkvm"
export USER=carl
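Both configurations depend on `HADOOP_CONF_DIR` pointing at a real Hadoop client configuration, so a quick sanity check before starting Zeppelin may save some time. This is a minimal sketch. `check_hadoop_conf` is a hypothetical helper, and checking only `core-site.xml` and `yarn-site.xml` is an assumption about what matters for a `yarn-client` setup.

```shell
#!/bin/sh
# Hypothetical helper: verify that a directory looks like a Hadoop conf dir.
#   $1 = directory to check
check_hadoop_conf() {
  dir=$1
  for f in core-site.xml yarn-site.xml; do
    # Report the first missing file and stop.
    [ -f "$dir/$f" ] || { echo "missing: $dir/$f"; return 1; }
  done
  echo "ok: $dir"
}

# Fall back to the path used in the configuration above if the variable is unset.
check_hadoop_conf "${HADOOP_CONF_DIR:-/opt/hadoop/etc/hadoop}" \
  || echo "fix HADOOP_CONF_DIR before starting Zeppelin"
```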
Install phantomjs
> sudo apt-get install phantomjs
Build Zeppelin Again
> mvn clean package -Pspark-1.5 -Dspark.version=1.5.0 -Dhadoop.version=2.7.1 -Phadoop-2.6 -Pyarn -DskipTests
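When rebuilding for a different Spark or Hadoop version, it is easy to update the version property and forget the matching profile. The sketch below builds the same command from its parts. `zeppelin_build_cmd` is a hypothetical helper that only prints the command. It assumes the Spark profile is always named `spark-<major.minor>`, as in the command above.

```shell
#!/bin/sh
# Hypothetical helper: print the Maven command for a Zeppelin build.
#   $1 = Spark version (e.g. 1.5.0)
#   $2 = Hadoop version (e.g. 2.7.1)
#   $3 = Hadoop profile name (e.g. hadoop-2.6)
zeppelin_build_cmd() {
  # Strip the patch level: 1.5.0 -> 1.5 (assumed profile naming).
  spark_profile="spark-${1%.*}"
  echo "mvn clean package -P${spark_profile} -Dspark.version=$1 -Dhadoop.version=$2 -P$3 -Pyarn -DskipTests"
}

# Reproduces the command used above.
zeppelin_build_cmd 1.5.0 2.7.1 hadoop-2.6
```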
I just set up ubuntu-pilot to run Zeppelin and Spark. ubuntu-master, ubuntu-dev1, and ubuntu-dev2 form the YARN cluster. Everything works fine now.