Spark 2.3.1 Standalone Cluster

1. Download Spark 2.3.1

Download page: http://spark.apache.org/downloads.html

2. Install Spark 2.3.1

   Upload the tarball to the /usr/spark directory.

   Extract it there:

  

tar -zxvf spark-2.3.1-bin-hadoop2.7.tgz

3. Edit the /etc/hosts file as follows (entries for the other workers are sketched after the snippet):

vim /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.2.185 sky1
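
With the four machines used in step 4, every node's /etc/hosts would also need entries for the other workers. Only 192.168.2.185 for sky1 comes from this setup; the addresses below are placeholders to adjust to your own network:

192.168.2.186 sky2
192.168.2.187 sky3
192.168.2.188 sky4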

 Edit the /etc/sysconfig/network file as follows (a note on newer systems follows the snippet):

vim /etc/sysconfig/network

NETWORKING=yes
HOSTNAME=sky1
GATEWAY=192.168.2.1
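
Note: setting HOSTNAME in /etc/sysconfig/network is the RHEL/CentOS 6 way. On CentOS 7 and other systemd-based systems the equivalent (an assumption about the target OS) would be:

hostnamectl set-hostname sky1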

4. Edit the Spark configuration files (a four-machine cluster is used as the example; a note on creating these files from their templates follows the snippets)

   conf/slaves

vim conf/slaves
sky1
sky2
sky3
sky4

   conf/spark-env.sh

vim conf/spark-env.sh
export JAVA_HOME=/usr/java/jdk
export SPARK_MASTER_HOST=sky1
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=1
export SPARK_WORKER_MEMORY=1g
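
If conf/slaves and conf/spark-env.sh do not exist yet, they can be created from the templates shipped in conf/ (a convenience step not spelled out above):

cp conf/slaves.template conf/slaves
cp conf/spark-env.sh.template conf/spark-env.sh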

5. After the configuration is done, copy the Spark directory to the other machines (sky2 to sky4; the remaining copies are sketched after the first command):

scp -r /usr/spark/spark-2.3.1-bin-hadoop2.7 root@sky2:/usr/spark
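
The same copy is repeated for the remaining workers, e.g.:

scp -r /usr/spark/spark-2.3.1-bin-hadoop2.7 root@sky3:/usr/spark
scp -r /usr/spark/spark-2.3.1-bin-hadoop2.7 root@sky4:/usr/spark

Note also that sbin/start-all.sh launches the workers over SSH, so passwordless SSH from sky1 to every worker should be set up before step 6, e.g. (a sketch):

ssh-keygen -t rsa
ssh-copy-id root@sky2    # repeat for sky3 and sky4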

6. Start Spark

Before starting, make sure the firewall is turned off (service iptables stop).
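(service iptables stop applies to CentOS 6; on a CentOS 7 host, assuming that OS, the firewall is managed by firewalld and the equivalent would be systemctl stop firewalld.)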

./sbin/start-all.sh

Other launch scripts (see http://spark.apache.org/docs/latest/spark-standalone.html); a manual startup example follows the list:

sbin/start-master.sh - Starts a master instance on the machine the script is executed on.
sbin/start-slaves.sh - Starts a slave instance on each machine specified in the conf/slaves file.
sbin/start-slave.sh - Starts a slave instance on the machine the script is executed on.
sbin/start-all.sh - Starts both a master and a number of slaves as described above.
sbin/stop-master.sh - Stops the master that was started via the sbin/start-master.sh script.
sbin/stop-slaves.sh - Stops all slave instances on the machines specified in the conf/slaves file.
sbin/stop-all.sh - Stops both the master and the slaves as described above.
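
For example, instead of start-all.sh the cluster can be brought up piece by piece (a sketch, run from the Spark install directory on each node):

# on sky1
./sbin/start-master.sh
# on each worker, pointing it at the master (or run ./sbin/start-slaves.sh on sky1 instead)
./sbin/start-slave.sh spark://sky1:7077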

  7. Check that the cluster came up:

    http://sky1:8080/ (or the master's IP) — the Spark master web UI

   netstat -antlp — check which ports Spark is listening on
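
For example, the master's RPC port (7077) and web UI port (8080) should show up as LISTENing on sky1:

netstat -antlp | grep -E '7077|8080'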

   

  8. Test the cluster (http://spark.apache.org/docs/latest/submitting-applications.html)

     ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://sky1:7077 examples/jars/spark-examples_2.11-2.3.1.jar  10000
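
If the job succeeds, the driver output should end with an estimate of pi (a line such as "Pi is roughly 3.14..."), and the finished application will also appear under Completed Applications in the web UI.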

   Other submission examples from the Spark docs:

# Run application locally on 8 cores
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master local[8] \
  /path/to/examples.jar \
  100

# Run on a Spark standalone cluster in client deploy mode
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

# Run on a Spark standalone cluster in cluster deploy mode with supervise
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

# Run on a YARN cluster (use --deploy-mode client for client mode)
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000

# Run a Python application on a Spark standalone cluster
./bin/spark-submit \
  --master spark://207.184.161.138:7077 \
  examples/src/main/python/pi.py \
  1000

# Run on a Mesos cluster in cluster deploy mode with supervise
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  http://path/to/examples.jar \
  1000

# Run on a Kubernetes cluster in cluster deploy mode
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://xx.yy.zz.ww:443 \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  http://path/to/examples.jar \
  1000