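Spark startup flow (Standalone): analysis
1. The start-all.sh script in effect runs java -cp ... Master and java -cp ... Worker.
2. On startup, Master first creates an RpcEnv object, which manages all communication logic.
3. Master uses the RpcEnv object to create an Endpoint. Master itself is an Endpoint, and Workers can communicate with it.
4. Worker likewise creates an RpcEnv object on startup.
5. Worker uses its RpcEnv object to create an Endpoint.
6. Worker uses its RpcEnv object to connect to Master and obtains an RpcEndpointRef object, through which it can communicate with Master.
7. Worker registers with Master. The registration includes its hostname, port, number of CPU cores, and amount of memory.
8. Master receives the Worker's registration and keeps the information in an in-memory table, which also holds an RpcEndpointRef reference to that Worker.
9. Master replies to the Worker, confirming that the registration was received and succeeded.
10. After receiving the registration-success response, Worker starts sending periodic heartbeats to Master.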
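The bookkeeping in steps 7 to 10 can be modelled with a small bash sketch. This is a toy model, not Spark code: the function names, the table layout, and the reply strings are all hypothetical, and a real Master stores an RpcEndpointRef per Worker rather than a plain string. It assumes bash 4 or later for associative arrays.

```shell
#!/usr/bin/env bash
# Toy model of the Master's in-memory worker table (hypothetical names,
# not Spark's implementation). Requires bash 4+ for associative arrays.
declare -A WORKERS=()          # worker id -> "host:port cores=N memory=NMB"
declare -A LAST_HEARTBEAT=()   # worker id -> epoch seconds of last heartbeat

# Steps 7-9: store the registration and reply to the worker.
register_worker() {
  local id="$1" host="$2" port="$3" cores="$4" mem_mb="$5"
  if [[ -n "${WORKERS[$id]+set}" ]]; then
    echo "registration failed: duplicate worker id $id"
    return 1
  fi
  WORKERS[$id]="$host:$port cores=$cores memory=${mem_mb}MB"
  LAST_HEARTBEAT[$id]="$(date +%s)"
  echo "registered"
}

# Step 10: a heartbeat refreshes the timestamp of a known worker.
heartbeat() {
  local id="$1"
  if [[ -z "${WORKERS[$id]+set}" ]]; then
    echo "unknown worker $id"
    return 1
  fi
  LAST_HEARTBEAT[$id]="$(date +%s)"
}
```

For example, register_worker w1 hadoop202 7078 4 1024 prints "registered" on the first call, and a second call with the same id is rejected. The host hadoop202 and port 7078 are made-up values for illustration.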
1. start-master.sh: analysis of the Master startup script
start-master.sh
# Resolve SPARK_HOME
if [ -z "${SPARK_HOME}" ]; then
  export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
fi

# NOTE: This exact class name is matched downstream by SparkSubmit.
# Any changes need to be reflected there.
CLASS="org.apache.spark.deploy.master.Master"

if [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
  echo "Usage: ./sbin/start-master.sh [options]"
  pattern="Usage:"
  pattern+="\|Using Spark's default log4j profile:"
  pattern+="\|Registered signal handlers for"

  "${SPARK_HOME}"/bin/spark-class $CLASS --help 2>&1 | grep -v "$pattern" 1>&2
  exit 1
fi

ORIGINAL_ARGS="$@"

. "${SPARK_HOME}/sbin/spark-config.sh"

. "${SPARK_HOME}/bin/load-spark-env.sh"

if [ "$SPARK_MASTER_PORT" = "" ]; then
  SPARK_MASTER_PORT=7077
fi

if [ "$SPARK_MASTER_HOST" = "" ]; then
  case `uname` in
    (SunOS)
      SPARK_MASTER_HOST="`/usr/sbin/check-hostname | awk '{print $NF}'`"
      ;;
    (*)
      SPARK_MASTER_HOST="`hostname -f`"
      ;;
  esac
fi

if [ "$SPARK_MASTER_WEBUI_PORT" = "" ]; then
  SPARK_MASTER_WEBUI_PORT=8080
fi

# Hand off to spark-daemon.sh
"${SPARK_HOME}/sbin"/spark-daemon.sh start $CLASS 1 \
  --host $SPARK_MASTER_HOST --port $SPARK_MASTER_PORT --webui-port $SPARK_MASTER_WEBUI_PORT \
  $ORIGINAL_ARGS
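The default handling in start-master.sh can be condensed into a short sketch. The helper name master_args is hypothetical, and the sketch swaps the script's explicit = "" tests for bash's ${VAR:-default} expansion, which also treats unset and empty the same way. The SunOS branch is left out for brevity.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the arguments start-master.sh would pass on,
# using the same defaults (7077, 8080, `hostname -f`) as the script.
master_args() {
  # `hostname -f` only runs when SPARK_MASTER_HOST is unset or empty.
  local host="${SPARK_MASTER_HOST:-$(hostname -f)}"
  local port="${SPARK_MASTER_PORT:-7077}"
  local webui="${SPARK_MASTER_WEBUI_PORT:-8080}"
  echo "--host $host --port $port --webui-port $webui"
}
```

Calling it with SPARK_MASTER_HOST=hadoop201 and the two ports unset should yield the same arguments that appear in the launch command for hadoop201.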
spark-daemon.sh
...
execute_command() {
  if [ -z ${SPARK_NO_DAEMONIZE+set} ]; then
    # Finally launches Master as a background daemon process
    nohup -- "$@" >> $log 2>&1 < /dev/null &
    newpid="$!"

    echo "$newpid" > "$pid"

    # Poll for up to 5 seconds for the java process to start
    for i in {1..10}
    do
      if [[ $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
        break
      fi
      sleep 0.5
    done

    sleep 2
    # Check if the process has died; in that case we'll tail the log so the user can see
    if [[ ! $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
      echo "failed to launch: $@"
      tail -2 "$log" | sed 's/^/  /'
      echo "full log in $log"
    fi
  else
    "$@"
  fi
}
...
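The ${SPARK_NO_DAEMONIZE+set} test in execute_command is easy to misread: it checks whether the variable is set at all, not whether it is non-empty. A minimal sketch of that switch follows; the function name launch_mode is hypothetical and it only reports which branch would run.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the branch condition in execute_command.
# ${VAR+set} expands to "set" when VAR is set (even to an empty string)
# and to nothing when VAR is unset.
launch_mode() {
  if [ -z "${SPARK_NO_DAEMONIZE+set}" ]; then
    echo "background"   # nohup ... & branch
  else
    echo "foreground"   # run the command directly
  fi
}
```

So exporting SPARK_NO_DAEMONIZE with any value, including an empty one, keeps the Master in the foreground, which is useful when a process supervisor manages it.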
Launch command
/opt/module/spark-standalone/bin/spark-class org.apache.spark.deploy.master.Master --host hadoop201 --port 7077 --webui-port 8080
Command launched by bin/spark-class:
/opt/module/jdk1.8.0_172/bin/java -cp /opt/module/spark-standalone/conf/:/opt/module/spark-standalone/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host hadoop201 --port 7077 --webui-port 8080
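How the pieces of that java command fit together can be shown with a simplified sketch. The function build_master_command is hypothetical: the real bin/spark-class delegates command construction to the org.apache.spark.launcher.Main class, whereas this sketch only concatenates the parts visible in the command above. It assumes the -Xmx value comes from SPARK_DAEMON_MEMORY with a default of 1g.

```shell
#!/usr/bin/env bash
# Hypothetical, simplified reconstruction of the final java command.
# Arguments: <java home> <spark home> [Master options...]
build_master_command() {
  local java_home="$1" spark_home="$2"
  shift 2
  # Assumption: daemon heap size defaults to 1g unless SPARK_DAEMON_MEMORY is set.
  local mem="${SPARK_DAEMON_MEMORY:-1g}"
  # The trailing * is a JVM classpath wildcard; quoting keeps the shell from expanding it.
  echo "$java_home/bin/java -cp $spark_home/conf/:$spark_home/jars/* -Xmx$mem org.apache.spark.deploy.master.Master $*"
}
```

Passing /opt/module/jdk1.8.0_172, /opt/module/spark-standalone, and the --host, --port, --webui-port options reproduces the command line shown above.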