A First Attempt at Deploying a Hadoop Cluster
Installation steps:
1. Plan the machines
2. Set the hostnames, configure passwordless SSH, install the JDK
3. Edit the configuration files and create the directories
4. Start the services
1. Plan the machines (centos1 as master)
Plan for three machines: centos1 as the master, the other two as slaves.
10.240.139.101 centos1
10.240.140.20 centos2
10.240.139.72 centos3
centos1 runs the NameNode, SecondaryNameNode, and ResourceManager
centos2 and centos3 each run a DataNode and a NodeManager
2. Set the hostnames, configure passwordless SSH, install the JDK
Set the hostname on each machine (centos1 shown here; repeat on centos2 and centos3 with the matching name):
[root@centos1 bin]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=centos1
NTPSERVERARGS=iburst
Add the same entries to /etc/hosts on all three machines:
[root@centos1 bin]# vi /etc/hosts
10.240.139.101 centos1
10.240.140.20 centos2
10.240.139.72 centos3
Configure passwordless SSH. Once it works, ssh localhost and ssh centos2 log in without asking for a username or password.
# log in to centos1
cd ~/.ssh
rm ./id_rsa*                 # remove any old keys
ssh-keygen -t rsa            # generate a new key pair; just press Enter at every prompt
cat ./id_rsa.pub >> ./authorized_keys
scp authorized_keys root@centos2:~/.ssh/authorized_keys_from_centos1
# log in to centos2 (repeat the scp and the steps below for centos3 as well)
cd ~/.ssh
cat authorized_keys_from_centos1 >> ./authorized_keys
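If ssh centos2 still prompts for a password after this, sshd is usually rejecting the key because of loose permissions. A quick fix to try on each target machine (these are standard sshd requirements, not specific to this cluster):
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys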
Disable the firewall:
sudo service iptables stop     # stop the firewall service
sudo chkconfig iptables off    # keep it from starting at boot, so it never has to be stopped by hand
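The commands above are for the CentOS 6 used in this walkthrough. For reference, on CentOS 7 the firewall service is firewalld and the equivalent would be:
sudo systemctl stop firewalld       # stop the firewall service
sudo systemctl disable firewalld    # disable it at boot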
JDK installation is skipped here.
3. Edit the configuration files and create the directories
hadoop-env.sh:
Hadoop's environment settings; the JAVA_HOME variable must be set here (see the sketch after this list)
yarn-env.sh:
YARN's environment settings; JAVA_HOME must be set here as well
core-site.xml:
Hadoop's global default parameters
hdfs-site.xml:
HDFS parameters
yarn-site.xml:
YARN parameters
mapred-site.xml:
MapReduce parameters
slaves:
the list of slave nodes, one hostname per line
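For reference, the JAVA_HOME change in hadoop-env.sh (and likewise in yarn-env.sh) is a single line; the JDK path below is only a placeholder, substitute wherever your JDK actually lives:
# in etc/hadoop/hadoop-env.sh (the same line goes into yarn-env.sh)
export JAVA_HOME=/usr/local/jdk1.7.0_79    # placeholder path; use your actual JDK directory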
core-site.xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://centos1:9000</value>
    </property>
</configuration>
hdfs-site.xml
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>centos1:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
Note: dfs.replication is 1 here, so blocks are not replicated; with two DataNodes it could be raised to 2 for redundancy.
slaves
centos2
centos3
yarn-site.xml
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>centos1</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>centos1:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>centos1:19888</value>
    </property>
</configuration>
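Note that start-dfs.sh and start-yarn.sh do not start the JobHistory server configured above; in Hadoop 2.x it has its own daemon script:
./sbin/mr-jobhistory-daemon.sh start historyserver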
Create the directories /usr/local/hadoop/tmp/dfs/name and /usr/local/hadoop/tmp/dfs/data, as shown below.
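A minimal sketch for this step, assuming Hadoop lives at /usr/local/hadoop as the paths above suggest. Creating both directories on every node is harmless (only the NameNode uses the name directory and only DataNodes use the data directory), and the finished configuration files must also be present on the slaves:
# on each of centos1, centos2, centos3
mkdir -p /usr/local/hadoop/tmp/dfs/name /usr/local/hadoop/tmp/dfs/data
# push the finished configuration from centos1 to the slaves
scp -r /usr/local/hadoop/etc/hadoop root@centos2:/usr/local/hadoop/etc/
scp -r /usr/local/hadoop/etc/hadoop root@centos3:/usr/local/hadoop/etc/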
4. Start the services
Format the NameNode (only needed the first time; reformatting an existing cluster wipes its HDFS metadata):
./bin/hdfs namenode -format
./sbin/start-dfs.sh
On centos1 you should see NameNode and SecondaryNameNode; on centos2 and centos3, DataNode.
./sbin/start-yarn.sh
On centos1 you should see the ResourceManager process; on centos2 and centos3, NodeManager.
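To confirm which daemons came up, jps (shipped with the JDK) lists the running Java processes on each node:
[root@centos1 bin]# jps    # expect NameNode, SecondaryNameNode, ResourceManager
[root@centos2 bin]# jps    # expect DataNode, NodeManager (same on centos3)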
HDFS status: http://10.240.139.101:50090/ (the SecondaryNameNode UI configured above; the NameNode's own web UI defaults to port 50070, i.e. http://10.240.139.101:50070/)
YARN status: http://10.240.139.72:8042/node (the NodeManager UI on centos3; the ResourceManager UI on centos1 defaults to port 8088)