Setting up a Hadoop Distributed Environment
Hadoop Distributed Installation
Notes
- This guide uses three machines, all running CentOS 6.
- Except for the final start/stop steps, every operation must be performed on all three machines.
- Some configuration files can be prepared on one machine first and then copied to the others with scp to reduce repeated work.
1. Modify the hostname and host mappings
[root@hadoop1 ~]# vim /etc/sysconfig/network

[root@hadoop1 ~]# vim /etc/hosts
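The original does not show the file contents. As a sketch, `/etc/sysconfig/network` sets the hostname and `/etc/hosts` maps each node's address; the 192.168.1.x IPs below are placeholders for your actual addresses:

```
# /etc/sysconfig/network (on hadoop1; use hadoop2/hadoop3 on the other nodes)
NETWORKING=yes
HOSTNAME=hadoop1

# /etc/hosts (identical on all three machines; IPs are placeholders)
192.168.1.101 hadoop1
192.168.1.102 hadoop2
192.168.1.103 hadoop3
```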

2. Create the hadoop user
[root@hadoop3 ~]# useradd hadoop
[root@hadoop2 ~]# passwd hadoop
[root@hadoop3 ~]# vim /etc/sudoers
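The sudoers change is not shown in the original; a common choice (an assumption here) is to grant the hadoop user full sudo rights by adding one line under the existing root entry:

```
## Allow root to run any commands anywhere
root    ALL=(ALL)       ALL
hadoop  ALL=(ALL)       ALL
```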

3. Passwordless SSH login
Switch to the hadoop user and generate a key pair:
[hadoop@hadoop1 ~]$ ssh-keygen
Copy the public key to every node:
[hadoop@hadoop1 ~]$ ssh-copy-id hadoop1
[hadoop@hadoop1 ~]$ ssh-copy-id hadoop2
[hadoop@hadoop1 ~]$ ssh-copy-id hadoop3

Verify that ssh no longer asks for a password:
[hadoop@hadoop1 ~]$ ssh hadoop1
[hadoop@hadoop1 ~]$ ssh hadoop2
[hadoop@hadoop1 ~]$ ssh hadoop3
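Since these steps repeat on every machine, the generate-and-copy sequence can be condensed into one loop; a sketch, run as the hadoop user (hostnames are the ones defined in /etc/hosts):

```shell
# Generate a key pair non-interactively, then push the public key to every node.
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
for host in hadoop1 hadoop2 hadoop3; do
    ssh-copy-id "$host"   # prompts once for the hadoop password on each host
done
```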
4. Install the JDK
[hadoop@hadoop1 ~]$ tar -zxvf jdk-8u161-linux-x64.tar.gz
[hadoop@hadoop1 ~]$ mv jdk1.8.0_161 jdk8
Switch to root:
[root@hadoop1 ~]# vim /etc/profile
Append at the end of the file:
#Java
export JAVA_HOME=/home/hadoop/jdk8
export PATH=$PATH:$JAVA_HOME/bin
Switch back to the hadoop user:
[hadoop@hadoop1 ~]$ source /etc/profile
Verify:
[hadoop@hadoop1 ~]$ java -version

5. Install Hadoop
1. Cluster layout
| node | HDFS | YARN |
|---|---|---|
| hadoop1 | namenode, datanode | nodemanager |
| hadoop2 | datanode, secondarynamenode | nodemanager |
| hadoop3 | datanode | resourcemanager, nodemanager |
[hadoop@hadoop1 ~]$ wget https://www-us.apache.org/dist/hadoop/common/hadoop-2.7.6/hadoop-2.7.6.tar.gz
[hadoop@hadoop1 ~]$ tar -zxvf hadoop-2.7.6.tar.gz
Switch to root and add Hadoop to /etc/profile (contents shown in the next section):
[root@hadoop1 ~]# vim /etc/profile
[hadoop@hadoop1 ~]$ source /etc/profile
Verify:
[hadoop@hadoop1 ~]$ hadoop version
2. Configuration files
1. Hadoop environment variables
#Java
export JAVA_HOME=/home/hadoop/jdk8
export HADOOP_HOME=/home/hadoop/hadoop-2.7.6
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
2. Edit the configuration files
The configuration files live in:
hadoop-2.7.6/etc/hadoop/
[hadoop@hadoop3 ~]$ cd hadoop-2.7.6/etc/hadoop/
1.hadoop-env.sh

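The original does not show what was changed in hadoop-env.sh. Typically (an assumption here) the only required edit is hard-coding JAVA_HOME, because daemons launched over ssh do not reliably inherit it from /etc/profile:

```
# hadoop-2.7.6/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/home/hadoop/jdk8
```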
2.core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop1:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoopdata/tmp</value>
</property>
</configuration>
3. hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop2:50090</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/hadoopdata/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/hadoopdata/data</value>
</property>
</configuration>
4. yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop3</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
5. mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop1:19888</value>
</property>
</configuration>
6. slaves
hadoop1
hadoop2
hadoop3
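As the notes at the top suggest, the finished configuration directory can be pushed from hadoop1 to the other nodes with scp instead of editing every file three times; a sketch, assuming the paths used above:

```shell
# Run as the hadoop user on hadoop1 after all files under etc/hadoop/ are edited.
for host in hadoop2 hadoop3; do
    scp -r ~/hadoop-2.7.6/etc/hadoop/ "$host":~/hadoop-2.7.6/etc/
done
```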
3. Format the NameNode
[hadoop@hadoop1 hadoop-2.7.6]$ hdfs namenode -format
4. Start and stop
[hadoop@hadoop1 hadoop-2.7.6]$ start-dfs.sh
Note: start-yarn.sh launches the ResourceManager on the machine where it is run, so YARN should be started on the resourcemanager node (hadoop3).
[hadoop@hadoop3 hadoop-2.7.6]$ start-yarn.sh
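After both scripts are running, each node's daemons can be checked with jps; per the cluster layout table above, roughly the following should appear (process IDs will differ):

```
[hadoop@hadoop1 ~]$ jps    # expect: NameNode, DataNode, NodeManager
[hadoop@hadoop2 ~]$ jps    # expect: DataNode, SecondaryNameNode, NodeManager
[hadoop@hadoop3 ~]$ jps    # expect: DataNode, ResourceManager, NodeManager
```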
6. Time synchronization
[root@hadoop1 ~]# date
Thu Jan 17 00:10:16 EST 2019    // a VPS located abroad
[root@hadoop1 ~]# rm -rf /etc/localtime
[root@hadoop1 ~]# ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
[root@hadoop1 ~]# yum -y install ntpdate ntp
[root@hadoop1 ~]# ntpdate time.google.com
17 Jan 13:11:43 ntpdate[1691]: step time server 216.239.35.0 offset 2.120207 sec