Installing Hadoop 2.8.3 on Ubuntu 16.04 (Illustrated Tutorial)
Environment: Ubuntu 16.04
Required software: JDK, SSH
Download Hadoop 2.8.3 from the Tsinghua mirror:
https://mirrors.tuna.tsinghua.edu.cn/apache/Hadoop/common/
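For example, the release tarball can be fetched directly (a sketch; mirrors periodically drop old releases, so if the 2.8.3 directory is no longer there, fall back to https://archive.apache.org/dist/hadoop/common/):
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.8.3/hadoop-2.8.3.tar.gz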
1. Install the JDK and configure environment variables
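A minimal sketch of this step, assuming the JDK 8u151 tarball (jdk-8u151-linux-x64.tar.gz) has been downloaded and will live at /usr/local/jdk1.8.0_151, the path used in the configuration files below:
sudo tar -zxf jdk-8u151-linux-x64.tar.gz -C /usr/local/   # unpack to /usr/local
echo 'export JAVA_HOME=/usr/local/jdk1.8.0_151' >> ~/.bashrc
echo 'export PATH=$JAVA_HOME/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
java -version    # should report 1.8.0_151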
2. Install SSH and rsync, mainly to set up passwordless login
sudo apt-get install ssh
sudo apt-get install rsync
<span class="hljs-variable">sh-keygen -t dsa -<span class="hljs-constant">P <span class="hljs-string" style="color: #2b91af;">'' -f ~<span class="hljs-regexp">/.ssh/id_dsa</span></span></span></span>
<span class="hljs-variable"><span class="hljs-constant"><span class="hljs-string" style="color: #2b91af;"><span class="hljs-regexp"><span class="hljs-variable">cat ~<span class="hljs-regexp">/.ssh/id_dsa.pub >> ~<span class="hljs-regexp">/.ssh/authorized_keys</span></span></span></span></span></span></span>
Verify that passwordless login now works: ssh localhost. (Note: OpenSSH 7.0 and later, including the version shipped with Ubuntu 16.04, disables DSA keys by default; if you are still prompted for a password, use the RSA key commands shown in section 4 instead.)
3. Install Hadoop
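Assuming the hadoop-2.8.3.tar.gz downloaded earlier, unpack it (the target directory /usr/local/hadoop-2.8.3 is a choice, not a requirement) and work from the Hadoop home directory; the files edited in the steps below all live under etc/hadoop/:
sudo tar -zxf hadoop-2.8.3.tar.gz -C /usr/local/
sudo chown -R $USER /usr/local/hadoop-2.8.3    # run Hadoop as a normal user
cd /usr/local/hadoop-2.8.3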
1) Configure hadoop-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_151
2) Configure yarn-env.sh
<span class="hljs-comment" style="color: #008000;"><span class="hljs-variable"><span class="hljs-constant"><span class="hljs-string" style="color: #2b91af;"><span class="hljs-regexp"><span class="hljs-variable"><span class="hljs-regexp"><span class="hljs-regexp">export JAVA_HOME=/usr/local/jdk1.8.0_151</span></span></span></span></span></span></span></span>
3) Configure core-site.xml
Add the following configuration:
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
        <description>URI of HDFS: filesystem://namenode-host:port</description>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/hadoop/tmp</value>
        <description>Local Hadoop temporary directory on the namenode</description>
    </property>
</configuration>
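The hadoop.tmp.dir directory must exist and be writable by the user running Hadoop, so create it up front (path taken from the config above):
sudo mkdir -p /usr/hadoop/tmp
sudo chown -R $USER /usr/hadoop/tmp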
4) Configure hdfs-site.xml
Add the following configuration:
<configuration>
    <!-- hdfs-site.xml -->
    <property>
        <name>dfs.name.dir</name>
        <value>/usr/hadoop/hdfs/name</value>
        <description>Where the namenode stores HDFS namespace metadata</description>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/usr/hadoop/hdfs/data</value>
        <description>Physical storage location of data blocks on the datanode</description>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>Replication factor; the default is 3, and it should not exceed the number of datanodes</description>
    </property>
</configuration>
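As with the temporary directory, the name and data directories should exist and be writable before the namenode is formatted:
sudo mkdir -p /usr/hadoop/hdfs/name /usr/hadoop/hdfs/data
sudo chown -R $USER /usr/hadoop/hdfs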
5) Configure mapred-site.xml
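Hadoop 2.8.3 ships this file only as a template, so create it first (from the Hadoop home directory):
cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml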
Add the following configuration:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
6) Configure yarn-site.xml
Add the following configuration (192.168.241.128 is this machine's IP; substitute your own host's address):
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>192.168.241.128:8099</value>
    </property>
</configuration>
4. Start Hadoop
1) Format the namenode
$ bin/hdfs namenode -format
2) Start the NameNode and DataNode daemons
$ sbin/start-dfs.sh
3) Start the ResourceManager and NodeManager daemons
$ sbin/start-yarn.sh
Note: start-dfs.sh logs into localhost over SSH, so passwordless login must be working. If it is not, an RSA key pair can be set up instead of the DSA one from step 2:
$ cd ~/.ssh/                           # if this directory does not exist, run ssh localhost once first
$ ssh-keygen -t rsa                    # press Enter at every prompt
$ cat id_rsa.pub >> authorized_keys    # authorize the key
5. Verify the startup
1) Run the jps command; if processes like the following are present, Hadoop has started correctly:
# jps
6097 NodeManager
11044 Jps
7497 -- process information unavailable
8256 Worker
5999 ResourceManager
5122 SecondaryNameNode
8106 Master
4836 NameNode
4957 DataNode
(Master and Worker are Spark daemons and "7497 -- process information unavailable" is a stale entry; neither belongs to Hadoop. The processes that matter here are NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager.)
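The web UIs give a second check: in Hadoop 2.x the NameNode UI listens on port 50070 by default, and the ResourceManager UI on the address configured in yarn-site.xml above:
http://localhost:50070         # HDFS NameNode status page
http://192.168.241.128:8099    # YARN ResourceManager (address set in yarn-site.xml)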