How to Set Up a Hadoop Cluster
Preparation
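1. Check whether SSH is available on the machine.
$ ssh -V
OpenSSH_5.3p1, OpenSSL 1.0.0-fips 29 Mar 2010
SSH came preinstalled on my system, so there was nothing to install. (The version flag is -V; typing "ssh -verison" only adds the error "Bad escape character 'rison'.")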
2. Check whether a JDK is installed on the machine.
$ java -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.1) (rhel-1.45.1.11.1.el6-i386)
OpenJDK Server VM (build 20.0-b12, mixed mode)
$ javac -version
javac 1.6.0_24
If this is the JDK that shipped with the system, it is best to reinstall it. See http://www.codesky.net/Linux/2012-08/67185.htm
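The two checks above can be combined into one small script. This is only a minimal sketch: the function name check_tool is made up for illustration, and it tests whether each command is on the PATH, not which version is installed.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: it reports whether a command
# can be found on the PATH and returns non-zero if it cannot.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: NOT found"
        return 1
    fi
}

# Check the three commands used in the preparation steps.
for tool in ssh java javac; do
    check_tool "$tool" || true   # keep going even if one is missing
done
```

Any line that reports "NOT found" points at a package that still has to be installed before continuing.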
Getting Started
1. Download and install Hadoop. I downloaded hadoop-0.20.2.tar.gz.
Extract the archive (as root, in the Downloads directory): # tar -zxvf hadoop-0.20.2.tar.gz
Move the directory: # mv hadoop-0.20.2 /usr/local/
Create a symlink: # ln -s /usr/local/hadoop-0.20.2 /usr/local/hadoop
2. Set the environment variables
# vi /etc/profile
Add the following two lines at the end of the file, not at the top:
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
Reload the profile, then set JAVA_HOME in hadoop-env.sh:
# . /etc/profile
# vi /usr/local/hadoop/conf/hadoop-env.sh
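If the profile is edited by a script instead of by hand, it helps to avoid adding the same lines twice. The following is a minimal sketch under some assumptions: append_once is a hypothetical helper, the variable uses the conventional upper-case name HADOOP_HOME, and by default the script writes to a scratch file, not to /etc/profile.

```shell
#!/bin/sh
# append_once is a hypothetical helper: it appends a line to a file
# only if that exact line is not already present.
append_once() {
    file=$1
    line=$2
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# PROFILE defaults to a scratch file so the sketch can be tried safely;
# set PROFILE=/etc/profile (as root) to apply it for real.
PROFILE=${PROFILE:-$(mktemp)}

# Single quotes keep $PATH and $HADOOP_HOME unexpanded in the file.
append_once "$PROFILE" 'export HADOOP_HOME=/usr/local/hadoop'
append_once "$PROFILE" 'export PATH=$PATH:$HADOOP_HOME/bin'

echo "Updated $PROFILE"
```

Running the script a second time against the same file leaves it unchanged, because both lines are already present.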
Verify the installation:
# hadoop version
Hadoop 0.20.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707
Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010
The Main Event
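1. NameNode configuration
Add every node to /etc/hosts (this requires root privileges):
[hadoop@hadoop1 ~]# vi /etc/hosts
192.168.127.145 hadoop1
192.168.127.146 hadoop2
192.168.127.147 hadoop3
192.168.127.148 hadoop4
Then, in the conf directory, edit core-site.xml:
# vi core-site.xml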
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop1:9000</value>
  </property>
</configuration>
Edit hdfs-site.xml in the same directory:
# vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop/namenode/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp/</value>
  </property>
</configuration>
Edit mapred-site.xml in the same directory:
# vi mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop1:9001</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
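After editing the three files it can be handy to read a value back and confirm it was saved as intended. The following is a minimal sketch, not a real XML parser: get_property is a hypothetical helper that assumes each name and value sits inside simple <name> and <value> tags as in the files above.

```shell
#!/bin/sh
# get_property is a hypothetical helper: it prints the <value> that
# follows the <name> matching the requested property, or nothing if
# the property is not in the file.
get_property() {
    # $1 = config file, $2 = property name
    awk -v want="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$1"
}

# Demo on a scratch copy of one property from mapred-site.xml.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop1:9001</value>
  </property>
</configuration>
EOF
get_property "$conf" mapred.job.tracker   # prints hadoop1:9001
rm -f "$conf"
```

Pointed at the real files, for example get_property /usr/local/hadoop/conf/hdfs-site.xml dfs.replication, it gives a quick way to spot a typo in a property name, since a misspelled name prints nothing.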
2. DataNode configuration (only hdfs-site.xml needs to change; mapred-site.xml and core-site.xml are the same as on the NameNode)
[hadoop@hadoop2 ~]$ vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/data</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp/</value>
  </property>
</configuration>
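Because core-site.xml and mapred-site.xml are identical on every node, one option is to copy them from the NameNode to each DataNode. The following is a minimal sketch under some assumptions: print_copy_commands is a hypothetical helper, Hadoop is assumed to live under the same path on every node, and the script only prints the scp commands so they can be reviewed before anything is copied.

```shell
#!/bin/sh
# print_copy_commands is a hypothetical helper: for every host listed in
# a host-list file it prints (does not run) the scp command that would
# copy the two shared files into the same conf directory on that host.
print_copy_commands() {
    list_file=$1
    conf_dir=$2
    while IFS= read -r host; do
        # Skip blank lines and comment lines.
        case $host in
            ''|'#'*) continue ;;
        esac
        echo "scp $conf_dir/core-site.xml $conf_dir/mapred-site.xml $host:$conf_dir/"
    done < "$list_file"
}

# Demo with a scratch host list; the names match the /etc/hosts entries.
hosts=$(mktemp)
printf '%s\n' hadoop2 hadoop3 hadoop4 > "$hosts"
print_copy_commands "$hosts" /usr/local/hadoop/conf
rm -f "$hosts"
```

Once the printed commands look right, they can be pasted into a shell on the NameNode or piped to sh.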
On the NameNode, list the master and slave hosts:
[hadoop@hadoop1 conf]$ vi masters
hadoop1
[hadoop@hadoop1 conf]$ vi slaves
hadoop2
hadoop3
hadoop4
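Every name in masters and slaves has to resolve to an address, which in this setup means it needs an entry in /etc/hosts. The following is a minimal sketch of such a check: unresolved_hosts is a hypothetical helper, and it only compares the two files, without contacting the hosts.

```shell
#!/bin/sh
# unresolved_hosts is a hypothetical helper: it prints every name from a
# host-list file (such as masters or slaves) that has no entry in the
# given hosts file. It prints nothing when every name is covered.
unresolved_hosts() {
    list_file=$1
    hosts_file=$2
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        # Look for the name in any field after the IP address column,
        # ignoring comment lines and letter case.
        awk -v h="$name" '
            $1 ~ /^#/ { next }
            { for (i = 2; i <= NF; i++) if (tolower($i) == tolower(h)) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file" || echo "$name"
    done < "$list_file"
}

# Demo with scratch files: hadoop4 is deliberately left out of the hosts file.
hosts=$(mktemp); slaves=$(mktemp)
printf '%s\n' '192.168.127.146 hadoop2' '192.168.127.147 hadoop3' > "$hosts"
printf '%s\n' hadoop2 hadoop3 hadoop4 > "$slaves"
unresolved_hosts "$slaves" "$hosts"   # prints hadoop4
rm -f "$hosts" "$slaves"
```

On a real node the call would be unresolved_hosts /usr/local/hadoop/conf/slaves /etc/hosts, and empty output means every slave has an entry.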
Start the cluster:
[hadoop@hadoop1 ~]$ start-all.sh
Stop the cluster:
[hadoop@hadoop1 ~]$ stop-all.sh
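Two points are worth keeping in mind here. On a brand-new cluster the NameNode normally has to be formatted once with "hadoop namenode -format" before the first start, and start-all.sh reaches the slave hosts over SSH, which is why SSH was checked during preparation. After starting, the jps tool from the JDK lists the running Java processes. The following is a minimal sketch for checking that list: missing_daemons is a hypothetical helper, and the expected daemon names assume the masters and slaves files shown above.

```shell
#!/bin/sh
# missing_daemons is a hypothetical helper: it reads jps-style output
# ("<pid> <name>" per line) on stdin and prints each expected daemon
# name (given as arguments) that does not appear in it.
missing_daemons() {
    running=$(awk '{ print $2 }')
    for daemon in "$@"; do
        printf '%s\n' "$running" | grep -qxF -e "$daemon" || echo "$daemon"
    done
}

# On the master (hadoop1) in this setup one would expect NameNode,
# SecondaryNameNode and JobTracker; on each slave, DataNode and TaskTracker.
if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons NameNode SecondaryNameNode JobTracker
else
    echo "jps not found; it ships with the JDK"
fi
```

Empty output means all expected daemons are running. Any name that is printed is a daemon whose log file under the Hadoop logs directory is worth reading.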