Hadoop -- Setting Up an HDFS Distributed File System

1. System environment:
      VMware virtual machines, 3 in total
      OS: RHEL 5.4 (x86_64)
      Memory: 256 MB each

       hostname: master      IP: 192.168.204.128     (namenode, secondarynamenode)
       hostname: slave01     IP: 192.168.204.129     (datanode)
       hostname: slave02     IP: 192.168.204.134     (datanode)
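
       All three nodes need to resolve the hostnames above. If no DNS entry exists for them, one option is to add the same mappings to /etc/hosts on every node (a sketch, assuming these addresses are not already resolvable):

       # vi /etc/hosts          (on master, slave01 and slave02)
          192.168.204.128   master
          192.168.204.129   slave01
          192.168.204.134   slave02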

2. Hadoop
      Version: hadoop-0.20.203.0rc1.tar.gz
      Download: http://www.apache.org/dyn/closer.cgi/hadoop/common/
 
      Java:
      Version: jdk-6u26-linux-x64.bin
      Download: http://www.Oracle.com/technetwork/java/javase/downloads/jdk-6u26-download-400750.html

3. Install Hadoop
      1) # tar zxf hadoop-0.20.203.0rc1.tar.gz -C /usr/local
      2) # chmod +x jdk-6u26-linux-x64.bin
           # ./jdk-6u26-linux-x64.bin
           # mv jdk1.6.0_26 /usr/local/.
      3) Add the environment variable
           # vi /usr/local/hadoop-0.20.203.0/conf/hadoop-env.sh
              export JAVA_HOME=/usr/local/jdk1.6.0_26
      4) Adjust some heap sizes to fit the test servers' limited memory
            # vi /usr/local/hadoop-0.20.203.0/bin/hadoop
               JAVA_HEAP_MAX=-Xmx128m
            # vi /usr/local/hadoop-0.20.203.0/conf/hadoop-env.sh
               export HADOOP_HEAPSIZE=128
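
      A quick way to confirm the JDK path and heap settings took effect (a sanity check, not part of the original steps):
            # /usr/local/jdk1.6.0_26/bin/java -version
            # grep -E 'JAVA_HOME|HADOOP_HEAPSIZE' /usr/local/hadoop-0.20.203.0/conf/hadoop-env.sh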

4. Create the directories and configuration files
        1) Create the directories (example commands in the sketch below)

             On the master server, create /data/hadoop/name and /data/hadoop/tmp
             On slave01 and slave02, create /data/hadoop/data01, /data/hadoop/data02 and /data/hadoop/tmp
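
             A minimal sketch of the corresponding commands (run as root, before the ownership changes in step 2):
             master:
             # mkdir -p /data/hadoop/name /data/hadoop/tmp
             slave01 and slave02:
             # mkdir -p /data/hadoop/data01 /data/hadoop/data02 /data/hadoop/tmp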
        2) Create a passwordless SSH key and fix file ownership
             (on master, slave01 and slave02)
              # useradd hadoop
              # chown hadoop:hadoop -R /usr/local/hadoop-0.20.203.0
              (master) # chown hadoop:hadoop -R /data/hadoop/name /data/hadoop/tmp
              (slave01/slave02) # chown hadoop:hadoop -R /data/hadoop/data01 /data/hadoop/data02 /data/hadoop/tmp
            
              On the master server:
              [hadoop@master]$ ssh-keygen -t dsa
              [hadoop@master]$ cat /home/hadoop/.ssh/id_dsa.pub > /home/hadoop/.ssh/authorized_keys
              [hadoop@master]$ chmod 600 /home/hadoop/.ssh/authorized_keys
     
              [hadoop@master]$ scp -rp /home/hadoop/.ssh/id_dsa.pub  hadoop@slave01:/home/hadoop/.
              [hadoop@master]$ scp -rp /home/hadoop/.ssh/id_dsa.pub  hadoop@slave02:/home/hadoop/.
              On slave01 and slave02, run:
              [hadoop@slavexx]$ mkdir -p /home/hadoop/.ssh && chmod 700 /home/hadoop/.ssh
              [hadoop@slavexx]$ cat /home/hadoop/id_dsa.pub > /home/hadoop/.ssh/authorized_keys
              [hadoop@slavexx]$ chmod 600 /home/hadoop/.ssh/authorized_keys
           
              On master, verify the key-based login works (no password prompt should appear):
              [hadoop@master]$ ssh slave01
              [hadoop@master]$ ssh slave02
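
              If ssh-copy-id is available (it ships with OpenSSH on RHEL), the scp/cat/chmod steps on the slaves can be replaced by a single command per node, run from master (an alternative sketch, not part of the original steps):
              [hadoop@master]$ ssh-copy-id -i /home/hadoop/.ssh/id_dsa.pub hadoop@slave01
              [hadoop@master]$ ssh-copy-id -i /home/hadoop/.ssh/id_dsa.pub hadoop@slave02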

        3) Edit the configuration files on master.
           After editing, copy them to slave01 and slave02 so the configuration stays identical on all nodes (see the scp sketch after the listings below).
              In /usr/local/hadoop-0.20.203.0/conf/: core-site.xml, hdfs-site.xml, masters, slaves
              Default files for reference: /usr/local/hadoop-0.20.203.0/src/hdfs/hdfs-default.xml
                                           /usr/local/hadoop-0.20.203.0/src/core/core-default.xml
                                           /usr/local/hadoop-0.20.203.0/src/mapred/mapred-default.xml
              Minimal configuration file contents:
              core-site.xml
              **************************************************************************
               <?xml version="1.0"?>
                   <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
                   <!-- Put site-specific property overrides in this file. -->
                   <configuration>
                       <property>
                            <name>hadoop.tmp.dir</name>
                            <value>/data/hadoop/tmp</value>
                            <description>A base for other temporary directories.</description>
                        </property>
                        <!-- file system properties -->
                        <property>
                             <name>fs.default.name</name>
                             <value>hdfs://master:9000</value>        
                         </property>
                    </configuration>

              **************************************************************************
              hdfs-site.xml
              **************************************************************************
              <?xml version="1.0"?>
                  <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
                  <!-- Put site-specific property overrides in this file. -->
                  <configuration>
                       <property>
                            <name>dfs.replication</name>
                            <value>2</value>
                       </property>
                       <property>
                            <name>dfs.name.dir</name>
                            <value>/data/hadoop/name</value>
                        </property>
                        <property>
                             <name>dfs.data.dir</name>
                             <value>/data/hadoop/data01,/data/hadoop/data02</value>
                        </property>
                    </configuration>

              **************************************************************************      
              masters file     -- specifies the host that runs the secondary namenode (the namenode itself starts on the node where start-dfs.sh is run)
              **************************************************************************      
              master
              **************************************************************************      
              slaves file      -- specifies the datanode hosts
              **************************************************************************            
              slave01
              slave02
              **************************************************************************            
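              One way to push the finished configuration from master to the slaves (a sketch; it assumes the hadoop user already owns the install directory on every node, as set up in step 4):
              [hadoop@master]$ cd /usr/local/hadoop-0.20.203.0/conf
              [hadoop@master]$ for h in slave01 slave02; do
                                   scp core-site.xml hdfs-site.xml masters slaves hadoop@$h:/usr/local/hadoop-0.20.203.0/conf/
                               done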
5. Format the namenode and start HDFS
      1) Format the namenode

           [hadoop@master]$ cd /usr/local/hadoop-0.20.203.0/bin
           [hadoop@master]$ ./hadoop namenode -format
           Check the relevant logs for errors.
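           A quick way to confirm the format succeeded is to look at the freshly created metadata directory (the exact file list can vary between versions):
           [hadoop@master]$ ls /data/hadoop/name/current
                            (fsimage, edits and VERSION files should be present)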

      2) Start HDFS
           [hadoop@master]$ ./start-dfs.sh
           Check the relevant logs for errors.
           If startup succeeds,
           master runs two processes:
           /usr/local/jdk1.6.0_26/bin/java -Dproc_namenode -Xmx128m ...
           /usr/local/jdk1.6.0_26/bin/java -Dproc_secondarynamenode -Xmx128m ...
         
           and slave01 and slave02 each run one process:
           /usr/local/jdk1.6.0_26/bin/java -Dproc_datanode -Xmx128m ...
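
           The simplest per-node check is jps from the JDK (hypothetical output; the PIDs will differ):
           [hadoop@master]$ /usr/local/jdk1.6.0_26/bin/jps
           xxxx NameNode
           xxxx SecondaryNameNode
           [hadoop@slavexx]$ /usr/local/jdk1.6.0_26/bin/jps
           xxxx DataNode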

6. Test uploading a file to the HDFS file system:
      Prepare a file, e.g. name: /home/hadoop/soft.tar, size: 400 MB
      ./hadoop fs                          (list the available commands)
      ./hadoop fs -help                    (detailed help)
      ./hadoop fs -mkdir /sshan/test       (create a directory)
      ./hadoop fs -ls /sshan/test          (list the directory contents)
      ./hadoop fs -put /home/hadoop/soft.tar /sshan/test/    (upload soft.tar to /sshan/test)
      ./hadoop fs -ls /sshan/test          (confirm the uploaded file is listed)
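
      To check that the file's blocks actually landed on both datanodes with the configured replication factor of 2, the standard dfsadmin and fsck tools can be used (a verification sketch, not part of the original steps):
      ./hadoop dfsadmin -report                              (capacity and status of each datanode)
      ./hadoop fsck /sshan/test/soft.tar -files -blocks      (block count and replication of the uploaded file)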

7. Check the web UI at http://192.168.204.128:50070
