[Hive] Fully Distributed Installation (MetaStore: MySQL)

Hadoop version: 0.20.2
 
Hive version: 0.9.0
 
MySQL version: 5.6.11
 

 

1) Create a hive user in MySQL and grant it sufficient privileges
 

[root@node01 mysql]# mysql -u root -p
 
Enter password:
 

 


mysql> create user 'hive' identified by 'hive';
 
Query OK, 0 rows affected (0.00 sec)
 

 

mysql> grant all privileges on *.* to 'hive' with grant option;
 
Query OK, 0 rows affected (0.00 sec)
 

 

mysql> flush privileges;
 
Query OK, 0 rows affected (0.01 sec)
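
The grant above gives the hive user every privilege on every database, plus the right to grant them to others. The metastore only needs access to its own database, so a narrower grant may be preferable. A minimal sketch, assuming the database and the user are both named hive as in this walkthrough; the helper name hive_grant_sql is hypothetical and it expects simple identifiers that need no quoting:

```shell
# hive_grant_sql DB USER: print a GRANT statement scoped to a single database.
# Hypothetical helper; DB and USER are assumed to be plain identifiers.
hive_grant_sql() {
  printf "grant all privileges on %s.* to '%s';\n" "$1" "$2"
}
```

The output can be piped into the client as root, for example: hive_grant_sql hive hive | mysql -u root -p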
 

 

2) Verify that the hive user can connect to MySQL, then create the hive database
 

[root@node01 mysql]# mysql -u hive -p
 
Enter password:
 

 


mysql> create database hive;
 
Query OK, 1 row affected (0.00 sec)
 

 

mysql> use hive;
 
Database changed
 
mysql> show tables;
 
Empty set (0.00 sec)
 

 

3) Unpack the Hive installation package
 

[hadoop@node01 ~]$ tar -xzvf hive-0.9.0.tar.gz
 

[hadoop@node01 ~]$ cd hive-0.9.0
 
[hadoop@node01 hive-0.9.0]$ ls
 
bin  conf  docs  examples  lib  LICENSE  NOTICE  README.txt  RELEASE_NOTES.txt  scripts  src
 

 

4) Download the MySQL JDBC driver (Connector/J) and place it in the lib directory under the Hive home
 

[hadoop@node01 ~]$ mv mysql-connector-java-5.1.24-bin.jar ./hive-0.9.0/lib
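
If the driver jar is missing from lib, Hive cannot load com.mysql.jdbc.Driver when it connects to the metastore. A small sketch of a check that can be run after the move; the function name has_mysql_driver is hypothetical, and the jar name pattern is assumed to follow the Connector/J naming used above:

```shell
# has_mysql_driver DIR: succeed if DIR contains a MySQL Connector/J jar.
# Hypothetical helper; assumes the jar is named mysql-connector-java-*.jar.
has_mysql_driver() {
  for jar in "$1"/mysql-connector-java-*.jar; do
    # If the glob matched nothing, $jar is the literal pattern and -e fails.
    if [ -e "$jar" ]; then
      return 0
    fi
  done
  return 1
}
```

For example: has_mysql_driver /home/hadoop/hive-0.9.0/lib && echo "driver present"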
 

 

5) Set the environment variables and add Hive to PATH
 
Add the following to /etc/profile:
 

export HIVE_HOME=/home/hadoop/hive-0.9.0
 
export PATH=$PATH:$HIVE_HOME/bin
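
Appending to PATH unconditionally adds a duplicate entry every time /etc/profile is sourced again. A guarded variant can avoid that; this is a sketch that assumes the same install path as in step 3:

```shell
# Assumes Hive was unpacked to /home/hadoop/hive-0.9.0, as in step 3.
export HIVE_HOME=/home/hadoop/hive-0.9.0

# Append $HIVE_HOME/bin only if it is not already on PATH.
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) ;;
  *) export PATH="$PATH:$HIVE_HOME/bin" ;;
esac
```

After running source /etc/profile, the hive command should resolve from any directory.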
 

 

6) Edit hive-env.sh
 

[hadoop@node01 conf]$ cp hive-env.sh.template hive-env.sh
 
[hadoop@node01 conf]$ vi hive-env.sh
 

 

7) Copy hive-default.xml.template to hive-site.xml
 
Change the following four key properties to match the MySQL setup above
 

[hadoop@node01 conf]$ cp hive-default.xml.template hive-site.xml
 
[hadoop@node01 conf]$ vi hive-site.xml
 

<property>
 
  <name>javax.jdo.option.ConnectionURL</name>
 
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
 
  <description>JDBC connect string for a JDBC metastore</description>
 
</property>
 

 


<property>
 
  <name>javax.jdo.option.ConnectionDriverName</name>
 
  <value>com.mysql.jdbc.Driver</value>
 
  <description>Driver class name for a JDBC metastore</description>
 
</property>
 

 


<property>

  <name>javax.jdo.option.ConnectionUserName</name>
 
  <value>hive</value>
 
  <description>username to use against metastore database</description>
 
</property>
 

 

<property>
 
  <name>javax.jdo.option.ConnectionPassword</name>
 
  <value>hive</value>
 
  <description>password to use against metastore database</description>
 
</property>
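
hive-site.xml copied from the template is long, so it is easy to overlook one of the four properties. The sketch below reports which of them appear in the file; the function name check_hive_site is hypothetical, and it only checks that each property name is present, not that its value is correct:

```shell
# check_hive_site FILE: report which of the four metastore properties
# are present in FILE. Hypothetical helper; it does not validate values.
check_hive_site() {
  file=$1
  missing=0
  for name in ConnectionURL ConnectionDriverName ConnectionUserName ConnectionPassword; do
    # -F treats the pattern as a fixed string, so the dots are literal.
    if grep -qF "<name>javax.jdo.option.$name</name>" "$file"; then
      echo "found:   javax.jdo.option.$name"
    else
      echo "missing: javax.jdo.option.$name"
      missing=1
    fi
  done
  return "$missing"
}
```

For example: check_hive_site /home/hadoop/hive-0.9.0/conf/hive-site.xml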
 

 

8) Start Hadoop, then open the Hive shell and test (the records table used below is assumed to exist already)
 

[hadoop@node01 conf]$ start-all.sh
 

 

 

hive> load data inpath 'hdfs://node01:9000/user/hadoop/access_log.txt'
 
    > overwrite into table records;
 
Loading data to table default.records
 
Moved to trash: hdfs://node01:9000/user/hive/warehouse/records
 
OK
 
Time taken: 0.526 seconds
 
hive> select ip, count(*) from records
 
    > group by ip;
 

Total MapReduce jobs = 1
 
Launching Job 1 out of 1
 
Number of reduce tasks not specified. Estimated from input data size: 1
 
In order to change the average load for a reducer (in bytes):
 
  set hive.exec.reducers.bytes.per.reducer=<number>
 
In order to limit the maximum number of reducers:
 
  set hive.exec.reducers.max=<number>
 
In order to set a constant number of reducers:
 
  set mapred.reduce.tasks=<number>
 
Starting Job = job_201304242001_0001, Tracking URL = http://node01:50030/jobdetails.jsp?jobid=job_201304242001_0001
 
Kill Command = /home/hadoop/hadoop-0.20.2/bin/../bin/hadoop job  -Dmapred.job.tracker=192.168.231.131:9001 -kill job_201304242001_0001
 
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
 
2013-04-24 20:11:03,127 Stage-1 map = 0%,  reduce = 0%
 
2013-04-24 20:11:11,196 Stage-1 map = 100%,  reduce = 0%
 
2013-04-24 20:11:23,331 Stage-1 map = 100%,  reduce = 100%
 
Ended Job = job_201304242001_0001
 
MapReduce Jobs Launched:
 
Job 0: Map: 1  Reduce: 1  HDFS Read: 7118627 HDFS Write: 9 SUCCESS
 
Total MapReduce CPU Time Spent: 0 msec
 
OK
 
NULL    28134
 
Time taken: 33.273 seconds
 
Note: the single NULL row means ip evaluated to NULL for all 28134 rows. The load and the MapReduce job both work, but the table's column definitions or field delimiter probably do not match the format of access_log.txt.
 

 


In HDFS, the records table is simply a file under the warehouse directory:
 

[hadoop@node01 home]$ hadoop fs -ls /user/hive/warehouse/records
 
Found 1 items
 
-rw-r--r--  2 hadoop supergroup    7118627 2013-04-15 20:06 /user/hive/warehouse/records/access_log.txt
