Hive Installation and Configuration Notes
Please credit the source when reposting. SpringsSpace: http://springsfeng.iteye.com
1. First, install MySQL and change the root account password. As root, run:
su - root
mysql
GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY 'root' WITH GRANT OPTION;
2. Create the Hive user. As root, run:
su - root
mysql -uroot -p
CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
CREATE USER 'hive'@'linux-fdc.linux.com' IDENTIFIED BY 'hive';
CREATE USER 'hive'@'192.168.81.251' IDENTIFIED BY 'hive';
-- Create the metastore database (latin1 is the character set the Hive schema expects):
CREATE DATABASE metastore DEFAULT CHARACTER SET latin1 DEFAULT COLLATE latin1_swedish_ci;
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'localhost' IDENTIFIED BY 'hive' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'192.168.81.251' IDENTIFIED BY 'hive' WITH GRANT OPTION;
flush privileges;
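Before moving on, it may be worth confirming that the accounts and grants took effect; a quick, optional check from the same root session:

```sql
-- List the hive accounts created above
SELECT user, host FROM mysql.user WHERE user = 'hive';
-- Show the privileges granted to one of them
SHOW GRANTS FOR 'hive'@'localhost';
```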
3. Import the MySQL schema script
Log in with the hive account:
mysql -uhive -p -h192.168.81.251
mysql> use metastore;
Database changed
mysql> source /opt/custom/hive-0.11.0/scripts/metastore/upgrade/mysql/hive-schema-0.10.0.mysql.sql
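If the import succeeded, the metastore schema tables should now exist; a quick check (the table names below come from the Hive 0.10 MySQL schema):

```sql
SHOW TABLES;
-- Expect metastore tables such as DBS, TBLS, SDS, PARTITIONS, COLUMNS_V2
```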
4. Hive installation and configuration
(1) Build (for the current Hive-0.11.0-SNAPSHOT):
Download the latest Hive source package (hive-trunk.zip), extract it to /home/kevin/Downloads/hive-trunk, and edit build.properties:
...
hadoop-0.20.version=0.20.2
hadoop-0.20S.version=1.1.2
hadoop-0.23.version=2.0.3-alpha
...
To change the version of any other dependency, edit the libraries.properties file under the ivy directory; for example, to change the HBase version:
...
guava-hadoop23.version=11.0.2
hbase.version=0.94.6
jackson.version=1.8.8
...
In the current directory, run:
ant tar -Dforrest.home=/usr/custom/apache-forrest-0.9
(2) Extract: copy hive-0.11.0-SNAPSHOT.tar.gz from the build directory to /usr/custom/ and extract it there.
(3) Configure environment variables:
export HIVE_HOME=/usr/custom/hive-0.11.0
export PATH=$HIVE_HOME/bin:$PATH
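These exports typically go in ~/.bashrc or /etc/profile. A small sketch (paths taken from this install; adjust to your layout) that also verifies the variables took effect:

```shell
# Hive environment (paths from this install; adjust to your layout)
export HIVE_HOME=/usr/custom/hive-0.11.0
export PATH=$HIVE_HOME/bin:$PATH

# Sanity check: the Hive bin directory should now appear on PATH
echo ":$PATH:" | grep -q ":$HIVE_HOME/bin:" && echo "HIVE_HOME is on PATH"
# prints: HIVE_HOME is on PATH
```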
(4) Configuration files:
Copy the .template files in the conf directory to create the corresponding .xml and .properties files:
cp hive-default.xml.template hive-site.xml
cp hive-log4j.properties.template hive-log4j.properties
(5) Configure hive-config.sh:
...
#
# processes --config option from command line
#
export JAVA_HOME=/usr/custom/jdk1.6.0_43
export HIVE_HOME=/usr/custom/hive-0.11.0
export HADOOP_HOME=/usr/custom/hadoop-2.0.3-alpha

this="$0"
while [ -h "$this" ]; do
  ls=`ls -ld "$this"`
  link=`expr "$ls" : '.*-> \(.*\)$'`
  if expr "$link" : '.*/.*' > /dev/null; then
    this="$link"
  else
    this=`dirname "$this"`/"$link"
  fi
done
...
(6) Configure logging in hive-log4j.properties (a fix specific to the 0.10.0 line):
Change org.apache.hadoop.metrics.jvm.EventCounter to org.apache.hadoop.log.metrics.EventCounter, which resolves this warning:
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated.
Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
(7) Create the hive-site.xml file (note that the database name in ConnectionURL must match the metastore database created in step 2):
<configuration>
  <!-- Hive Execution Parameters -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
    <description>
      JDBC connect string for a JDBC metastore. Note: there must be no whitespace
      before or after the connect string inside the value tags; otherwise the Hive
      client fails with:
      FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate
      org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    </description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    <description>password to use against metastore database</description>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://linux-fdc.linux.com:8888</value>
    <description>Thrift uri for the remote metastore. Used by metastore client to connect to remote metastore.</description>
  </property>
</configuration>
(8) Install the MySQL Connector/J driver
Download mysql-connector-java-5.1.22-bin.jar and put it in /usr/custom/hive-0.11.0/lib; otherwise running
show tables; fails because the ConnectionDriverName class cannot be found.
5. Starting and using Hive
(1) Start
From the bin directory, run: hive
(2) List databases and tables
show databases; -- the default database is: default
show tables;
(3) Table creation examples
This part is my own test; the test data is in the attachment.
CREATE TABLE cite (citing INT, cited INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
CREATE TABLE cite_count (cited INT, count INT);
INSERT OVERWRITE TABLE cite_count
SELECT cited,COUNT(citing)
FROM cite
GROUP BY cited;
SELECT * FROM cite_count WHERE count > 10 LIMIT 10;
CREATE TABLE age (name STRING, birthday INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
CREATE TABLE age_out (birthday INT, birthday_count INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
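The notes create age_out but do not show how it is populated; presumably it is filled the same way as cite_count above. A sketch of that aggregation:

```sql
INSERT OVERWRITE TABLE age_out
SELECT birthday, COUNT(name)
FROM age
GROUP BY birthday;
```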
(4) View a table's structure
describe cite;
(5) Load data
hive> LOAD DATA LOCAL INPATH '/home/kevin/Documents/age.txt' OVERWRITE INTO TABLE age;
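The cite table also needs its data loaded before the INSERT ... SELECT into cite_count will return rows; the local file path below is hypothetical (the actual test data is in the attachment):

```sql
hive> LOAD DATA LOCAL INPATH '/home/kevin/Documents/cite.txt' OVERWRITE INTO TABLE cite;
```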