Installing Hive 0.13

The Hive installed here is the stable release among the latest Hive versions and runs on Hadoop 2.2.0. I have written before about installing Hive 0.8 on Hadoop 1.x; this time the version is Hive 0.13, and the binary package can be downloaded directly from the Hive website, so there is no need to build from source. Hive depends on an underlying Hadoop environment, so before installing Hive make sure your Hadoop cluster is working properly.
Download link for the Hive 0.13 stable release:
http://apache.fayea.com/apache-mirror/hive/stable/
Setting up a Hadoop 2.2.0 distributed cluster:
http://qindongliang1922.iteye.com/blog/2078423
Installing MySQL:
http://qindongliang1922.iteye.com/blog/1987199
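
Before starting, it is worth confirming that the Hadoop cluster is actually healthy. A quick sanity check (assuming the Hadoop commands are on your PATH) might look like:

# On the master node you should see NameNode and ResourceManager among the Java daemons
jps
# Verify that HDFS is answering by listing the root directory
hadoop fs -ls /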


After downloading, let's walk through the installation steps in detail:

1. Set up the Hadoop 2.2.0 cluster (the underlying environment Hive depends on)
2. Download the Hive 0.13 binary package and unpack it
3. Set the HIVE_HOME environment variable
4. Configure hive-env.sh (points to the Hadoop directory and Hive's conf directory)
5. Configure hive-site.xml (Hive properties and MySQL integration for metadata storage)
6. Start the bin/hive service (test that Hive starts)
7. Create a database and a table to test that Hive works correctly
8. Exit the Hive client (run the exit command)
9. Copy the MySQL JDBC jar into Hive's lib directory (metadata is stored in MySQL)



First, run the following four commands in Hive's conf directory to turn the bundled template files into the files Hive actually uses:
cp hive-default.xml.template hive-site.xml
cp hive-env.sh.template hive-env.sh
cp hive-exec-log4j.properties.template hive-exec-log4j.properties
cp hive-log4j.properties.template hive-log4j.properties
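
The hive-env.sh step from the list above tells Hive where Hadoop and Hive's own conf directory live. A minimal sketch, assuming the paths used on this cluster (/home/search/hadoop and /home/search/hive; adjust to your own layout), is to add the following to hive-env.sh:

# Hadoop installation that Hive should run against
export HADOOP_HOME=/home/search/hadoop
# Hive's configuration directory
export HIVE_CONF_DIR=/home/search/hive/conf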


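The hive-site.xml step points the metastore at MySQL. The property names below are the standard JDBC metastore settings; the database name, host, user, and password are assumptions for illustration, so substitute your own. Edit the matching <property> entries in hive-site.xml:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
</property>

Hive also needs the MySQL JDBC driver on its classpath, which is the "copy the MySQL JDBC jar" step from the list; the exact jar name depends on the connector version you downloaded:

cp mysql-connector-java-*.jar /home/search/hive/lib/
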
Setting the Hive environment variables:
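
A hedged sketch of the HIVE_HOME step, assuming Hive is unpacked at /home/search/hive, is to append these lines to the search user's ~/.bashrc and then run source ~/.bashrc:

# Location of the Hive installation
export HIVE_HOME=/home/search/hive
# Put the hive command on the PATH
export PATH=$PATH:$HIVE_HOME/bin

With the configuration in place, start the Hive CLI with bin/hive. The deprecation notices in the log below only report renamed Hadoop configuration keys and can be ignored: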

[search@h1 hive]$ bin/hive  
14/07/30 04:18:08 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces  
14/07/30 04:18:08 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize  
14/07/30 04:18:08 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative  
14/07/30 04:18:08 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node  
14/07/30 04:18:08 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive  
14/07/30 04:18:08 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack  
14/07/30 04:18:08 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize  
14/07/30 04:18:08 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed  
14/07/30 04:18:09 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.  
  
Logging initialized using configuration in file:/home/search/hive/conf/hive-log4j.properties  
hive>   


Run the create-table statement and load some data:

Create the table:
create table info (name string, count int) row format delimited fields terminated by '#' stored as textfile;
Load the data:
LOAD DATA LOCAL INPATH '/home/search/abc1.txt' OVERWRITE INTO TABLE info;
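
For reference, /home/search/abc1.txt is a plain text file with '#' as the field separator; judging from the query results further down, its rows look something like this (sample lines for illustration, not the actual file):

中的国#999993
英的国#999999
美的国#999996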
Run a query and sort the output in descending order. Note that in HiveQL the LIMIT clause must come after ORDER BY, which is why the first query below fails with a ParseException:

 

hive> select * from info limit 5 order by count desc;  
FAILED: ParseException line 1:27 missing EOF at 'order' near '5'  
hive> select * from info   order by count desc  limit 5 ;  
Total jobs = 1  
Launching Job 1 out of 1  
Number of reduce tasks determined at compile time: 1  
In order to change the average load for a reducer (in bytes):  
  set hive.exec.reducers.bytes.per.reducer=<number>  
In order to limit the maximum number of reducers:  
  set hive.exec.reducers.max=<number>  
In order to set a constant number of reducers:  
  set mapreduce.job.reduces=<number>  
Starting Job = job_1406660797211_0003, Tracking URL = http://h1:8088/proxy/application_1406660797211_0003/  
Kill Command = /home/search/hadoop/bin/hadoop job  -kill job_1406660797211_0003  
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1  
2014-07-30 04:26:13,538 Stage-1 map = 0%,  reduce = 0%  
2014-07-30 04:26:26,398 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 5.41 sec  
2014-07-30 04:26:27,461 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 6.64 sec  
2014-07-30 04:26:39,177 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 10.02 sec  
MapReduce Total cumulative CPU time: 10 seconds 20 msec  
Ended Job = job_1406660797211_0003  
MapReduce Jobs Launched:   
Job 0: Map: 1  Reduce: 1   Cumulative CPU: 10.02 sec   HDFS Read: 143906707 HDFS Write: 85 SUCCESS  
Total MapReduce CPU Time Spent: 10 seconds 20 msec  
OK  
英的国  999999  
中的国  999997  
美的国  999996  
中的国  999993  
英的国  999992  
Time taken: 37.892 seconds, Fetched: 5 row(s)  
hive>   




Usage of some interactive commands in the hive shell (a short example session follows the list):

quit, exit: exit the interactive shell
reset: reset the configuration to the default values
set <key>=<value>: set the value of a particular configuration variable (note: if the variable name is misspelled, no error is reported)
set: print the configuration variables that have been overridden by the user
set -v: print all Hadoop and Hive configuration variables
add FILE[S] *, add JAR[S] *, add ARCHIVE[S] *: add one or more files, jars, or archives to the distributed cache
list FILE[S], list JAR[S], list ARCHIVE[S]: list the resources already added to the distributed cache
list FILE[S] *, list JAR[S] *, list ARCHIVE[S] *: check whether the given resources have been added to the distributed cache
delete FILE[S] *, delete JAR[S] *, delete ARCHIVE[S] *: remove the given resources from the distributed cache
! <command>: execute a shell command from the Hive shell
dfs <dfs command>: execute a dfs command from the Hive shell
<query string>: execute a Hive query and print the results to standard output
source FILE <filepath>: execute a Hive script file inside the CLI
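
A brief example session using a few of these commands (the script path is hypothetical and output is omitted): override a configuration variable, read it back, run an OS command, run an HDFS command against the default warehouse directory, and execute a script file:

hive> set hive.cli.print.header=true;
hive> set hive.cli.print.header;
hive> ! date;
hive> dfs -ls /user/hive/warehouse;
hive> source /home/search/myscript.hql;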






Start in debug mode: hive -hiveconf hive.root.logger=DEBUG,console
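
Besides the interactive shell, the hive command can also run queries non-interactively, which is handy for scripting; a small sketch (the script file name is hypothetical):

hive -e "select * from info order by count desc limit 5"
hive -f /home/search/myscript.hql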

At this point, Hive has been installed successfully and runs normally.
