Hive分享教程2-Hive语法

Hive的语法与SQL相似,细看例子

creatingtables

   
CREATE TABLE tmp_table #表名
    {
     title string , #字段名 字段类型
     minimun_bid double,
     quantity bigint,
     have_invoice bigint
    }COMMENT 'XXX' #表注解
    PARTITIONED BY(pt STRINT) #分区表字段(文件大则采用分区)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\001' #字段分割
    STORED AS SEQUENCEFILE; #采用那种存储方式存储数据  SEQUENCEFILE是hadoop自带的文件压缩格式

一些常用的命令

SHOWTABLES;#查看所有的表

SHOWTALBES'*TMP*'#支持模糊查询

SHOWPARTITIONSTABLE;#查看表有哪些分区

DESCRIBETABLE;#查看表结构

例子:

 
CREATE TABLE page_view(viewTime INT, userid BIGINT,
                    page_url STRING, referrer_url STRING,
                    ip STRING COMMENT 'IP Address of the User')
    COMMENT 'This is the page view table'
    ROW FORMT DELIMITED FIELDS TERMINATED BY '44' LINES TERMINATED BY '12'
    STORED AS TEXTFILE
    LOCATION '/user/data/staging/page_view';
    
    hadoop dfs -put /tmp/pv_2008-06-08.txt /user/data/staging/page_view

    FROM page_view_stg pvs
    INSERT OVERWRITE TABLE page_view PARTITION(dt='2008-06-08', country='US')
    SELECT pvs.viewTime, pvs.userid, pvs.page_url, pvs.referrer_url, null, null, pvs.ip
    WHERE pvs.country = 'US';

另外,HDFS不支持UPDATE!

用户可以用ALTERTABLEADDPARTITION来向一个表中增加分区

当分区名是字符串串时加引号

 
ALTER TABLE page_view ADD PARTITION(dt='2008-08-08',country='us') location 'path/to/us/part080808'
  PARTITION(dt='2008-08-09',country='us') location 'path/to/us/part080809'

loadingfilesintotable

当数据被加载至表中时,不会对数据进行任何转换。Load操作知识将数据复制/移动至Hive表对应的位置。

Syntax:

LOADDATA[LOCAL]INPATH'filepath'[OVERWRITE]INTOTABLEtablename[PARTITION(partcol1=val1,partcol2=val2...)]

即,普通的insert只不过数据来源是通过inpath路径找到,insert之前保证表已经建立完,并且格式一致(换行、分隔符等)

相关推荐