Importing a CSV file into a Hive table
1 CSV format (the default file format when exporting MySQL table data with SQLyog)
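A CSV file stores comma-separated values (Comma-Separated Values, CSV). The format is sometimes also called character-separated values, because the separator does not have to be a comma; the CSV data in this article is not simply comma-separated. The file stores tabular data (numbers and text) as plain text. A CSV file consists of any number of records separated by some kind of line break; each record consists of fields, and the fields are separated by another character or string, most commonly a comma or a tab. Usually all records have exactly the same sequence of fields.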
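To make the point about separators concrete, here is a minimal Python sketch using the standard csv module. The sample text and the helper name parse_delimited are made up for illustration; the same records parse identically whether the separator is a comma or a tab, as long as the reader is told which one to expect.

```python
import csv
import io

def parse_delimited(text, delimiter=","):
    """Split delimited text into records, each a list of string fields."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

# Hypothetical sample data: the same two records, once comma-separated
# and once tab-separated.
comma_text = "12,23G,20141201\n12,23G,20141202\n"
tab_text = "12\t23G\t20141201\n12\t23G\t20141202\n"

comma_rows = parse_delimited(comma_text)
tab_rows = parse_delimited(tab_text, delimiter="\t")
```

Both calls yield two records of three string fields each; only the delimiter argument differs.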
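1.1 When exporting, you can specify the field separator (the default is \t) and the character that encloses each field (the enclosing character is optional), as shown in the figure below: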
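2 Hive supports importing .csv data. The steps are as follows:
a)
After exporting, check what the exported file looks like. Open it in a plain-text editor so that the separators are visible; if you open it in Excel, you cannot tell whether the fields are separated by the character you specified or by the default \t.
Here is my exported file opened as plain text. The fields are not enclosed in '':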
12,1.71301E+15,23G,15589836997,20141201,2,532,13606343566,1,532,0,0,0,1,91,2
12,1.71207E+15,23G,18661866329,20141201,1,25,18952082990,3,25,0,2,0,1,31,1
12,1.71307E+15,23G,13026513953,20141201,1,530,15269099707,1,530,1,1,0,2,667,12
12,3.20812E+15,23G,13061276785,20141201,1,532,13954223917,1,532,0,0,0,1,18,1
12,3.21009E+15,23G,15653208256,20141201,1,532,15864736958,1,532,0,0,0,1,15,1
12,1.71312E+15,23G,13256887098,20141201,1,532,15264276875,1,532,0,0,0,1,45,1
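If no suitable text editor is at hand, the same check can be scripted. The following is a rough Python sketch, not part of the original workflow: the function name count_separators and the list of candidate separators are assumptions, and the two sample rows are copied from the exported file above into a temporary file so that the example is self-contained.

```python
import os
import tempfile
from collections import Counter

def count_separators(path, candidates=(",", "\t", ";", "|"), sample_lines=5):
    """Count how often each candidate separator occurs in the first few lines."""
    counts = Counter({c: 0 for c in candidates})
    with open(path, encoding="utf-8", newline="") as f:
        for line_no, line in enumerate(f):
            if line_no >= sample_lines:
                break
            print(repr(line))  # repr() shows a tab as \t instead of blank space
            for c in candidates:
                counts[c] += line.count(c)
    return counts

# Two rows copied from the exported file shown above, written to a temp file.
sample = (
    "12,1.71301E+15,23G,15589836997,20141201,2,532,13606343566,1,532,0,0,0,1,91,2\n"
    "12,1.71207E+15,23G,18661866329,20141201,1,25,18952082990,3,25,0,2,0,1,31,1\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, encoding="utf-8") as tmp:
    tmp.write(sample)
    tmp_path = tmp.name

counts = count_separators(tmp_path)
likely_separator = max(counts, key=counts.get)  # the most frequent candidate
os.remove(tmp_path)
```

For these rows the comma is the only candidate that occurs at all, so "separatorChar" should be set to "," when creating the table.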
b) Create the table in Hive:
create table cvs (
  month_id string,
  user_no string,
  net_type string,
  device_number string,
  start_date string,
  org_trm_id string,
  other_home_code string,
  oppose_number string,
  oppose_number_type string,
  other_roam_code string,
  roam_type string,
  long_type string,
  call_hour_seg string,
  cdr_num string,
  call_time string,
  fee_number string
)
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with serdeproperties (
  "separatorChar" = ",",
  "escapeChar" = "\\"
)
STORED AS TEXTFILE;
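The most complete way to create a table for this format in Hive spells out all three SerDe properties (separator, enclosing character, and escape character), as follows:
CREATE TABLE csv_table(a string, b string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = "\t",
  "quoteChar" = "'",
  "escapeChar" = "\\"
)
STORED AS TEXTFILE;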
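To get a feel for what the three properties do, the sketch below mimics them with Python's standard csv module. This is only an analogy: Python's parser is not OpenCSV, and the two may differ in edge cases. The sample text is hypothetical.

```python
import csv
import io

# Hypothetical sample: tab-separated fields, single-quote enclosure and
# backslash escape, mirroring the three properties in the DDL above.
raw = (
    "1\t'plain'\n"
    "2\t'has\ta tab'\n"
    "3\t'it\\'s quoted'\n"
)

reader = csv.reader(
    io.StringIO(raw),
    delimiter="\t",    # plays the role of separatorChar
    quotechar="'",     # plays the role of quoteChar
    escapechar="\\",   # plays the role of escapeChar
)
rows = list(reader)
```

The second record keeps its embedded tab inside one field because the field is enclosed in quotes, and the third keeps its apostrophe because it is preceded by the escape character.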
c) Upload the exported file to the Linux machine, then load it from the local Linux filesystem into the Hive table:
load data local inpath 'yuyin.csv' into table cvs;
d) Query:
hive (default)> select * from cvs limit 10;
OK
cvs.month_id cvs.user_no cvs.net_type cvs.device_number cvs.start_date cvs.org_trm_id cvs.other_home_code cvs.oppose_number cvs.oppose_number_type cvs.other_roam_code cvs.roam_type cvs.long_type cvs.call_hour_seg cvs.cdr_num cvs.call_time cvs.fee_number
12 1.71E+15 23G 15589836997 20141201 2 532 13606343566 1 532 0 0 0 1 91 2
12 1.71E+15 23G 18661866329 20141201 1 25 18952082990 3 25 0 2 0 1 31 1
12 1.71E+15 23G 13026513953 20141201 1 530 15269099707 1 530 1 1 0 2 667 12
12 3.21E+15 23G 13061276785 20141201 1 532 13954223917 1 532 0 0 0 1 18 1
12 3.21E+15 23G 15653208256 20141201 1 532 15864736958 1 532 0 0 0 1 15 1
12 1.71E+15 23G 13256887098 20141201 1 532 15264276875 1 532 0 0 0 1 45 1
12 3.21E+15 23G 15692326467 20141201 2 532 15969838768 1 532 0 0 0 1 7 1
12 3.71E+15 23G 18561738929 20141201 1 535 17862806081 1 535 1 0 0 1 12 1
12 1.71E+15 23G 13127055909 20141201 1 530 13573075730 1 530 0 1 0 1 48 1
12 2.21E+15 23G 15689487889 20141201 1 532 15063978623 1 532 0 0 0 1 39 1
Time taken: 2.042 seconds, Fetched: 10 row(s)
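One thing worth checking in this output: user_no comes back in scientific notation. Because the column is declared as string, Hive returns whatever text is in the file, so the notation must already have been in the data before the load. Where it was introduced is not stated in this article; a spreadsheet tool touching the file is one possible cause, but that is an assumption. The Python sketch below (the helper name expand_scientific is made up) shows that such a value expands to a number padded with zeros, so if user_no was originally a long identifier, its trailing digits cannot be recovered from the loaded data.

```python
from decimal import Decimal

def expand_scientific(text):
    """Expand a scientific-notation string such as '1.71301E+15' to an int."""
    return int(Decimal(text))

from_file = expand_scientific("1.71301E+15")  # as it appears in the exported file
from_query = expand_scientific("1.71E+15")    # as it appears in the query output

print(from_file)
print(from_query)
```

If exact identifiers matter, it is safer to verify the exported file as plain text before loading it.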