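A Summary of Basic HBase Command-Line Operations
HBase is a distributed, column-oriented, open-source database. It is modeled on the Google paper "Bigtable: A Distributed Storage System for Structured Data" by Fay Chang et al. Just as Bigtable builds on the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase began as a subproject of Apache Hadoop and is now a top-level Apache project. Unlike a typical relational database, HBase is well suited to storing unstructured data. Another difference is that HBase uses a column-oriented rather than a row-oriented model.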
[hdfs@hadoop1 root]$ hbase
Usage: hbase [<options>] <command> [<args>]
Options:
--config DIR Configuration direction to use. Default: ./conf
--hosts HOSTS Override the list in 'regionservers' file
Commands:
Some commands take arguments. Pass no args or -h for usage.
shell Run the HBase shell
hbck Run the hbase 'fsck' tool
hlog Write-ahead-log analyzer
hfile Store file analyzer
zkcli Run the ZooKeeper shell
upgrade Upgrade hbase
master Run an HBase HMaster node
regionserver Run an HBase HRegionServer node
zookeeper Run a Zookeeper server
rest Run an HBase REST server
thrift Run the HBase Thrift server
thrift2 Run the HBase Thrift2 server
clean Run the HBase clean up script
classpath Dump hbase CLASSPATH
mapredcp Dump CLASSPATH entries required by mapreduce
pe Run PerformanceEvaluation
ltt Run LoadTestTool
version Print the version
CLASSNAME Run the class named CLASSNAME
Enter the shell
[root@hadoop1 ~]# hbase shell
Exit the shell
hbase(main):001:0> exit
List all tables
hbase(main):001:0> list
TABLE
0 row(s) in 1.9500 seconds
=> []
Create a table
create 'table_name','column_family_1',...,'column_family_n'
Example: create a table named scores with two column families, grad and courese.
hbase(main):004:0> create 'scores','grad','courese'
0 row(s) in 1.5820 seconds
=> Hbase::Table - scores
hbase(main):005:0> list
TABLE
scores
1 row(s) in 0.0080 seconds
=> ["scores"]
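Add a record
put 'table_name','row_key','column_family:qualifier','value',timestamp
The timestamp is optional. If it is omitted, the current time is used as the cell's timestamp.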
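To make the role of the optional timestamp concrete, here is a minimal Python sketch of how cells are addressed: row key, then column (family:qualifier), then timestamp. It is a toy in-memory model, not the HBase client API; the ToyTable class and its methods are hypothetical names introduced only for illustration.

```python
import time


class ToyTable:
    """Toy in-memory stand-in for an HBase table (illustration only)."""

    def __init__(self, *families):
        self.families = set(families)
        # row key -> column ("family:qualifier") -> {timestamp: value}
        self.rows = {}

    def put(self, row, column, value, timestamp=None):
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        if ":" not in column:
            column += ":"  # treat 'grad' as the empty qualifier 'grad:'
        if timestamp is None:
            # Assumption of this sketch: default to "now" in milliseconds.
            timestamp = int(time.time() * 1000)
        self.rows.setdefault(row, {}).setdefault(column, {})[timestamp] = value

    def get(self, row, column=None):
        """Return {column: (timestamp, value)} with the newest version of each cell."""
        result = {}
        for name, versions in sorted(self.rows.get(row, {}).items()):
            if column is None:
                wanted = True
            elif ":" in column:
                wanted = name == column  # exact column
            else:
                wanted = name.split(":", 1)[0] == column  # whole family
            if wanted:
                newest = max(versions)
                result[name] = (newest, versions[newest])
        return result


scores = ToyTable("grad", "courese")
scores.put("Tom", "courese:math", "100", timestamp=1)
scores.put("Tom", "courese:math", "95", timestamp=2)  # newer version of the same cell
scores.put("Tom", "grad:", "5")  # timestamp omitted
print(scores.get("Tom", "courese:math"))  # {'courese:math': (2, '95')}
```

A second put to the same row and column does not overwrite the earlier value in this model; it adds a version with a different timestamp, and get returns the newest one.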
hbase(main):009:0> put 'scores','Tom','grad:','5'
hbase(main):011:0> put 'scores','Tom','courese:math','100'
hbase(main):012:0> put 'scores','Tom','courese:art','100'
hbase(main):013:0> put 'scores','Mark','grad','6'
hbase(main):014:0> put 'scores','Mark','courese:english','120'
hbase(main):015:0> put 'scores','Mark','courese:chinese','108'
Look up a record
hbase(main):020:0> get 'scores','Mark'
COLUMN CELL
courese:chinese timestamp=1435491529683, value=108
courese:english timestamp=1435491508206, value=120
grad: timestamp=1435491484521, value=6
3 row(s) in 0.0520 seconds
hbase(main):021:0> get 'scores','Mark','grad'
COLUMN CELL
grad: timestamp=1435491484521, value=6
1 row(s) in 0.0390 seconds
Count rows
hbase> count 'ns1:t1'
hbase> count 't1'
hbase> count 't1', INTERVAL => 100000
hbase> count 't1', CACHE => 1000
hbase> count 't1', INTERVAL => 10, CACHE => 1000
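Counting can take a long time on large tables. The shell's count command scans the table itself; for very large tables, a MapReduce job such as the RowCounter job shipped with HBase is the usual alternative. While counting, the current count is printed every 1000 rows by default (INTERVAL), and the scan caches 10 rows per fetch by default (CACHE).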
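The effect of the two options can be sketched with a small Python model: CACHE controls how many rows each fetch brings back, and INTERVAL controls how often progress is reported. This is only an illustration under those assumptions; count_rows is a hypothetical function, not part of HBase.

```python
import itertools


def count_rows(row_keys, interval=1000, cache=10):
    """Count rows by 'scanning' in batches of `cache` rows.

    Returns (row count, number of fetches, progress reports), where a
    progress report (count, row_key) is recorded every `interval` rows.
    """
    count = 0
    fetches = 0
    progress = []
    keys = iter(row_keys)
    while True:
        batch = list(itertools.islice(keys, cache))  # one simulated fetch
        if not batch:
            break
        fetches += 1
        for key in batch:
            count += 1
            if count % interval == 0:
                progress.append((count, key))
    return count, fetches, progress


rows = [f"row{i:03d}" for i in range(25)]
print(count_rows(rows, interval=10, cache=10))
# (25, 3, [(10, 'row009'), (20, 'row019')])
```

In this model a larger cache value means fewer fetches for the same number of rows, which is why raising CACHE can speed up a count at the cost of more memory per fetch.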
hbase(main):038:0> count 'scores'
2 row(s) in 0.0290 seconds
=> 2
Modify the table schema
Add a column family
hbase(main):048:0> alter 'scores',NAME=>'info'
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 2.4330 seconds
Delete a column family
hbase(main):053:0> alter 'scores',NAME=>'info',METHOD=>'delete'
Updating all regions with the new schema...
0/1 regions updated.
1/1 regions updated.
Done.
0 row(s) in 2.4670 seconds
hbase(main):055:0> alter 'scores','delete'=>'courese'
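If the column family cannot be deleted this way, run disable 'scores' first, make the change, and then run enable 'scores'.
Delete a table
First disable the table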
hbase(main):057:0> disable 'scores'
Then drop the table
hbase(main):057:0> drop 'scores'
Delete a specific cell
delete 'scores','Mark','courese:english'
Delete an entire row
deleteall 'table_name','row_key'
Use with caution, for example:
deleteall 'scores','Mark'
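The difference in scope between delete and deleteall can be sketched with plain Python dictionaries. This is a toy model only: the function names mirror the shell commands but are hypothetical, and real HBase records delete markers rather than removing data immediately.

```python
def delete(table, row, column):
    """Remove a single column from a row; other columns in the row remain."""
    cells = table.get(row, {})
    cells.pop(column, None)
    if not cells:
        table.pop(row, None)  # a row with no cells left no longer exists


def deleteall(table, row):
    """Remove every column of the row at once."""
    table.pop(row, None)


# row key -> {column: value}, using the sample data from this article
scores = {
    "Tom": {"grad:": "5", "courese:math": "100", "courese:art": "100"},
    "Mark": {"grad:": "6", "courese:english": "120", "courese:chinese": "108"},
}

delete(scores, "Mark", "courese:english")
print(sorted(scores["Mark"]))  # ['courese:chinese', 'grad:']

deleteall(scores, "Mark")
print(sorted(scores))  # ['Tom']
```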
Truncate a table: all data is removed, but the table schema is kept
truncate 'scores'