HBase(1) Introduction and Installation
1. HBase Introduction
HBase is the Hadoop database, built on top of Hadoop HDFS, Hadoop MapReduce, and Hadoop ZooKeeper.
Fundamentally distributed: partitioning (sharding) and replication
Column oriented
Sequential writes in memory, flushed to disk
Merged reads
Periodic data compaction
Related tools: Pig (data flow), Hive (SQL), Sqoop (RDBMS import support)
HMaster server: region assignment and management (alongside the Hadoop master: NameNode, JobTracker)
HRegionServer #1: DataNode, TaskTracker
2. Install and Setup Hadoop
Install protoc
>wget https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz
Unzip and cd to that directory
>./configure --prefix=/Users/carl/tool/protobuf-2.5.0
>make
>make install
>sudo ln -s /Users/carl/tool/protobuf-2.5.0 /opt/protobuf-2.5.0
>sudo ln -s /opt/protobuf-2.5.0 /opt/protobuf
Add this line to my environment
export PATH=/opt/protobuf/bin:$PATH
Check the Installation Environment
>protoc --version
libprotoc 2.5.0
Compile Hadoop
>svn checkout http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0 hadoop-common-2.4.0
Read the document here for building
http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0/BUILDING.txt
>cd hadoop-common-2.4.0/
>mvn clean install -DskipTests
>cd hadoop-mapreduce-project
>mvn clean install assembly:assembly -Pnative
My machine may just be too slow; I got a lot of timeout errors, so I reran it like this
>mvn clean -DskipTests install assembly:assembly -Pnative
Then I needed to drop the native profile
>mvn clean -DskipTests install assembly:assembly
Still not working, so I read the INSTALL document
http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0/hadoop-mapreduce-project/INSTALL
>cd ..
>mvn clean package -Pdist -Dtar -DskipTests
The latest documentation is here
/Users/carl/data/installation/hadoop-2.4.0/share/doc/hadoop/hadoop-project-dist/hadoop-common/SingleCluster.html
Follow these blog posts and set up Hadoop on 4 machines
http://sillycat.iteye.com/blog/2084169
http://sillycat.iteye.com/blog/2090186
>sbin/start-dfs.sh
>sbin/start-yarn.sh
>sbin/mr-jobhistory-daemon.sh start historyserver
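To confirm the freshly built Hadoop is serving HDFS, here is a minimal Java sketch that lists the root directory through the FileSystem API, equivalent to hadoop fs -ls /. It is only an illustration: the class name HdfsCheck is made up, it assumes the 2.4.0 client jars are on the classpath, and fs.defaultFS is taken from the hdfs://ubuntu-master:9000 address used for hbase.rootdir later.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// HDFS smoke test: list the root directory, like "hadoop fs -ls /".
// fs.defaultFS below is an assumption based on this cluster's setup.
public class HdfsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://ubuntu-master:9000");
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPermission() + "  " + status.getPath());
        }
        fs.close();
    }
}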
3. Setup Zookeeper
Follow this blog post and set up ZooKeeper on 3 machines
http://sillycat.iteye.com/blog/2015175
>zkServer.sh start conf/zoo-cluster.cfg
4. Install HBase - Standalone HBase
Download this version, since I am using Hadoop 2.4.x
>wget http://mirrors.gigenet.com/apache/hbase/hbase-0.98.4/hbase-0.98.4-hadoop2-bin.tar.gz
Unzip the file and move it to the work directory.
>sudo ln -s /home/carl/tool/hbase-0.98.4 /opt/hbase-0.98.4
>sudo ln -s /opt/hbase-0.98.4 /opt/hbase
Check and modify the configuration file
>cat conf/hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///opt/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/carl/etc/hbase</value>
</property>
</configuration>
Start the Service
>bin/start-hbase.sh
>jps
2036 NameNode
4084 Jps
3340 HMaster
2403 ResourceManager
2263 SecondaryNameNode
2686 JobHistoryServer
Enter the Client Shell
>bin/hbase shell
Create the table
>create 'test', 'cf'
Verify the table exists
>list 'test'
Insert some data
>put 'test', 'row1', 'cf:a', 'value1'
>put 'test', 'row2', 'cf:a', 'value2'
>put 'test', 'row3', 'cf:a', 'value3'
Here row1 is the row key, cf:a is the column (family cf, qualifier a), and value1 is the value. A Java client equivalent is sketched at the end of this section.
Get all the data
>scan 'test'
ROW          COLUMN+CELL
 row1        column=cf:a, timestamp=1407169545627, value=value1
 row2        column=cf:a, timestamp=1407169557668, value=value2
 row3        column=cf:a, timestamp=1407169563458, value=value3
3 row(s) in 0.0630 seconds
Get a single row
>get 'test', 'row1'
COLUMN       CELL
 cf:a        timestamp=1407169545627, value=value1
Some other commands
>disable 'test'
>enable 'test'
>drop 'test'
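The same operations can also be done from the Java client API. Below is a minimal sketch, not the exact code I ran: it assumes the hbase-0.98.x-hadoop2 client jars are on the classpath and the 'test' table with family 'cf' created above; the class name TestClient and the row4/value4 values are just illustrative.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class TestClient {
    public static void main(String[] args) throws Exception {
        // Picks up hbase-site.xml from the classpath for the connection settings.
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "test");

        // Equivalent of: put 'test', 'row4', 'cf:a', 'value4'
        Put put = new Put(Bytes.toBytes("row4"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("a"), Bytes.toBytes("value4"));
        table.put(put);

        // Equivalent of: get 'test', 'row1'
        Result result = table.get(new Get(Bytes.toBytes("row1")));
        System.out.println("row1 cf:a = "
                + Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("a"))));

        // Equivalent of: scan 'test'
        ResultScanner scanner = table.getScanner(new Scan());
        for (Result row : scanner) {
            System.out.println(row);
        }
        scanner.close();
        table.close();
    }
}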
5. Pseudo-Distributed Local Install
Change the configuration as follows
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://ubuntu-master:9000/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/carl/etc/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master.wait.on.regionservers.mintostart</name>
<value>1</value>
</property>
</configuration>
List the HDFS directory
>hadoop fs -ls /
Found 4 items
drwxr-xr-x   - carl supergroup          0 2014-07-09 13:22 /data
drwxr-xr-x   - carl supergroup          0 2014-08-04 11:47 /hbase
drwxr-xr-x   - carl supergroup          0 2014-07-10 13:09 /output
drwxrwx---   - carl supergroup          0 2014-08-04 11:21 /tmp
>hadoop fs -ls /hbase
Found 6 items
drwxr-xr-x   - carl supergroup          0 2014-08-04 11:48 /hbase/.tmp
drwxr-xr-x   - carl supergroup          0 2014-08-04 11:47 /hbase/WALs
drwxr-xr-x   - carl supergroup          0 2014-08-04 11:48 /hbase/data
-rw-r--r--   3 carl supergroup         42 2014-08-04 11:47 /hbase/hbase.id
-rw-r--r--   3 carl supergroup          7 2014-08-04 11:47 /hbase/hbase.version
drwxr-xr-x   - carl supergroup          0 2014-08-04 11:47 /hbase/oldWALs
Start HMaster Backup Servers
The default ports for HMaster are 16010, 16020, and 16030.
>bin/local-master-backup.sh start 2 3 5
That will start 3 backup HMaster servers, using ports 16012/16022/16032, 16013/16023/16033, and 16015/16025/16035.
To stop a backup master, find its process id in /tmp/hbase-USER-X-master.pid and kill it.
For example
>cat /tmp/hbase-carl-2-master.pid
6442
>cat /tmp/hbase-carl-5-master.pid |xargs kill -9
Start and stop Additional RegionServers
The default ports are 16020 and 16030, but the additional RegionServers use base ports 16200 and 16300.
>bin/local-regionservers.sh start 2 3 5
>bin/local-regionservers.sh stop 5
6. Fully Distributed
I have 4 machines; I will list them as follows:
ubuntu-master hmaster
ubuntu-client1 hmaster-backup
ubuntu-client2 regionserver
ubuntu-client3 regionserver
Set up the Configuration
>cat conf/regionservers
ubuntu-client2
ubuntu-client3
>cat conf/backup-masters
ubuntu-client1
Since I already have ZooKeeper running, disable HBase's managed ZooKeeper
>vi conf/hbase-env.sh
export HBASE_MANAGES_ZK=false
The main configuration file
>cat conf/hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://ubuntu-master:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master.wait.on.regionservers.mintostart</name>
<value>1</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>ubuntu-client1,ubuntu-client2,ubuntu-client3</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/carl/etc/zookeeper</value>
</property>
</configuration>
The last step is just to start the servers
>bin/start-hbase.sh
Visit the web UI
http://ubuntu-master:60010/master-status
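Besides the web UI, the cluster can be checked from a client machine with a small Java sketch like the one below (the class name ClusterCheck is just illustrative); it assumes the hbase-0.98.x-hadoop2 client jars are on the classpath and uses the ZooKeeper quorum from hbase-site.xml above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.ClusterStatus;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class ClusterCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Same quorum as in conf/hbase-site.xml above.
        conf.set("hbase.zookeeper.quorum", "ubuntu-client1,ubuntu-client2,ubuntu-client3");
        HBaseAdmin admin = new HBaseAdmin(conf);
        ClusterStatus status = admin.getClusterStatus();
        System.out.println("Active master: " + status.getMaster());
        System.out.println("Backup masters: " + status.getBackupMasters());
        for (ServerName server : status.getServers()) {
            System.out.println("Region server: " + server);
        }
        admin.close();
    }
}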
References:
https://hbase.apache.org/
http://www.alidata.org/archives/1509
http://blog.csdn.net/heyutao007/article/details/6920882
http://blog.sina.com.cn/s/blog_5c5d5cdf0101dvgq.html
hadoop hbase zookeeper
http://www.cnblogs.com/ventlam/archive/2011/01/22/HBaseCluster.html
http://www.searchtb.com/2011/01/understanding-hbase.html
http://www.searchdatabase.com.cn/showcontent_31652.htm
hadoop
http://sillycat.iteye.com/blog/1556106
http://sillycat.iteye.com/blog/1556107
tips about hadoop
http://blog.chinaunix.net/uid-20682147-id-4229024.html
http://my.oschina.net/skyim/blog/228486
http://blog.huangchaosuper.cn/work/tech/2014/04/24/hadoop-install.html
http://blog.sina.com.cn/s/blog_45d2413b0102e2zx.html
http://www.it165.net/os/html/201405/8311.html