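Installing Hadoop with Cloudera Manager
Hadoop is made up of many different services (HDFS, Hive, HBase, Spark, and so on), and these services depend on one another. If you work directly from the stock Apache packages, you have to download and configure each one separately, which is tedious. This gave rise to companies that ship their own customized Hadoop distributions, such as Cloudera, Hortonworks, and MapR. Cloudera's distribution is called CDH. My company's systems run on CDH, so I have installed CDH5 several times recently, picked up some hands-on experience, and read some of Cloudera's documentation.
Cloudera provides Cloudera Manager, a web-based Hadoop management system, to install, configure, and monitor Hadoop services.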
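Installing Cloudera Manager
The official guide describes several installation methods. The one I cover here is only one of them. It is not the simplest, but it gives you a better understanding of Cloudera Manager.
The simple diagram below gives a rough idea of how Cloudera Manager manages Hadoop services.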
Configuring the YUM repository for Cloudera Manager
Add a file named CM.repo under /etc/yum.repos.d/ with the following content.
[CM]
name=CM
baseurl=http://archive-primary.cloudera.com/cm5/redhat/6/x86_64/cm/5.1.3/
gpgcheck=0
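Run the commands below to check whether the Yum repository is configured correctly. If yum list shows the cloudera-manager-*** packages, it worked. If you are installing in a lab environment that cannot reach the internet, you can run the reposync command on a machine that does have internet access to clone the remote yum repository locally, then start an HTTP service there to act as the yum server for the lab machines. Alternatively, configure a proxy in the lab machines' CM.repo file to reach the internet; the exact steps are easy to find with a search.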
[root@bogon yum.repos.d]# yum clean all
Loaded plugins: fastestmirror, refresh-packagekit, security
Cleaning repos: CM base extras updates
Cleaning up Everything
Cleaning up list of fastest mirrors
[root@bogon yum.repos.d]# yum list | grep cloudera
cloudera-manager-daemons.x86_64        5.1.3-1.cm513.p0.155.el6    @CM
cloudera-manager-server.x86_64         5.1.3-1.cm513.p0.155.el6    @CM
cloudera-manager-agent.x86_64          5.1.3-1.cm513.p0.155.el6    CM
cloudera-manager-server-db-2.x86_64    5.1.3-1.cm513.p0.155.el6    CM
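For the offline lab case mentioned above, the reposync approach could look roughly like the sketch below. It assumes yum-utils and createrepo are installed on the internet-connected host and that the CM repository is already defined there. The function names, the destination directory, and the mirror URL are hypothetical and only serve as illustration.

```shell
#!/usr/bin/env bash
# Sketch only: function names, paths, and URLs are illustrative assumptions.

# Run on the internet-connected host. Mirrors the repo with id "CM"
# into $1/CM and generates the repodata that yum clients need.
mirror_cm_repo() {
    local dest=$1                 # e.g. /var/www/html/mirror (hypothetical)
    reposync -r CM -p "$dest"     # downloads the packages into $dest/CM
    createrepo "$dest/CM"         # builds $dest/CM/repodata
}

# Print a CM.repo definition for the lab machines.
# $1 is the base URL under which the mirror is served over HTTP.
local_repo_file() {
    printf '[CM]\nname=CM\nbaseurl=%s\ngpgcheck=0\n' "$1"
}
```

On a lab machine you might then run `local_repo_file http://mirror.lab:8000/CM/ > /etc/yum.repos.d/CM.repo`, where mirror.lab:8000 stands for whatever host and port serve the mirrored directory. For the proxy route, yum accepts a `proxy=http://host:port` line in the repository section.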
Install the Cloudera Manager packages. Installing this package automatically pulls in the cloudera-manager-daemons package it depends on.
yum install cloudera-manager-server
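Let's look at what these two packages actually install. The server package mostly just copies a few configuration files. The actual management web application is installed by the daemons package, with its files under /usr/share/cmf. Logs go to /var/log/cloudera-scm-server, and the runtime configuration files live in /var/run/cloudera-scm-server.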
[root@bogon ~]# rpm -ql cloudera-manager-server
/etc/cloudera-scm-server
/etc/cloudera-scm-server/db.properties
/etc/cloudera-scm-server/log4j.properties
/etc/default/cloudera-scm-server
/etc/rc.d/init.d/cloudera-scm-server
/etc/security/limits.d/cloudera-scm.conf
/opt/cloudera/csd
/opt/cloudera/parcel-repo
/usr/sbin/cmf-server
/var/lib/cloudera-scm-server
[root@bogon ~]# rpm -ql cloudera-manager-daemons
...
/usr/share/cmf/yarn-fixtures/details_1.json
/usr/share/cmf/yarn-fixtures/details_2.json
/usr/share/cmf/yarn-fixtures/details_3.json
/usr/share/cmf/yarn-fixtures/details_4.json
/usr/share/cmf/yarn-fixtures/details_5.json
/usr/share/cmf/yarn-fixtures/details_6.json
/usr/share/cmf/yarn-fixtures/details_7.json
/usr/share/cmf/yarn-fixtures/details_8.json
/usr/share/cmf/yarn-fixtures/details_9.json
/var/log/cloudera-scm-server
/var/run/cloudera-scm-server
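Install the database and its driver. Cloudera Manager itself stores its configuration in a database, and Hive, which is installed later, also needs a database for its metadata.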
yum install mysql-server
yum install mysql-connector-java.noarch
Next, set up the database: create a database named cmf, a matching user cmf, and a password for it.
mysql> create database cmf DEFAULT CHARACTER SET utf8;
Query OK, 1 row affected (0.00 sec)
mysql> grant all on cmf.* TO 'cmf'@'localhost' IDENTIFIED BY '123456';
Query OK, 0 rows affected (0.00 sec)
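If you reinstall often, the same two statements can be generated and piped into the mysql client instead of typed interactively. The sketch below is one way to do that. The function name cmf_sql is made up for this example, and it assumes the database name, user, and password contain no quotes or other characters that need SQL escaping. The `GRANT ... IDENTIFIED BY` form matches the MySQL version shipped with RHEL/CentOS 6 and is not accepted by MySQL 8.

```shell
#!/usr/bin/env bash
# Sketch only: cmf_sql is a hypothetical helper, not part of Cloudera Manager.

# Print the SQL that creates the database and grants access to a local user.
# Arguments: database name, user name, password (no quoting/escaping is done).
cmf_sql() {
    local db=$1 user=$2 pass=$3
    cat <<EOF
CREATE DATABASE IF NOT EXISTS ${db} DEFAULT CHARACTER SET utf8;
GRANT ALL ON ${db}.* TO '${user}'@'localhost' IDENTIFIED BY '${pass}';
EOF
}
```

Assuming the MySQL server is already running (on RHEL/CentOS 6, `service mysqld start`), a possible invocation is `cmf_sql cmf cmf 123456 | mysql -u root -p`.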
Then edit /etc/cloudera-scm-server/db.properties to tell Cloudera Manager which database to access, along with the user name, password, and so on.
# The database type
# Currently 'mysql', 'postgresql' and 'oracle' are valid databases.
com.cloudera.cmf.db.type=mysql

# The database host
# If a non standard port is needed, use 'hostname:port'
com.cloudera.cmf.db.host=localhost

# The database name
com.cloudera.cmf.db.name=cmf

# The database user
com.cloudera.cmf.db.user=cmf

# The database user's password
com.cloudera.cmf.db.password=123456
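The same edits can be scripted rather than made by hand. The helper below is a minimal sketch: set_prop is a name invented for this example, and it assumes that values do not contain `|`, `&`, or backslashes, which sed would treat specially. Dots in the key are used as regex wildcards, which is slightly loose but harmless for these property names.

```shell
#!/usr/bin/env bash
# Sketch only: set_prop is a hypothetical helper for key=value files.

# Set key=value in a properties file: replace the line if the key exists,
# otherwise append it. Arguments: file, key, value.
set_prop() {
    local file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        # Write through a temp file, then copy back so the original
        # file keeps its owner and permissions.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "${file}.tmp" &&
            cat "${file}.tmp" > "$file" &&
            rm -f "${file}.tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

For example, `set_prop /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.type mysql` would switch the database type, and the other four properties can be set the same way.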
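Now you can start the cloudera-scm-server service. Use netstat to check whether port 7180 is listening (on my machine it takes about half a minute from starting the service until the port opens). If it is, open localhost:7180 in a browser to reach the Cloudera Manager web interface (the default user name and password are admin/admin). If it is not, check /var/log/cloudera-scm-server/cloudera-scm-server.log for exceptions.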
service cloudera-scm-server restart
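Because the port takes a while to open, it can be handy to poll for it instead of rerunning netstat by hand. The function below is a rough sketch of that idea. The name wait_for_port is made up, and it relies on bash's built-in /dev/tcp redirection rather than netstat, so it needs bash built with network redirection support (the default on most distributions).

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_port is a hypothetical helper, not a Cloudera tool.

# Poll host:port once per second until a TCP connection succeeds or
# roughly `timeout` seconds have passed. Returns 0 once the port accepts
# a connection, 1 on timeout. Arguments: host, port, timeout in seconds.
wait_for_port() {
    local host=$1 port=$2 timeout=$3
    local elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # The subshell opens and immediately closes a connection.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

A possible use after restarting the service: `wait_for_port localhost 7180 120 || tail -n 50 /var/log/cloudera-scm-server/cloudera-scm-server.log`, which shows the end of the server log if the port has not opened within about two minutes.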
This is the home page you see the first time you log in