Summary of setting up HDFS, Hive, and HBase
Installing the JDK
Upload the JDK installation package to the Linux machine
Extract and rename
[ software]# tar -zxvf jdk-8u221-linux-x64.tar.gz -C /usr/local/
[ software]# cd /usr/local
[ local]# mv jdk1.8.0_221/ jdk
Configure environment variables
[ local]# vi /etc/profile
......(omitted)......
# java environment
JAVA_HOME=/usr/local/jdk
PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
export JAVA_HOME PATH
Reload the configuration file
[ local]# source /etc/profile
Verify the configuration
[ local]# java -version
[ local]# javac
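If the variables took effect, java -version prints something like the following (the build numbers are only illustrative), and javac prints its usage help:

java version "1.8.0_221"
Java(TM) SE Runtime Environment (build 1.8.0_221-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.221-b11, mixed mode)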
Setting up fully distributed HDFS
Upload and extract Hadoop
[ ~]# tar -zxvf hadoop-2.7.6.tar.gz -C /usr/local/
Rename
[ ~]# cd /usr/local
[ local]# mv hadoop-2.7.6/ hadoop
Configure environment variables
[ local]# vi /etc/profile
.........(omitted)..........
# hadoop environment
export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
Reload the configuration file
[ local]# source /etc/profile
Verify the installation
[ local]# hadoop version
Cluster layout
qianfeng01: namenode datanode resourcemanager nodemanager
qianfeng02: secondarynamenode datanode nodemanager
qianfeng03: datanode nodemanager
Configure core-site.xml
<!-- Name of the fully distributed file system: schema ip port -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://qianfeng01/</value>
</property>
<!-- Base path that the other paths of the distributed file system depend on. Do not use the default in a fully distributed setup, because the default temporary path is unsafe: Linux may delete its contents on reboot -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
</property>
Configure hdfs-site.xml
<!-- Storage path for the files managed by the namenode daemon -->
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${hadoop.tmp.dir}/dfs/name</value>
</property>
<!-- Storage path for the files managed by the datanode daemon -->
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${hadoop.tmp.dir}/dfs/data</value>
</property>
<!-- HDFS block replication factor -->
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
<!-- HDFS block size; the default is 128 MB -->
<property>
    <name>dfs.blocksize</name>
    <value>134217728</value>
</property>
<!-- IP and port of the secondarynamenode's HTTP service -->
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>qianfeng02:50090</value>
</property>
Configure mapred-site.xml
<!-- Name of the framework used to run MapReduce jobs -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<!-- IP and port of the MapReduce job history server -->
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>qianfeng01:10020</value>
</property>
<!-- IP and port of the job history server's web UI -->
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>qianfeng01:19888</value>
</property>
Configure yarn-site.xml
<!-- Configure YARN's auxiliary service: the MapReduce shuffle -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- Hostname of the machine running the resourcemanager -->
<property>
    <description>The hostname of the RM.</description>
    <name>yarn.resourcemanager.hostname</name>
    <value>qianfeng01</value>
</property>
Configure hadoop-env.sh
[ hadoop]# vi hadoop-env.sh
.........
# The java implementation to use.
export JAVA_HOME=/usr/local/jdk
Configure yarn-env.sh
[ hadoop]# vi yarn-env.sh
.........
# some Java parameters
export JAVA_HOME=/usr/local/jdk
Configure the slaves file (important)
The slaves file lists the host names of the machines that run the datanode daemon:
[ hadoop]# vi slaves
qianfeng01
qianfeng02
qianfeng03
Passwordless SSH login
- Generate the key pair
ssh-keygen -t rsa
Just press Enter at every prompt.
- Distribute the public key. Distributing it to this machine alone is enough; after cloning, the clones carry the same keys and can ssh to each other.
Syntax: ssh-copy-id -i <public key file> <remote user>@<remote host IP>
Effect: copies the current user's public key to the remote user's hidden ~/.ssh directory on the remote machine and automatically appends it as authorized_keys.
Note: the .ssh directory must have permission 700 and authorized_keys must have permission 600.
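A concrete example of the distribution step, assuming the key pair was generated for root with the default path ~/.ssh/id_rsa.pub:

[ ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@qianfeng01    # copy the public key to this machine itself
[ ~]# ssh qianfeng01                                      # should now log in without asking for a password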
Disabling firewalld, NetworkManager, and SELinux
Check service status:    systemctl status firewalld
Stop (until reboot):     systemctl stop firewalld
Start (until reboot):    systemctl start firewalld
Disable at boot:         systemctl disable firewalld    # takes effect on next boot
Enable at boot:          systemctl enable firewalld     # takes effect on next boot

systemctl status NetworkManager
systemctl start NetworkManager
systemctl stop NetworkManager
systemctl disable NetworkManager
systemctl enable NetworkManager

[ ~]# vi /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=enforcing    <---- change enforcing to disabled
# SELINUXTYPE= can take one of three values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
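A condensed sketch that makes all three changes at once (the sed one-liner is only one way to edit the file; verify /etc/selinux/config afterwards):

systemctl stop firewalld && systemctl disable firewalld
systemctl stop NetworkManager && systemctl disable NetworkManager
setenforce 0                                                        # permissive immediately; the config change applies after reboot
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config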
Clone the virtual machine
Change the hostname
[ ~]# hostnamectl set-hostname qianfeng02
Change the IP address
[ ~]# vi /etc/sysconfig/network-scripts/ifcfg-ens33
TYPE=Ethernet
BOOTPROTO=static
NAME=ens33
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.10.102    <--- only the IP address needs to change; leave everything else alone
NETMASK=255.255.255.0
GATEWAY=192.168.10.2
DNS1=192.168.10.2
DNS2=8.8.8.8
DNS3=114.114.114.114
Restart the network and check the IP
systemctl restart network
ip addr
Ping an external site, ping the physical host, and ping the VM from the host.
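A rough connectivity check; the addresses below are only examples, use your own host's actual IP:

[ ~]# ping -c 3 www.baidu.com    # external network (any reachable external host works)
[ ~]# ping -c 3 192.168.10.1     # the physical host; adjust to its real IP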
Installing MySQL
Upload the installation package and extract it
[ ~]# tar -xvf mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar -C /usr/local/mysql
mysql-community-embedded-5.7.28-1.el7.x86_64.rpm
mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm
mysql-community-devel-5.7.28-1.el7.x86_64.rpm
mysql-community-embedded-compat-5.7.28-1.el7.x86_64.rpm
mysql-community-libs-5.7.28-1.el7.x86_64.rpm
mysql-community-test-5.7.28-1.el7.x86_64.rpm
mysql-community-common-5.7.28-1.el7.x86_64.rpm
mysql-community-embedded-devel-5.7.28-1.el7.x86_64.rpm
mysql-community-client-5.7.28-1.el7.x86_64.rpm
mysql-community-server-5.7.28-1.el7.x86_64.rpm
Install the perl dependency MySQL needs and remove the conflicting mariadb packages
[ ~]# yum -y install perl
[ ~]# yum -y install net-tools
[ ~]# rpm -qa | grep mariadb
mariadb-libs-5.5.64-1.el7.x86_64
[ ~]# rpm -e mariadb-libs-5.5.64-1.el7.x86_64 --nodeps
Install the MySQL RPM packages in dependency order
[ ~]# rpm -ivh mysql-community-common-5.7.28-1.el7.x86_64.rpm
[ ~]# rpm -ivh mysql-community-libs-5.7.28-1.el7.x86_64.rpm
[ ~]# rpm -ivh mysql-community-client-5.7.28-1.el7.x86_64.rpm
[ ~]# rpm -ivh mysql-community-server-5.7.28-1.el7.x86_64.rpm
Start the MySQL service and check its status
[ ~]# systemctl start mysqld
[ ~]# systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since 五 2020-05-29 11:25:57 CST; 9s ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
  Process: 2406 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
  Process: 2355 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
 Main PID: 2409 (mysqld)
   CGroup: /system.slice/mysqld.service
           └─2409 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
5月 29 11:25:52 qianfeng01 systemd[1]: Starting MySQL Server...
5月 29 11:25:57 qianfeng01 systemd[1]: Started MySQL Server.
Look up MySQL's initial password (it is stored in /var/log/mysqld.log, which is generated after the service starts)
[ ~]# cat /var/log/mysqld.log | grep password
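In MySQL 5.7 the matching log line looks roughly like this (the timestamp and the generated password are illustrative):

2020-05-29T03:25:54.000000Z 1 [Note] A temporary password is generated for root@localhost: Xxxxxxx!1xxx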
Log in with the initial password
[ ~]# mysql -uroot -p'initial password'
After logging in, lower the password policy to LOW; the minimum password length can also be reduced to 6.
set global validate_password_policy=low;
set global validate_password_length=6;
Check whether the password policy was changed successfully:
show variables like '%validate_password%';
Change the password
alter user 'root'@'localhost' identified by 'new password';
To connect to MySQL remotely, grant remote access (note: the VM firewall must be disabled)
*.* means all tables in all databases; '%' means root connecting from any IP.
grant all privileges on *.* to 'root'@'%' identified by '111111' with grant option;
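One way to confirm the grant took effect, assuming MySQL runs on qianfeng03 and the password 111111 set above (the host names here are assumptions):

mysql> select host, user from mysql.user;    -- a row with host '%' and user 'root' should appear
[ ~]# mysql -h qianfeng03 -uroot -p111111    -- run from another node, e.g. qianfeng01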
Installing Hive
Upload, extract, and rename
[ local]# tar -zxvf apache-hive-2.1.1-bin.tar.gz -C /usr/local
[ local]# mv apache-hive-2.1.1-bin/ hive
Configure environment variables
[ local]# vi /etc/profile
# Add the following:
export HIVE_HOME=/usr/local/hive
export PATH=$HIVE_HOME/bin:$PATH
# Reload profile
[ local]# source /etc/profile
hive-env.sh
export HIVE_CONF_DIR=/usr/local/hive/conf
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/usr/local/hadoop
export HIVE_AUX_JARS_PATH=/usr/local/hive/lib
hive-site.xml
<!-- Location of the Hive warehouse on HDFS -->
<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
</property>
<!-- Directory for Hive's temporary files -->
<property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
</property>
<!-- JDBC URL for connecting to MySQL -->
<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://qianfeng03:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=latin1</value>
</property>
<!-- MySQL driver class -->
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<!-- MySQL user name -->
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
</property>
<!-- Password for the MySQL remote login -->
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>111111</value>
</property>
<!-- Local scratch space used by Hive -->
<property>
    <name>hive.exec.local.scratchdir</name>
    <value>/usr/local/hive/iotmp/root</value>
</property>
<!-- Top-level directory for operation logs if logging is enabled -->
<property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/usr/local/hive/iotmp/root/operation_logs</value>
</property>
<!-- Location of Hive's structured runtime log files -->
<property>
    <name>hive.querylog.location</name>
    <value>/usr/local/hive/iotmp/root</value>
</property>
<!-- Temporary local directory for resources added from a remote file system -->
<property>
    <name>hive.downloaded.resources.dir</name>
    <value>/usr/local/hive/iotmp/${hive.session.id}_resources</value>
</property>
Note: when using remote mode, add the following properties to Hadoop's core-site.xml:
<property>
    <name>hadoop.proxyuser.root.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.root.groups</name>
    <value>*</value>
</property>
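With Hive 2.x and a MySQL metastore, two steps are usually still needed before the first start: put the MySQL JDBC driver on Hive's classpath and initialize the metastore schema. A sketch, assuming the Connector/J jar has already been uploaded (the jar version is illustrative):

[ ~]# cp mysql-connector-java-5.1.47.jar /usr/local/hive/lib/    # JDBC driver jar
[ ~]# schematool -dbType mysql -initSchema                       # initialize the metastore schema in MySQL
[ ~]# hive                                                       # start the Hive CLI to verify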
Fully distributed HBase
Install ZooKeeper
Upload, extract, and rename
[ ~]# tar -zxvf zookeeper-3.4.10.tar.gz -C /usr/local/
[ ~]# cd /usr/local/
[ local]# mv zookeeper-3.4.10 zookeeper
Configure environment variables
[ local]# vi /etc/profile
.........(omitted)......
export ZOOKEEPER_HOME=/usr/local/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH
Apply the change
[ local]# source /etc/profile
Verify the configuration: press Tab and check whether the ZooKeeper scripts auto-complete
Go into the conf directory and make a copy named zoo.cfg
cp zoo_sample.cfg zoo.cfg
Edit zoo.cfg
dataDir=/usr/local/zookeeper/zkData
clientPort=2181
server.1=qianfeng01:2888:3888
server.2=qianfeng02:2888:3888
server.3=qianfeng03:2888:3888
If the directory specified by dataDir does not exist, create it
mkdir /usr/local/zookeeper/zkData
Create a myid file in the zkData directory and write the corresponding server number into it
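The number in myid must match that machine's server.N entry in zoo.cfg, for example:

[ ~]# echo 1 > /usr/local/zookeeper/zkData/myid    # on qianfeng01 (server.1)
[ ~]# echo 2 > /usr/local/zookeeper/zkData/myid    # on qianfeng02 (server.2)
[ ~]# echo 3 > /usr/local/zookeeper/zkData/myid    # on qianfeng03 (server.3)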
scp the /etc/profile file and the zookeeper directory to the other machines
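A sketch of the distribution, assuming the commands are run as root on qianfeng01:

[ local]# scp -r /usr/local/zookeeper qianfeng02:/usr/local/
[ local]# scp -r /usr/local/zookeeper qianfeng03:/usr/local/
[ local]# scp /etc/profile qianfeng02:/etc/
[ local]# scp /etc/profile qianfeng03:/etc/

Run source /etc/profile on each target machine afterwards, and remember each machine's myid must still differ.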
Start the ZooKeeper cluster; run the following commands on every machine
zkServer.sh start
zkServer.sh status
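When the cluster is healthy, zkServer.sh status reports one leader and two followers; the output looks roughly like this (paths and the leader/follower assignment will vary):

ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower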
Install HBase
Upload, extract, and rename
[ software]# tar -zxvf hbase-1.2.1-bin.tar.gz -C /opt/apps/
Configure hbase-env.sh
[ conf]# vi hbase-env.sh
# The java implementation to use. Java 1.7+ required.
export JAVA_HOME=/opt/apps/jdk1.8.0_45
# Tell HBase whether it should manage its own instance of Zookeeper or not.
# We use the separately installed ZooKeeper cluster started above with zkServer.sh, so set this to false.
export HBASE_MANAGES_ZK=false
hbase-site.xml
<configuration>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://qphone01:9000/hbase</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>qphone01,qphone02,qphone03</value>
    </property>
</configuration>
regionservers
qphone01
qphone02
qphone03
scp the /etc/profile file and the hbase directory to the other machines
In the hbase conf directory on qphone01, create a backup-masters file and write qphone02 into it as the standby master
qphone02
Test
[ apps]# start-hbase.sh
http://192.168.49.200:16010/master-status
Check the processes with jps
- HMaster    // required; indicates HBase is running. Because two masters are configured (one active, one standby), an HMaster appears on both qianfeng01 and qianfeng02
- QuorumPeerMain    // required; this is the separately configured ZooKeeper cluster (it would be HQuorumPeer if HBase's built-in ZooKeeper were used); indicates ZooKeeper is running
Enter the HBase shell
hbase shell
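A quick smoke test inside the shell; the table name 'test_tbl' and column family 'cf' are just examples:

hbase(main):001:0> status                          # cluster summary: servers and regions
hbase(main):002:0> create 'test_tbl', 'cf'         # create a table with one column family
hbase(main):003:0> put 'test_tbl', 'row1', 'cf:name', 'hello'
hbase(main):004:0> scan 'test_tbl'                 # should show the row just written
hbase(main):005:0> disable 'test_tbl'
hbase(main):006:0> drop 'test_tbl'                 # clean up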