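Linux High Availability: Corosync + Pacemaker Explained
Complete HA architecture:
Installing and configuring the HA cluster:
1. Node names: every node in the cluster must be able to resolve the names of all the other nodes
/etc/hosts
The forward and reverse resolution of each hostname in hosts must match the output of "uname -n";
2. Time must be synchronized
Use a network time server to synchronize the clocks
3. Optional: the nodes can communicate with each other using SSH key-based authentication;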
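The first prerequisite can be checked mechanically. The sketch below is only an illustration: `host_in_file` is a hypothetical helper name, and it checks just that a name is listed in a hosts-format file, not that reverse resolution works.

```shell
# Hypothetical helper: succeed if NAME is listed as a hostname or alias
# on a non-comment line of a hosts-format FILE.
host_in_file() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }                 # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break        # rest of the line is a comment
                if ($i == name) found = 1
            }
        }
        END { exit !found }
    ' "$1"
}

# On a cluster node (not executed here):
#   host_in_file /etc/hosts "$(uname -n)" || echo "add $(uname -n) to /etc/hosts" >&2
```

Running the check on every node with each peer's name catches a missing entry before corosync is started.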
Installation:
[root@marvin heartbeat]# yum install corosync -y
Configuration:
[root@sherry heartbeat]# cd /etc/corosync/
[root@sherry corosync]# ls
corosync.conf.example corosync.conf.example.udpu service.d uidgid.d
[root@sherry corosync]# cp corosync.conf.example corosync.conf
[root@sherry corosync]# vim corosync.conf
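compatibility: whitetank #whether to stay compatible with versions before 0.8
totem {
version: 2 #configuration version (2 is the only valid value)
secauth: on #authentication; with off, anyone who knows the multicast address can join, so keep it on
threads: 0 #threads used to encrypt and send messages; 0 is the default
interface {
ringnumber: 0 #ring number of this interface; with a single NIC use 0
bindnetaddr: 192.168.1.0 #network address to bind to
mcastaddr: 225.122.111.111 #multicast address; 224.0.1.0~238.255.255.255 is the suggested range
mcastport: 5405 #multicast port
ttl: 1 #TTL of the multicast packets; 1 keeps them on the local subnet
}
}
logging {
fileline: off
to_stderr: no #log to standard error
to_logfile: yes
logfile: /var/log/cluster/corosync.log
to_syslog: no #enabling one log destination is enough
debug: off
timestamp: no #whether to record timestamps
logger_subsys {
subsys: AMF
debug: off
}
}
amf {
mode: disabled #only relevant to application development
}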
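The mcastaddr comment above suggests the range 224.0.1.0~238.255.255.255. As a small sketch, a candidate address can be tested against that range before it goes into corosync.conf. The function name `mcast_in_range` is hypothetical, and the range is taken from the comment, not from any corosync requirement.

```shell
# Hypothetical helper: succeed if the dotted-quad address lies within
# 224.0.1.0 - 238.255.255.255.
mcast_in_range() {
    # reject anything that is not digits separated by single dots
    case $1 in
        ''|*[!0-9.]*|*..*|.*|*.) return 1 ;;
    esac
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address into octets
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
    [ "$1" -ge 224 ] && [ "$1" -le 238 ] || return 1
    # 224.0.0.x sits below the suggested lower bound
    if [ "$1" -eq 224 ] && [ "$2" -eq 0 ] && [ "$3" -eq 0 ]; then
        return 1
    fi
    return 0
}

# Example (not executed here):
#   mcast_in_range 225.122.111.111 && echo "address is in the suggested range"
```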
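Installing pacemaker:
[root@sherry corosync]# yum install pacemaker -y
Loading pacemaker from corosync (with ver: 1 the pacemaker service is not started automatically and still has to be started by hand):
[root@sherry corosync]# vim corosync.conf
service {
ver: 1 #run pacemaker as a corosync plugin; the pacemaker service is started separately
name: pacemaker
}
aisexec {
user: root
group: root
}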
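Key file:
#in production, let corosync-keygen collect real entropy from keyboard input instead of using the shortcut below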
[root@sherry corosync]# mv /dev/random /dev/random.bak
[root@sherry corosync]# mv /dev/urandom /dev/random
[root@sherry corosync]# corosync-keygen
[root@sherry corosync]# mv /dev/random /dev/urandom
[root@sherry corosync]# mv /dev/random.bak /dev/random
Generated key (mode 400):
[root@sherry corosync]# ll
total 24
-r-------- 1 root root 128 May 31 20:54 authkey
-rw-r--r-- 1 root root 476 May 31 20:50 corosync.conf
-rw-r--r-- 1 root root 2663 May 11 06:27 corosync.conf.example
-rw-r--r-- 1 root root 1073 May 11 06:27 corosync.conf.example.udpu
drwxr-xr-x 2 root root 4096 May 11 06:27 service.d
drwxr-xr-x 2 root root 4096 May 11 06:27 uidgid.d
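The listing shows that authkey is a 128-byte file with mode 400. A less invasive alternative to renaming the device nodes might look like the sketch below. It rests on an assumption: that the key is nothing more than random bytes of that size, so reading them from /dev/urandom directly is equivalent to the shortcut above. `make_authkey` is a hypothetical helper name. Some corosync releases also offer `corosync-keygen -l` for the same purpose; check the man page of the installed version.

```shell
# Sketch: write 128 random bytes to a key file and make it owner-read-only.
# Assumption: the authkey is plain random data of the size shown in the listing.
make_authkey() {
    keyfile=$1
    ( umask 077 && dd if=/dev/urandom of="$keyfile" bs=128 count=1 2>/dev/null ) || return 1
    chmod 400 "$keyfile"
}

# On a cluster node, as root (not executed here):
#   make_authkey /etc/corosync/authkey
```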
Copy the key and the configuration file to the other node:
[root@sherry corosync]# scp -P6789 -p authkey corosync.conf root@marvin:/etc/corosync/
Installing crmsh:
[root@sherry yum.repos.d]# cd /etc/yum.repos.d/
[root@sherry yum.repos.d]# wget http://download.openSUSE.org/repositories/network:ha-clustering:Stable/CentOS_CentOS-6/network:ha-clustering:Stable.repo
[root@sherry yum.repos.d]# yum install crmsh -y
Startup scripts:
[root@sherry corosync]# /etc/init.d/corosync start
[root@sherry corosync]# /etc/init.d/pacemaker start
#when stopping, use the reverse order
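The start and stop ordering can be captured in a small wrapper so that it is not reversed by accident. This is a sketch only: `cluster_cmds` is a hypothetical name, and it merely prints the init-script commands used above in the right order.

```shell
# Print the init-script commands for an action in the right order.
# start: corosync first, then pacemaker; stop: the reverse.
cluster_cmds() {
    case $1 in
        start) printf '%s\n' "/etc/init.d/corosync start" "/etc/init.d/pacemaker start" ;;
        stop)  printf '%s\n' "/etc/init.d/pacemaker stop" "/etc/init.d/corosync stop" ;;
        *)     echo "usage: cluster_cmds start|stop" >&2; return 1 ;;
    esac
}

# To actually run them, as root (not executed here):
#   cluster_cmds stop | sh -e
```

Printing instead of executing keeps the function safe to inspect; piping into `sh -e` stops at the first command that fails.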
Lab servers: marvin sherry
Initialization:
Pacemaker enables STONITH by default, but this cluster has no STONITH device, so disable STONITH first with the following commands:
crm(live)# configure
crm(live)configure# property stonith-enabled=false
crm(live)configure# verify
crm(live)configure# commit
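Set the quorum policy (a two-node cluster loses quorum as soon as one node fails, so loss of quorum is ignored):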
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# verify
crm(live)configure# commit
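The same two properties can also be set without entering the interactive shell, since crm accepts a command as arguments. A minimal sketch follows; the function name and the `CRM` override variable are hypothetical and exist only to allow a dry run. In this single-shot mode crmsh normally commits each change immediately.

```shell
# Set the two properties from the interactive session in one go.
# CRM can be overridden (for example CRM=echo) to print instead of execute.
set_two_node_defaults() {
    ${CRM:-crm} configure property stonith-enabled=false &&
    ${CRM:-crm} configure property no-quorum-policy=ignore
}

# Dry run (not executed here):   CRM=echo set_two_node_defaults
# Real run on a cluster node:    set_two_node_defaults
```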
Show the full configuration:
crm(live)configure# show
node marvin
node sherry
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.14-8.el6-70404b0 \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false \
no-quorum-policy=ignore
Defining resources:
Define an IP:
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.1.199
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node marvin
node sherry
primitive webip IPaddr \
params ip=192.168.1.199
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.14-8.el6-70404b0 \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false
[root@marvin ~]# ip addr show eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:0c:34:2c brd ff:ff:ff:ff:ff:ff
inet 192.168.1.220/24 brd 192.168.1.255 scope global eth1
inet 192.168.1.199/24 brd 192.168.1.255 scope global secondary eth1
inet6 fe80::20c:29ff:fe0c:342c/64 scope link
valid_lft forever preferred_lft forever
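Whether the virtual IP is present on a node can be checked from the `ip addr show` output. The helper below is a sketch with a hypothetical name, `has_inet_addr`. It reads the command output on stdin and compares the address part of each inet entry.

```shell
# Succeed if the given IPv4 address appears as an "inet" entry on stdin
# (expects `ip addr show` output).
has_inet_addr() {
    awk -v ip="$1" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == ip) found = 1 }
        END { exit !found }
    '
}

# On a cluster node (not executed here):
#   ip addr show eth1 | has_inet_addr 192.168.1.199 && echo "webip is on this node"
```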
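Adding monitoring to an existing resource (once monitored, the service is restarted automatically after being killed):
crm(live)configure# monitor webserver 30s:15s #check every 30s with a 15s timeout; to remove the monitor, use edit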
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
primitive webserver lsb:nginx \
meta target-role=Stopped \
op monitor interval=30s timeout=15s
NFS (with monitoring; the definition verifies correctly but was not committed):
crm(live)configure# primitive webstore ocf:heartbeat:Filesystem params device="sherry:/nfsshared/node1" directory="/mnt/nfs/node1" fstype='nfs' op monitor interval=20s timeout=40s on-fail=restart op start timeout=60s op stop timeout=60s
crm(live)configure# verify
nginx (with monitoring):
crm(live)configure# primitive webserver lsb:nginx op monitor interval=30s timeout=15s on-fail=restart
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node marvin
node sherry
primitive webip IPaddr \
params ip=192.168.1.199
primitive webserver lsb:nginx
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.14-8.el6-70404b0 \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false
status:
crm(live)# status
Last updated: Wed Jun 1 20:38:32 2016 Last change: Wed Jun 1 20:36:32 2016 by root via cibadmin on sherry
Stack: classic openais (with plugin)
Current DC: marvin (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 2 resources configured, 2 expected votes
Online: [ marvin sherry ]
Full list of resources:
webip (ocf::heartbeat:IPaddr): Started marvin
webserver (lsb:nginx): Started sherry
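For scripting, the placement of resources can be extracted from this output. The sketch below assumes resource lines of the form shown above (name, agent, "Started", node); other Pacemaker versions may format them differently. `started_on` is a hypothetical helper name.

```shell
# Read `crm status` text on stdin and print "resource node" for every
# resource line of the form: NAME (AGENT): Started NODE
started_on() {
    awk '$3 == "Started" && NF >= 4 { print $1, $4 }'
}

# On a cluster node (not executed here):
#   crm status | started_on
```

Here the two resources run on different nodes, which is the situation the group and the constraints later in these notes are meant to prevent.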
Stopping a resource:
crm(live)# resource
crm(live)resource# stop webserver
crm(live)resource# status
webip (ocf::heartbeat:IPaddr): Started
webserver (lsb:nginx): (target-role:Stopped) Stopped
Cleaning up resource state:
crm(live)resource# cleanup webserver
Cleaning up webserver on marvin, removing fail-count-webserver
Cleaning up webserver on sherry, removing fail-count-webserver
* The configuration specifies that 'webserver' should remain stopped
Waiting for 2 replies from the CRMd.. OK
Group operations:
Define the resources first, then add them to a group
crm(live)# status
Last updated: Wed Jun 1 20:38:32 2016 Last change: Wed Jun 1 20:36:32 2016 by root via cibadmin on sherry
Stack: classic openais (with plugin)
Current DC: marvin (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 2 resources configured, 2 expected votes
Online: [ marvin sherry ]
Full list of resources:
webip (ocf::heartbeat:IPaddr): Started marvin
webserver (lsb:nginx): Started sherry
crm(live)# configure
crm(live)configure# group webservice webip webserver
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node marvin
node sherry
primitive webip IPaddr \
params ip=192.168.1.199
primitive webserver lsb:nginx
group webservice webip webserver
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.14-8.el6-70404b0 \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false
Deleting the group:
crm(live)resource# stop webservice
crm(live)configure# delete webservice #the resources that were in the group still exist
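Node operations:
Take a node offline (standby):
crm(live)# node
crm(live)node# standby marvin #its resources migrate automatically
Bring the node back online:
crm(live)node# online marvin
Clear node state (clears the resource status recorded for the node):
crm(live)node# clearstate marvin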
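The standby and online commands combine naturally into a short failover drill. The following is a sketch under assumptions: `failover_drill`, `CRM`, and `DRILL_WAIT` are hypothetical names, and the default 10-second pause is an arbitrary guess at how long the resources need to move.

```shell
# Put a node in standby, show where the resources went, then bring it back.
# CRM=echo turns this into a dry run; DRILL_WAIT sets the pause in seconds.
failover_drill() {
    node=$1
    ${CRM:-crm} node standby "$node" || return 1
    sleep "${DRILL_WAIT:-10}"        # give the resources time to migrate
    ${CRM:-crm} status
    ${CRM:-crm} node online "$node"
}

# Dry run (not executed here):   CRM=echo DRILL_WAIT=0 failover_drill marvin
# Real run on a cluster node:    failover_drill marvin
```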
Colocation constraint:
Keep the resources together:
crm(live)configure# colocation webserver_and_webip inf: webserver webip
crm(live)configure# verify
crm(live)configure# commit
Check:
crm(live)configure# show
node marvin \
attributes standby=off
node sherry
primitive webip IPaddr \
params ip=192.168.1.199
primitive webserver lsb:nginx
colocation webserver_and_webip inf: webserver webip
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.14-8.el6-70404b0 \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
stonith-enabled=false
Detailed view:
crm(live)configure# show xml
<rsc_colocation id="webserver_and_webip" score="INFINITY" rsc="webserver" with-rsc="webip"/> #webserver follows webip
Order constraint:
crm(live)configure# order webip-before-webserver mandatory: webip webserver #start in the listed order
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
order webip-before-webserver Mandatory: webip webserver
crm(live)configure# show xml
<rsc_order id="webip-before-webserver" kind="Mandatory" first="webip" then="webserver"/>
Location constraint:
crm(live)configure# location webip_on_marvin webip 200: marvin
crm(live)configure# verify
crm(live)configure# commit
Check:
crm(live)# status
Last updated: Wed Jun 1 21:11:58 2016 Last change: Wed Jun 1 21:11:32 2016 by root via cibadmin on sherry
Stack: classic openais (with plugin)
Current DC: marvin (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 2 resources configured, 2 expected votes
Online: [ marvin sherry ]
Full list of resources:
webip (ocf::heartbeat:IPaddr): Started marvin
webserver (lsb:nginx): Started marvin
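The three constraints defined above can also be kept in one file and loaded in a single step. This is a sketch: `web_constraints` and the file path are hypothetical, and it relies on the `load update` subcommand of `crm configure`.

```shell
# Emit the colocation, order and location constraints as crmsh
# configuration text.
web_constraints() {
    cat <<'EOF'
colocation webserver_and_webip inf: webserver webip
order webip-before-webserver mandatory: webip webserver
location webip_on_marvin webip 200: marvin
EOF
}

# On a cluster node (not executed here):
#   web_constraints > /tmp/web-constraints.crm
#   crm configure load update /tmp/web-constraints.crm
```

With all three in place, webip prefers marvin with a score of 200, webserver must run wherever webip runs, and webip is started before webserver.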