Linux High-Availability Solutions: Viewing the Heartbeat Logs
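Let's take a detailed look at the Heartbeat logs.
Starting the Heartbeat service on the primary node
# /etc/init.d/heartbeat start
While Heartbeat starts, run "tail -f /var/log/messages" to follow the primary node's system log. The output looks like this:
# tail -f /var/log/messages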
Nov 26 07:52:21 node1 heartbeat: [3688]: info: Configuration validated. Starting heartbeat 2.0.8
Nov 26 07:52:21 node1 heartbeat: [3689]: info: heartbeat: version 2.0.8
Nov 26 07:52:21 node1 heartbeat: [3689]: info: Heartbeat generation: 3
Nov 26 07:52:21 node1 heartbeat: [3689]: info: G_main_add_TriggerHandler: Added signal manual handler
Nov 26 07:52:21 node1 heartbeat: [3689]: info: G_main_add_TriggerHandler: Added signal manual handler
Nov 26 07:52:21 node1 heartbeat: [3689]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1
Nov 26 07:52:21 node1 heartbeat: [3689]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1
Nov 26 07:52:21 node1 heartbeat: [3689]: info: glib: ping heartbeat started.
Nov 26 07:52:21 node1 heartbeat: [3689]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Nov 26 07:52:21 node1 heartbeat: [3689]: info: Local status now set to: 'up'
Nov 26 07:52:22 node1 heartbeat: [3689]: info: Link node1:eth1 up.
Nov 26 07:52:23 node1 heartbeat: [3689]: info: Link 192.168.60.1:192.168.60.1 up.
Nov 26 07:52:23 node1 heartbeat: [3689]: info: Status update for node 192.168.60.1: status ping
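This part of the log shows Heartbeat initializing its configuration, for example the heartbeat interval, the UDP broadcast port, and the status of the ping node. The log then pauses. After about 120 seconds (07:52:23 to 07:54:22 in this capture) Heartbeat resumes logging, and those 120 seconds match the value of the "initdead" option in ha.cf. Heartbeat then prints the following: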
Nov 26 07:54:22 node1 heartbeat: [3689]: WARN: node node2: is dead
Nov 26 07:54:22 node1 heartbeat: [3689]: info: Comm_now_up(): updating status to active
Nov 26 07:54:22 node1 heartbeat: [3689]: info: Local status now set to: 'active'
Nov 26 07:54:22 node1 heartbeat: [3689]: info: Starting child client "/usr/lib/heartbeat/ipfail" (694,694)
Nov 26 07:54:22 node1 heartbeat: [3689]: WARN: No STONITH device configured.
Nov 26 07:54:22 node1 heartbeat: [3689]: WARN: Shared disks are not protected.
Nov 26 07:54:22 node1 heartbeat: [3689]: info: Resources being acquired from node2.
Nov 26 07:54:22 node1 heartbeat: [3712]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 694 gid 694 (pid 3712)
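In this part of the log, node2 has not been started yet, so Heartbeat warns "node2: is dead". It then starts the Heartbeat plugin ipfail. Because no STONITH device is configured in ha.cf, the log also shows the warning "No STONITH device configured".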
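The two observations so far, the warnings and the initdead pause, can be checked from the command line. The following is a minimal sketch rather than part of Heartbeat: heartbeat_warnings and ha_option are hypothetical helper names, /etc/ha.d/ha.cf is assumed to be the location of ha.cf, and matching ERROR lines in addition to WARN is an addition not shown in the log above.

```shell
# Hypothetical helpers for illustration; neither ships with Heartbeat.

# Print Heartbeat WARN and ERROR lines from a syslog file.
# $1 = log file (defaults to /var/log/messages)
heartbeat_warnings() {
    grep -E 'heartbeat: \[[0-9]+\]: (WARN|ERROR):' "${1:-/var/log/messages}"
}

# Print the first value of a directive in ha.cf. Commented-out lines are
# skipped because their first field starts with '#'.
# $1 = directive name, $2 = ha.cf path (assumed default: /etc/ha.d/ha.cf)
ha_option() {
    awk -v key="$1" '$1 == key { print $2; exit }' "${2:-/etc/ha.d/ha.cf}"
}

# Example calls (run on a cluster node):
#   heartbeat_warnings
#   ha_option initdead
#   ha_option stonith
```

On the node used in this article, "ha_option initdead" would be expected to print the configured 120, and "ha_option stonith" would print nothing, since no STONITH directive is set.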
Continuing with the log:
Nov 26 07:54:23 node1 harc[3713]: info: Running /etc/ha.d/rc.d/status status
Nov 26 07:54:23 node1 mach_down[3735]: info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
Nov 26 07:54:23 node1 mach_down[3735]: info: mach_down takeover complete for node node2.
Nov 26 07:54:23 node1 heartbeat: [3689]: info: mach_down takeover complete.
Nov 26 07:54:23 node1 heartbeat: [3689]: info: Initial resource acquisition complete (mach_down)
Nov 26 07:54:24 node1 IPaddr[3768]: INFO: Resource is stopped
Nov 26 07:54:24 node1 heartbeat: [3714]: info: Local Resource acquisition completed.
Nov 26 07:54:24 node1 harc[3815]: info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
Nov 26 07:54:24 node1 ip-request-resp[3815]: received ip-request-resp 192.168.60.200/24/eth0 OK yes
Nov 26 07:54:24 node1 ResourceManager[3830]: info: Acquiring resource group: node1 192.168.60.200/24/eth0 Filesystem::/dev/sdb5::/webdata::ext3
Nov 26 07:54:24 node1 IPaddr[3854]: INFO: Resource is stopped
Nov 26 07:54:25 node1 ResourceManager[3830]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.60.200/24/eth0 start
Nov 26 07:54:25 node1 IPaddr[3932]: INFO: Using calculated netmask for 192.168.60.200: 255.255.255.0
Nov 26 07:54:25 node1 IPaddr[3932]: DEBUG: Using calculated broadcast for 192.168.60.200: 192.168.60.255
Nov 26 07:54:25 node1 IPaddr[3932]: INFO: eval /sbin/ifconfig eth0:0 192.168.60.200 netmask 255.255.255.0 broadcast 192.168.60.255
Nov 26 07:54:25 node1 avahi-daemon[1854]: Registering new address record for 192.168.60.200 on eth0.
Nov 26 07:54:25 node1 IPaddr[3932]: DEBUG: Sending Gratuitous Arp for 192.168.60.200 on eth0:0 [eth0]
Nov 26 07:54:26 node1 IPaddr[3911]: INFO: Success
Nov 26 07:54:26 node1 Filesystem[4021]: INFO: Resource is stopped
Nov 26 07:54:26 node1 ResourceManager[3830]: info: Running /etc/ha.d/resource.d/Filesystem /dev/sdb5 /webdata ext3 start
Nov 26 07:54:26 node1 Filesystem[4062]: INFO: Running start for /dev/sdb5 on /webdata
Nov 26 07:54:26 node1 kernel: kjournald starting. Commit interval 5 seconds
Nov 26 07:54:26 node1 kernel: EXT3 FS on sdb5, internal journal
Nov 26 07:54:26 node1 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Nov 26 07:54:26 node1 Filesystem[4059]: INFO: Success
Nov 26 07:54:33 node1 heartbeat: [3689]: info: Local Resource acquisition completed. (none)
Nov 26 07:54:33 node1 heartbeat: [3689]: info: local resource transition completed
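This part of the log covers resource monitoring and takeover. It carries out what is configured in the haresources file, which here means bringing up the cluster virtual IP and mounting the disk partition.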
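How a haresources entry turns into the resource-script calls seen in the log can be sketched as follows. The entry is the one printed in the "Acquiring resource group" log line. expand_haresources_line is a hypothetical helper written for this illustration, and it handles only the two resource forms that appear here (a bare IP address and script::arg::arg), not the full haresources syntax.

```shell
# Hypothetical helper for illustration; not part of Heartbeat.
# Expand one haresources entry into the resource-script calls seen in the log.
# $1 = a full haresources line: "<node> <resource> [<resource> ...]"
expand_haresources_line() {
    set -- $1   # split the line on whitespace; $1 is now the preferred node
    shift       # drop the node name, leaving only the resources
    for res in "$@"; do
        case "$res" in
            [0-9]*)
                # A bare IP address is shorthand for the IPaddr script.
                echo "/etc/ha.d/resource.d/IPaddr $res start"
                ;;
            *)
                # script::arg1::arg2 becomes "script arg1 arg2".
                echo "/etc/ha.d/resource.d/$(echo "$res" | sed 's/::/ /g') start"
                ;;
        esac
    done
}

expand_haresources_line \
    "node1 192.168.60.200/24/eth0 Filesystem::/dev/sdb5::/webdata::ext3"
```

The two lines this prints correspond to the two "ResourceManager ... Running" entries in the log, in the same order: first the IP address, then the file system.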
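At this point, running ifconfig on the primary node shows that the node has automatically bound the cluster IP address. Pinging the cluster IP address 192.168.60.200 from a host outside the HA cluster shows that it is now reachable, in other words the address has become available.
Checking the mounted partitions at the same time shows that the shared disk partition /dev/sdb5 has been mounted automatically.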
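These checks can also be scripted. The following is a minimal sketch: has_address and is_mounted are hypothetical helper names, and reading the mount table from /proc/mounts is an assumption (the article only says to look at the mounted partitions).

```shell
# Hypothetical helpers for illustration; both read command output on stdin.

# Succeed if address $1 appears as a whole word in the input.
has_address() {
    grep -qwF -- "$1"
}

# Succeed if device $1 is mounted on directory $2.
# Expects the /proc/mounts layout: device, mount point, type, options, ...
is_mounted() {
    awk -v dev="$1" -v mnt="$2" \
        '$1 == dev && $2 == mnt { found = 1 } END { exit !found }'
}

# Example calls on the primary node:
#   ifconfig | has_address 192.168.60.200 && echo "cluster IP is bound"
#   is_mounted /dev/sdb5 /webdata < /proc/mounts && echo "shared partition is mounted"
# From a host outside the cluster:
#   ping -c 3 192.168.60.200
```

Reading from stdin keeps the helpers independent of any particular command, so the same functions work with saved output when checking a node after the fact.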