Mysql GTID Mha配置方法
Gtid + Mha +Binlog server配置:
1:测试环境
OS:CentOS 6.5
Mysql:5.6.28
Mha:0.56
192.168.1.21 mysql1 M1
192.168.1.22 mysql2 S1
192.168.1.23 mysql3 S2 Mha manage、Binlog server
2:配置/etc/my.cnf相关参数,在3各节点中分别配置
binlog-format=ROW log-slave-updates=true gtid-mode=on enforce-gtid-consistency=true master-info-repository=TABLE relay-log-info-repository=TABLE sync-master-info=1 slave-parallel-workers=2 binlog-checksum=CRC32 master-verify-checksum=1 slave-sql-verify-checksum=1 binlog-rows-query-log_events=1
设置root密码,创建复制用户:
mysql> use mysql; mysql> GRANT ALL PRIVILEGES ON *.* TO root@"%" IDENTIFIED BY "oracle123"; mysql> update user set Password = password('oracle123') where User='root'; mysql> flush privileges; mysql> GRANT replication slave ON *.* TO 'repl'@'%' identified by 'oracle'; mysql> flush privileges;
3:在mysql2、mysql3配置Gtid复制
CHANGE MASTER TO MASTER_HOST = '192.168.1.21', MASTER_PORT = 3306, MASTER_USER = 'repl', MASTER_PASSWORD = 'oracle', MASTER_AUTO_POSITION = 1; start slave; mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.1.21 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000003 Read_Master_Log_Pos: 524 Relay_Log_File: mysql-relay-bin.000002 Relay_Log_Pos: 734 Relay_Master_Log_File: mysql-bin.000003 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: ...... Master_SSL_Crlpath: Retrieved_Gtid_Set: 9ee7c7af-cbf3-11e5-bf75-000c2923e459:1-2 Executed_Gtid_Set: 9ee7c7af-cbf3-11e5-bf75-000c2923e459:1-2 Auto_Position: 1 1 row in set (0.00 sec)
4:安装Mha
rpm -Uvh epel-release-6-8.noarch.rpm
配置SSH等效:
在所有节点都执行
ssh-keygen -t rsa ssh-copy-id -i /root/.ssh/id_rsa.pub root@mysql1 ssh-copy-id -i /root/.ssh/id_rsa.pub root@mysql2 ssh-copy-id -i /root/.ssh/id_rsa.pub root@mysql3
测试ssh登录,在3各节点分别测试:
ssh myqsl1 ssh myqsl2 ssh myqsl3
binlog server配置:在mysql3
mkdir -p /mysql/backup/binlog /usr/local/mysql/bin/mysqlbinlog -R --raw --host=192.168.1.20 --user='root' --password='oracle123' --stop-never mysql- bin.000003 &
最后那个binlog文件时给定从那个binlog文件开始。另外需要注意,当mysql1上的mysql进程退出后,binlog server也会退出。
需要安装一些包做支持,使用yum网络源;如安装遇到问题可以尝试yum update更新yum源或yum clean all清除缓存
在每个节点安装 mha4mysql-node
yum -y install perl-DBD-MySQL ncftp
rpm -Uvh mha4mysql-node-0.56-0.el6.noarch.rpm
在mysql3上安装mha-manager
yum install perl yum install cpan yum install perl-Config-Tiny yum install perl-Time-HiRes yum install perl-Log-Dispatch yum install perl-Parallel-ForkManager
如果安装perl-Log-Dispatch,perl-Parallel-ForkManager安装包报错:
需要先安装epel(可以参考https://fedoraproject.org/wiki/EPEL)
rpm -Uvh mha4mysql-manager-0.56-0.el6.noarch.rpm
5:配置Mha,在mysql3
mkdir -p /etc/masterha/app1 vi /etc/masterha/app1.cnf [server default] user=root password=oracle123 manager_workdir=/etc/masterha/app1 manager_log=/etc/masterha/app1/manager.log remote_workdir=/etc/masterha/app1 ssh_user=root repl_user=repluser repl_password=oracle ping_interval=3 master_ip_failover_script=/etc/masterha/app1/master_ip_failover [server1] hostname=192.168.1.21 #ssh_port=9999 master_binlog_dir=/mysql/logs check_repl_delay=0 #防止master故障时候,切换时slave有延迟,可在那里切不过来 candidate_master=1 [server2] hostname=192.168.1.22 #ssh_port=9999 master_binlog_dir=/mysql/logs candidate_master=1 [server3] hostname=192.168.1.23 #ssh_port=9999 master_binlog_dir=/mysql/logs no_master=1 ignore_fail=1 #如果这个节点挂了,mha将不可用,加上这个参数slave挂了一样可以用 [binlog1] #binlog server需要mysqlbinlog命令 hostname=192.168.1.23 master_binlog_dir=/mysql/backup/binlog #读取binlog存放位置 ignore_fail=1 no_master=1 vi /etc/masterha/app1/master_ip_failover #!/usr/bin/env perl use strict; use warnings FATAL => 'all'; use Getopt::Long; my ( $command, $ssh_user, $orig_master_host, $orig_master_ip, $orig_master_port, $new_master_host, $new_master_ip, $new_master_port ); my $vip = '192.168.1.20';#Virtual IP my $gateway = '192.168.1.1';#Gateway IP my $interface = 'eth0'; my $key = "1"; my $ssh_start_vip = "/sbin/ifconfig $interface:$key $vip;/sbin/arping -I $interface -c 3 -s $vip $gateway >/dev/null 2>&1"; my $ssh_stop_vip = "/sbin/ifconfig $interface:$key down"; GetOptions( 'command=s' => \$command, 'ssh_user=s' => \$ssh_user, 'orig_master_host=s' => \$orig_master_host, 'orig_master_ip=s' => \$orig_master_ip, 'orig_master_port=i' => \$orig_master_port, 'new_master_host=s' => \$new_master_host, 'new_master_ip=s' => \$new_master_ip, 'new_master_port=i' => \$new_master_port, ); exit &main(); sub main { print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n"; if ( $command eq "stop" || $command eq "stopssh" ) { # $orig_master_host, $orig_master_ip, $orig_master_port are passed. # If you manage master ip address at global catalog database, # invalidate orig_master_ip here. my $exit_code = 1; eval { print "Disabling the VIP on old master: $orig_master_host \n"; &stop_vip(); $exit_code = 0; }; if ($@) { warn "Got Error: $@\n"; exit $exit_code; } exit $exit_code; } elsif ( $command eq "start" ) { # all arguments are passed. # If you manage master ip address at global catalog database, # activate new_master_ip here. # You can also grant write access (create user, set read_only=0, etc) here. my $exit_code = 10; eval { print "Enabling the VIP - $vip on the new master - $new_master_host \n"; &start_vip(); $exit_code = 0; }; if ($@) { warn $@; exit $exit_code; } exit $exit_code; } elsif ( $command eq "status" ) { print "Checking the Status of the script.. OK \n"; `ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`; exit 0; } else { &usage(); exit 1; } } # A simple system call that enable the VIP on the new master sub start_vip() { `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`; } # A simple system call that disable the VIP on the old_master sub stop_vip() { `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`; } sub usage { print "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip -- orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n"; } chmod 777 /etc/masterha/app1/
配置文件测试:
# masterha_check_ssh --conf=/etc/masterha/app1.cnf Thu May 26 23:25:35 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Thu May 26 23:25:35 2016 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Thu May 26 23:25:35 2016 - [info] Reading server configuration from /etc/masterha/app1.cnf.. Thu May 26 23:25:35 2016 - [info] Starting SSH connection tests.. Thu May 26 23:25:35 2016 - [debug] Thu May 26 23:25:35 2016 - [debug] Connecting via SSH from [email protected](192.168.1.21:22) to [email protected](192.168.1.22:22).. Thu May 26 23:25:35 2016 - [debug] ok. Thu May 26 23:25:35 2016 - [debug] Connecting via SSH from [email protected](192.168.1.21:22) to [email protected](192.168.1.23:22).. Thu May 26 23:25:35 2016 - [debug] ok. Thu May 26 23:25:36 2016 - [debug] Thu May 26 23:25:35 2016 - [debug] Connecting via SSH from [email protected](192.168.1.22:22) to [email protected](192.168.1.21:22).. Thu May 26 23:25:35 2016 - [debug] ok. Thu May 26 23:25:35 2016 - [debug] Connecting via SSH from [email protected](192.168.1.22:22) to [email protected](192.168.1.23:22).. Thu May 26 23:25:36 2016 - [debug] ok. Thu May 26 23:25:36 2016 - [debug] Thu May 26 23:25:36 2016 - [debug] Connecting via SSH from [email protected](192.168.1.23:22) to [email protected](192.168.1.21:22).. Thu May 26 23:25:36 2016 - [debug] ok. Thu May 26 23:25:36 2016 - [debug] Connecting via SSH from [email protected](192.168.1.23:22) to [email protected](192.168.1.22:22).. Thu May 26 23:25:36 2016 - [debug] ok. Thu May 26 23:25:36 2016 - [info] All SSH connection tests passed successfully. #masterha_check_repl --conf=/etc/masterha/app1.cnf Thu May 26 22:52:30 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Thu May 26 22:52:30 2016 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Thu May 26 22:52:30 2016 - [info] Reading server configuration from /etc/masterha/app1.cnf.. Thu May 26 22:52:30 2016 - [info] MHA::MasterMonitor version 0.56. Thu May 26 22:52:31 2016 - [info] GTID failover mode = 1 Thu May 26 22:52:31 2016 - [info] Dead Servers: Thu May 26 22:52:31 2016 - [info] Alive Servers: Thu May 26 22:52:31 2016 - [info] 192.168.1.21(192.168.1.21:3306) Thu May 26 22:52:31 2016 - [info] 192.168.1.22(192.168.1.22:3306) Thu May 26 22:52:31 2016 - [info] 192.168.1.23(192.168.1.23:3306) Thu May 26 22:52:31 2016 - [info] Alive Slaves: Thu May 26 22:52:31 2016 - [info] 192.168.1.22(192.168.1.22:3306) Version=5.6.28-log (oldest major version between slaves) log-bin:enabled Thu May 26 22:52:31 2016 - [info] GTID ON Thu May 26 22:52:31 2016 - [info] Replicating from 192.168.1.21(192.168.1.21:3306) Thu May 26 22:52:31 2016 - [info] Primary candidate for the new Master (candidate_master is set) Thu May 26 22:52:31 2016 - [info] 192.168.1.23(192.168.1.23:3306) Version=5.6.28-log (oldest major version between slaves) log-bin:enabled Thu May 26 22:52:31 2016 - [info] GTID ON Thu May 26 22:52:31 2016 - [info] Replicating from 192.168.1.21(192.168.1.21:3306) Thu May 26 22:52:31 2016 - [info] Not candidate for the new Master (no_master is set) Thu May 26 22:52:31 2016 - [info] Current Alive Master: 192.168.1.21(192.168.1.21:3306) Thu May 26 22:52:31 2016 - [info] Checking slave configurations.. Thu May 26 22:52:31 2016 - [info] read_only=1 is not set on slave 192.168.1.22(192.168.1.22:3306). Thu May 26 22:52:31 2016 - [info] read_only=1 is not set on slave 192.168.1.23(192.168.1.23:3306). Thu May 26 22:52:31 2016 - [info] Checking replication filtering settings.. Thu May 26 22:52:31 2016 - [info] binlog_do_db= , binlog_ignore_db= Thu May 26 22:52:31 2016 - [info] Replication filtering check ok. Thu May 26 22:52:31 2016 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking. Thu May 26 22:52:31 2016 - [info] HealthCheck: SSH to 192.168.1.23 is reachable. Thu May 26 22:52:31 2016 - [info] Binlog server 192.168.1.23 is reachable. Thu May 26 22:52:31 2016 - [info] Checking recovery script configurations on 192.168.1.23(192.168.1.23:3306).. Thu May 26 22:52:31 2016 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/mysql/backup/binlog --output_file=/etc/masterha/app1/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000004 Thu May 26 22:52:31 2016 - [info] Connecting to [email protected](192.168.1.23:22).. Creating /etc/masterha/app1 if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /mysql/backup/binlog, up to mysql-bin.000004 Thu May 26 22:52:31 2016 - [info] Binlog setting check done. Thu May 26 22:52:31 2016 - [info] Checking SSH publickey authentication settings on the current master.. Thu May 26 22:52:31 2016 - [info] HealthCheck: SSH to 192.168.1.21 is reachable. Thu May 26 22:52:31 2016 - [info] 192.168.1.21(192.168.1.21:3306) (current master) +--192.168.1.22(192.168.1.22:3306) +--192.168.1.23(192.168.1.23:3306) Thu May 26 22:52:31 2016 - [info] Checking replication health on 192.168.1.22.. Thu May 26 22:52:31 2016 - [info] ok. Thu May 26 22:52:31 2016 - [info] Checking replication health on 192.168.1.23.. Thu May 26 22:52:31 2016 - [info] ok. Thu May 26 22:52:31 2016 - [info] Checking master_ip_failover_script status: Thu May 26 22:52:31 2016 - [info] /etc/masterha/app1/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.1.21 --orig_master_ip=192.168.1.21 --orig_master_port=3306 IN SCRIPT TEST====/sbin/ifconfig eth1:1 down==/sbin/ifconfig eth1:1 192.168.1.20;/sbin/arping -I eth1 -c 3 -s 192.168.1.20 192.168.1.1 >/dev/null 2>&1=== Checking the Status of the script.. OK Thu May 26 22:52:34 2016 - [info] OK. Thu May 26 22:52:34 2016 - [warning] shutdown_script is not defined. Thu May 26 22:52:34 2016 - [info] Got exit code 0 (Not master dead). MySQL Replication Health is OK.
MHA启动及关闭
nohup masterha_manager --conf=/etc/masterha/app1.cnf > /etc/masterha/app1/manager.log < /dev/null 2>&1 &
检查是否启动:
masterha_check_status --conf=/etc/masterha/app1.cnf app1 (pid:11447) is running(0:PING_OK), master:192.168.1.21
停止Mha:
masterha_stop --conf=/etc/masterha/app1.cnf Stopped app1 successfully. [3]+ Exit 1 nohup masterha_manager --conf=/etc/masterha/app1.cnf > /etc/masterha/app1/manager.log < /dev/null 2>&1
测试:
说明,每次测试完成后,需要清理/etc/masterha/app1下的日志,然后启动Mha manager.
1:关闭mysql1上的mysql,查看从库从那里同步,以及mha日志输出
2:恢复mysql1为mysql2的slave,change master语句可以在/etc/masterha/app1/manager.log里找到。
在配置GTID复制时候遇到 1032错误,用以下方法解决
mysql> show global variables like '%gtid%'; +---------------------------------+------------------------------------------------------------------------------------+ | Variable_name | Value | +---------------------------------+------------------------------------------------------------------------------------+ | binlog_gtid_simple_recovery | OFF | | enforce_gtid_consistency | ON | | gtid_executed | 88b05570-2599-11e6-880a-000c29c18cf5:1-3, 9ee7c7af-cbf3-11e5-bf75-000c2923e459:1-4 | | gtid_mode | ON | | gtid_owned | | | gtid_purged | | | simplified_binlog_gtid_recovery | OFF | +---------------------------------+------------------------------------------------------------------------------------+ stop slave; set gtid_next='9ee7c7af-cbf3-11e5-bf75-000c2923e459:4'; begin; commit; set gtid_next='automatic'; start slave; show slave status\G;