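Memory leak caused by MySQL multi-source replication
Scenario :
MySQL 5.7, all minor versions (<= 5.7.17), and all versions of percona-mysql-5.7: on a read-only instance with multi-source replication enabled, memory grows without bound until the system's OOM killer is triggered.
Conclusion :
This is a MySQL bug. Bug report: https://bugs.mysql.com/bug.php?id=85371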
Symptoms :
Memory monitoring is shown in the figure.
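Root cause:
For now the analysis can only be based on the observed symptoms.
With binlog_rows_query_log_events enabled, a slave running multi-source replication leaks memory.
This shows up as steadily growing memory usage: the memory is held by the slave_sql threads, under the Performance Schema event memory/sql/Log_event.
Relevant data (taken from the instance in the screenshot):
Memory usage for the event after restarting the read-only slave:
Memory has been allocated but never freed: COUNT_FREE and SUM_NUMBER_OF_BYTES_FREE are both 0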
*************************** 2. row ***************************
THREAD_ID: 18189
EVENT_NAME: memory/sql/Log_event
COUNT_ALLOC: 521692
COUNT_FREE: 0
SUM_NUMBER_OF_BYTES_ALLOC: 117988604
SUM_NUMBER_OF_BYTES_FREE: 0
...
LOW_NUMBER_OF_BYTES_USED: 25286276
CURRENT_NUMBER_OF_BYTES_USED: 117988604
HIGH_NUMBER_OF_BYTES_USED: 117988604
*************************** 3. row ***************************
THREAD_ID: 18183
EVENT_NAME: memory/sql/Log_event
COUNT_ALLOC: 521426
COUNT_FREE: 0
SUM_NUMBER_OF_BYTES_ALLOC: 117732632
SUM_NUMBER_OF_BYTES_FREE: 0
...
LOW_NUMBER_OF_BYTES_USED: 25154914
CURRENT_NUMBER_OF_BYTES_USED: 117732632
HIGH_NUMBER_OF_BYTES_USED: 117732632
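Two hours later (CURRENT_NUMBER_OF_BYTES_USED has grown by roughly 388 MiB per thread, and COUNT_FREE is still 0):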
*************************** 1. row ***************************
THREAD_ID: 18189
EVENT_NAME: memory/sql/Log_event
COUNT_ALLOC: 2297022
COUNT_FREE: 0
SUM_NUMBER_OF_BYTES_ALLOC: 525744164
SUM_NUMBER_OF_BYTES_FREE: 0
...
LOW_NUMBER_OF_BYTES_USED: 25286276
CURRENT_NUMBER_OF_BYTES_USED: 525744164
HIGH_NUMBER_OF_BYTES_USED: 525744164
*************************** 2. row ***************************
THREAD_ID: 18183
EVENT_NAME: memory/sql/Log_event
COUNT_ALLOC: 2296412
COUNT_FREE: 0
SUM_NUMBER_OF_BYTES_ALLOC: 524600639
SUM_NUMBER_OF_BYTES_FREE: 0
...
LOW_NUMBER_OF_BYTES_USED: 25154914
CURRENT_NUMBER_OF_BYTES_USED: 524600639
HIGH_NUMBER_OF_BYTES_USED: 524600639
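Listings of this shape can be produced with a query along the following lines. This is only a sketch and not necessarily the exact statement used for the output above. It assumes the Performance Schema is enabled and that the memory instruments have been switched on; in MySQL 5.7 most memory/% instruments are disabled by default, and allocations made before an instrument is enabled are not counted.

```sql
-- Assumption: memory instrumentation was not already enabled in my.cnf.
-- Only allocations made after this statement are accounted for.
UPDATE performance_schema.setup_instruments
   SET ENABLED = 'YES'
 WHERE NAME LIKE 'memory/%';

-- Per-thread accounting for the Log_event memory instrument.
-- In the mysql client, terminate with \G instead of ; for vertical output.
SELECT *
  FROM performance_schema.memory_summary_by_thread_by_event_name
 WHERE EVENT_NAME = 'memory/sql/Log_event'
   AND COUNT_ALLOC > 0
 ORDER BY CURRENT_NUMBER_OF_BYTES_USED DESC;
```

Running the SELECT twice, some time apart, and comparing COUNT_FREE and CURRENT_NUMBER_OF_BYTES_USED for the same THREAD_ID gives the before/after comparison shown above.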
Threads that the event belongs to:
*************************** 1. row ***************************
thd_id: 18183
conn_id: 18158
user: sql/slave_sql
command: Sleep
state: Slave has read all relay log; waiting for more updates
current_memory: 532.28 MiB
*************************** 2. row ***************************
thd_id: 18189
conn_id: 18164
user: sql/slave_sql
command: Sleep
state: Slave has read all relay log; waiting for more updates
current_memory: 533.50 MiB
2 rows in set (0.10 sec)
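The thread lookup above can be reproduced with the sys schema, which ships with MySQL 5.7. The query below is a sketch under the assumption that the listing came from sys.processlist (its column names match); the thread IDs are the ones from the listings above and will differ on any other instance.

```sql
-- Map Performance Schema thread IDs to their thread name, state and
-- total instrumented memory. Replace the IDs with those from your own
-- memory_summary_by_thread_by_event_name output.
SELECT thd_id, conn_id, user, command, state, current_memory
  FROM sys.processlist
 WHERE thd_id IN (18183, 18189);
```

For background threads the user column holds the thread name, which is how the two threads are identified as sql/slave_sql, one per replication channel.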
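Solution :
Turn off binlog_rows_query_log_events (it is off by default).
This parameter mainly controls whether the original SQL statement is recorded in the binlog, and is mostly useful for debugging;
when the binlog is decoded with mysqlbinlog -vv, the output already annotates row-format events with the corresponding SQL, and its readability is acceptable.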
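A minimal sketch of checking and disabling the setting at runtime follows. The post does not say on which instance or by which method the change was made, so treat the choice of server as an assumption and apply it wherever the option was enabled.

```sql
-- Check the current global value.
SHOW GLOBAL VARIABLES LIKE 'binlog_rows_query_log_events';

-- The variable is dynamic, so it can be turned off without a restart.
-- Sessions that are already open keep their own session value.
SET GLOBAL binlog_rows_query_log_events = OFF;
```

MySQL 5.7 has no SET PERSIST, so the line also has to be removed from (or set to OFF in) the [mysqld] section of my.cnf, otherwise the setting returns after the next restart.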
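The bug is currently rated S2 (Serious).
After turning this setting off, memory behaves as shown in the second half of the figure above: there is basically no visible upward trend any more.
Note that this does not necessarily mean the memory leak is gone for good; hopefully upstream fixes it soon~
PS: The Null tests keep getting put off, I just can't bring myself to write them _(:з」∠)_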