解决python3 pika之连接断开的问题
问题描述
在消费rabbitMQ队列时, 每次进入回调函数内需要进行一些比较耗时的操作;操作完成后给rabbitMQ server发送ack信号以dequeue本条消息。
问题就发生在发送ack操作时, 程序提示链接已被断开或socket error。
源码示例
#!/usr/bin #coding: utf-8 import pika import time USER = 'guest' PWD = 'guest' TEST_QUEUE = 'just4test' def callback(ch, method, properties, body): print(body) time.sleep(600) ch.basic_publish('', routing_key=TEST_QUEUE, body="fortest") ch.basic_ack(delivery_tag = method.delivery_tag) def test_main(): s_conn = pika.BlockingConnection( pika.ConnectionParameters('127.0.0.1', credentials=pika.PlainCredentials(USER, PWD))) chan = s_conn.channel() chan.queue_declare(queue=TEST_QUEUE) chan.basic_publish('', routing_key=TEST_QUEUE, body="fortest") chan.basic_consume(callback, queue=TEST_QUEUE) chan.start_consuming() if __name__ == "__main__": test_main()
运行一段时间后, 就会报错:
[ERROR][pika.adapters.base_connection][2017-08-18 12:33:49]Error event 25, None [CRITICAL][pika.adapters.base_connection][2017-08-18 12:33:49]Tried to handle an error where no error existed [ERROR][pika.adapters.base_connection][2017-08-18 12:33:49]Fatal Socket Error: BrokenPipeError(32, 'Broken pipe')
问题排查
猜测:pika客户端没有及时发送心跳,连接被server断开
一开始修改了heartbeat_interval参数值, 示例如下:
def test_main(): s_conn = pika.BlockingConnection( pika.ConnectionParameters('127.0.0.1', heartbeat_interval=10, socket_timeout=5, credentials=pika.PlainCredentials(USER, PWD))) # ....
修改后运行依然报错,后来想想应该单线程被一直占用,pika无法发送心跳;
于是又加了个心跳线程, 示例如下:
#!/usr/bin #coding: utf-8 import pika import time import logging import threading USER = 'guest' PWD = 'guest' TEST_QUEUE = 'just4test' class Heartbeat(threading.Thread): def __init__(self, connection): super(Heartbeat, self).__init__() self.lock = threading.Lock() self.connection = connection self.quitflag = False self.stopflag = True self.setDaemon(True) def run(self): while not self.quitflag: time.sleep(10) self.lock.acquire() if self.stopflag : self.lock.release() continue try: self.connection.process_data_events() except Exception as ex: logging.warn("Error format: %s"%(str(ex))) self.lock.release() return self.lock.release() def startHeartbeat(self): self.lock.acquire() if self.quitflag==True: self.lock.release() return self.stopflag=False self.lock.release() def callback(ch, method, properties, body): logging.info("recv_body:%s" % body) time.sleep(600) ch.basic_ack(delivery_tag = method.delivery_tag) def test_main(): s_conn = pika.BlockingConnection( pika.ConnectionParameters('127.0.0.1', heartbeat_interval=10, socket_timeout=5, credentials=pika.PlainCredentials(USER, PWD))) chan = s_conn.channel() chan.queue_declare(queue=TEST_QUEUE) chan.basic_consume(callback, queue=TEST_QUEUE) heartbeat = Heartbeat(s_conn) heartbeat.start() #开启心跳线程 heartbeat.startHeartbeat() chan.start_consuming() if __name__ == "__main__": test_main()
尝试运行,结果还是不行,不得不安静下来思考自己是不是想错了。
去看它的api,看到heartbeat_interval的解析:
:param int heartbeat_interval: How often to send heartbeats. Min between this value and server's proposal will be used. Use 0 to deactivate heartbeats and None to accept server's proposal.
按这样说法,应该还是没有把心跳值给设置好。上面的程序期望是10秒发一次心跳,但是理论上发送心跳的间隔会比10秒多一点。所以艾玛,我应该是把heartbeat_interval的作用搞错了, 它是指超过这个时间间隔不发心跳或不给server任何信息,server就会断开连接, 而不是说pika会按这个间隔来发心跳。 结果我把heartbeat_interval值设置高一点(比实际发送心跳/信息的间隔更长),比如上面设置成60秒,就正常运行了。
如果不指定heartbeat_interval, 它默认为None, 意味着按rabbitMQ server的配置来检测心跳是否正常。
如果设置heartbeat_interval=0, 意味着不检测心跳,server端将不会主动断开连接。