Redis连接超时异常的处理方法

0、问题描述

使用Jedis连接redis进行数据查询操作,正常的代码运行没有问题,但是时不时会报出如下错误:

Exception in thread "main" redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
 at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:202)
 at redis.clients.util.RedisInputStream.read(RedisInputStream.java:181)
 at redis.clients.jedis.Protocol.processBulkReply(Protocol.java:181)
 at redis.clients.jedis.Protocol.process(Protocol.java:155)
 at redis.clients.jedis.Protocol.processMultiBulkReply(Protocol.java:206)
 at redis.clients.jedis.Protocol.process(Protocol.java:157)
 at redis.clients.jedis.Protocol.processMultiBulkReply(Protocol.java:206)
 at redis.clients.jedis.Protocol.process(Protocol.java:157)
 at redis.clients.jedis.Protocol.read(Protocol.java:215)
 at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:340)
 at redis.clients.jedis.Connection.getRawObjectMultiBulkReply(Connection.java:285)
 at redis.clients.jedis.Connection.getObjectMultiBulkReply(Connection.java:291)
 at redis.clients.jedis.BinaryJedis.hscan(BinaryJedis.java:3390)
 at com.ict.mcg.filter.DuplicateClueFilterV2.hscan(DuplicateClueFilterV2.java:867)
 at com.ict.mcg.filter.DuplicateClueFilterV2.collectRecentCluekeywords(DuplicateClueFilterV2.java:487)
 at com.ict.mcg.main.GetCluesMain.run_online(GetCluesMain.java:208)
 at com.ict.mcg.main.GetCluesMain.main(GetCluesMain.java:1685)
Caused by: java.net.SocketTimeoutException: Read timed out
 at java.net.SocketInputStream.socketRead0(Native Method)
 at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
 at java.net.SocketInputStream.read(SocketInputStream.java:171)
 at java.net.SocketInputStream.read(SocketInputStream.java:141)
 at java.net.SocketInputStream.read(SocketInputStream.java:127)
 at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:196)
 ... 16 more

究其原因,可以定位为java.net.SocketTimeoutException: Read timed out,即网络连接异常;

1、 可能的原因

1.1 服务器资源包括内存、磁盘、cpu等利用率高

经过查看redis部署机器的状态信息,发现整体机器运行状态良好

1.2 服务器设置防火墙,导致连接失败

因为正常的代码流程都可以跑通,所以防火墙设置没有问题;

1.3 redis配置文件bind监听host配置不当

redis的配置文件中bind对应host的配置如下:

# By default Redis listens for connections from all the network interfaces
# available on the server. It is possible to listen to just one or multiple
# interfaces using the "bind" configuration directive, followed by one or
# more IP addresses.
#
# Examples:
#
# bind 192.168.1.100 10.0.0.1
# bind 127.0.0.1

默认的bind绑定的host为0.0.0.0,即可以监听每一个可用的网络接口;相当于配置为:

bind 0.0.0.0

我们的配置文件也配置正常,而且正常的代码流程运行正常,也可以佐证这一点;

1.4 Jedis使用配置问题

目前Jedis的连接池配置如下:

private static JedisPool getPool() {
    if (pool == null) {
      JedisPoolConfig config = new JedisPoolConfig();
      //控制一个pool可分配多少个jedis实例,通过pool.getResource()来获取;
      //如果赋值为-1,则表示不限制;如果pool已经分配了maxActive个jedis实例,则此时pool的状态为exhausted(耗尽)。
      config.setMaxActive(10);
      //控制一个pool最多有多少个状态为idle(空闲的)的jedis实例。
      config.setMaxIdle(2);
      //表示当borrow(引入)一个jedis实例时,最大的等待时间,如果超过等待时间,则直接抛出JedisConnectionException;
      config.setMaxWait(1000 * 200000);
      //在borrow一个jedis实例时,是否提前进行validate操作;如果为true,则得到的jedis实例均是可用的;
      config.setTestOnBorrow(true);
      config.setTestOnReturn(true);

      //目前redis只有一个服务器
      pool = new JedisPool(config, "localhost", 6379);
    }
    return pool;
  }

  private static Jedis getJedis() {
    Jedis jedis = null;
    int count = 0;
    do {
      try {
        pool = getPool();
        jedis = pool.getResource();
      } catch(Exception e) {
//    System.out.println(e.getMessage());
        e.printStackTrace();
        pool.returnBrokenResource(jedis);
      }

      count++;
    } while (jedis==null && count < 3);

    return jedis;
  }

构建JedisPool的逻辑中,只是设置了config.setMaxWait(1000 * 200000);,这个是引入新的jedis实例的最大等待时间,并没有进行其他相关的连接超时的配置;于是查看JedisPool的源代码,发现如下:

public JedisPool(final Config poolConfig, final String host) {
    this(poolConfig, host, Protocol.DEFAULT_PORT, Protocol.DEFAULT_TIMEOUT, null, Protocol.DEFAULT_DATABASE);
  }

  public JedisPool(String host, int port) {
    this(new Config(), host, port, Protocol.DEFAULT_TIMEOUT, null, Protocol.DEFAULT_DATABASE);
  }

  public JedisPool(final String host) {
    this(host, Protocol.DEFAULT_PORT);
  }

  public JedisPool(final Config poolConfig, final String host, int port,
      int timeout, final String password) {
    this(poolConfig, host, port, timeout, password, Protocol.DEFAULT_DATABASE);
  }

  public JedisPool(final Config poolConfig, final String host, final int port) {
    this(poolConfig, host, port, Protocol.DEFAULT_TIMEOUT, null, Protocol.DEFAULT_DATABASE);
  }

  public JedisPool(final Config poolConfig, final String host, final int port, final int timeout) {
    this(poolConfig, host, port, timeout, null, Protocol.DEFAULT_DATABASE);
  }

  public JedisPool(final Config poolConfig, final String host, int port, int timeout, final String password,
          final int database) {
    super(poolConfig, new JedisFactory(host, port, timeout, password, database));
  }

由上述代码可以看到,JedisPool有多个重载的构造函数,并且构造函数中需要传入一个timeout参数作为连接的超时时间,如果没有传,则采用Protocol.DEFAULT_TIMEOUT作为默认的超时时间,继续跟踪源码:

public final class Protocol {

  public static final int DEFAULT_PORT = 6379;
  public static final int DEFAULT_TIMEOUT = 2000;
  public static final int DEFAULT_DATABASE = 0;

  public static final String CHARSET = "UTF-8";

  public static final byte DOLLAR_BYTE = '$';
  public static final byte ASTERISK_BYTE = '*';
  public static final byte PLUS_BYTE = '+';
  public static final byte MINUS_BYTE = '-';
  public static final byte COLON_BYTE = ':';

  private Protocol() {
 // this prevent the class from instantiation
  }

可以得出结论,默认JedisPool中连接的默认超时时间为2秒,而我们调用的JedisPool构造函数,恰恰采用的是这个配置,只要两秒钟没有连接成功,redis的连接就断开,从而报错,这在数据库请求并发量比较大的时候是有可能发生的,遂做如下更改,在创建JedisPool的时候,传入一个较大的超时时间:

pool = new JedisPool(config, ParamUtil.REDIS_ADDRESS[0], ParamUtil.REDIS_PORT, 1000 * 10);

2、总结

遇到问题还是多查,多看源码,多看源码中的配置,仔细一项一项地排查问题!

相关推荐