[Go] Reading HTTP request and response data gracefully

2019-06-30

Original link: https://blog.thinkeridea.com/...

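There are many ways to read data from http.Request.Body or http.Response.Body. Most code in the standard library calls ioutil.ReadAll to read everything at once, and for JSON data you can also use json.NewDecoder to create a decoder from an io.Reader. If you profile such a program with pprof, you will often find that bytes.makeSlice allocates a large amount of memory and consistently ranks first. This post looks at how to read HTTP data efficiently and gracefully.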
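To make the two approaches concrete, here is a minimal sketch that decodes the same JSON body both ways. The payload type and the decodeBoth helper are hypothetical names invented for this illustration; only ioutil.ReadAll, json.Unmarshal and json.NewDecoder come from the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"io/ioutil"
	"strings"
)

// payload is a hypothetical request body used only for this illustration.
type payload struct {
	Name string `json:"name"`
}

// decodeWithReadAll reads the whole body into memory, then unmarshals it.
func decodeWithReadAll(r io.Reader) (payload, error) {
	var p payload
	data, err := ioutil.ReadAll(r)
	if err != nil {
		return p, err
	}
	err = json.Unmarshal(data, &p)
	return p, err
}

// decodeWithDecoder lets json.Decoder pull bytes from the reader itself.
func decodeWithDecoder(r io.Reader) (payload, error) {
	var p payload
	err := json.NewDecoder(r).Decode(&p)
	return p, err
}

// decodeBoth runs both strategies on the same JSON text and returns the
// decoded names so they can be compared.
func decodeBoth(body string) (string, string, error) {
	a, err := decodeWithReadAll(strings.NewReader(body))
	if err != nil {
		return "", "", err
	}
	b, err := decodeWithDecoder(strings.NewReader(body))
	if err != nil {
		return "", "", err
	}
	return a.Name, b.Name, nil
}

func main() {
	a, b, err := decodeBoth(`{"name":"gopher"}`)
	fmt.Println(a, b, err)
}
```

In a real handler the reader would be r.Body of an *http.Request or resp.Body of an *http.Response instead of a strings.Reader.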

Background

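We run many API services, all of which use JSON: the request body is the entire JSON string. When a request reaches the server it goes through some business logic and then triggers requests to further downstream services. All services talk to each other over HTTP. (Why not RPC? Because every service is also open to third parties, and http + json is easier to integrate with.) Most request bodies are 1K~4K and most responses are 1K~8K. Early on, every service used ioutil.ReadAll to read the data. As traffic grew, pprof showed that bytes.makeSlice always ranked first and accounted for about 1/10 of the program's total memory allocation, so I decided to optimize it. What follows is a record of the whole process.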

pprof analysis

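Here I use the API in https://github.com/thinkeridea/go-extend/blob/master/exnet/exhttp/expprof/pprof.go to expose the /debug/pprof monitoring endpoints in production. I did not use the standard library's net/http/pprof package directly because it registers its routes automatically and leaves the API open permanently. This package lets you decide whether the API is open and closes it automatically after a set time, so that scanning tools cannot probe it.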
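The idea of a profiling endpoint that closes itself can be sketched as follows. This is not the expprof API (its functions are not shown here); gate, newMux, statusFor and demoStatuses are hypothetical names, and the sketch simply wraps the standard pprof.Index handler on a private mux.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
	"sync/atomic"
	"time"
)

// gate stores the Unix-nanosecond deadline until which profiling is exposed.
// The zero value means the endpoints are closed.
type gate struct{ deadline int64 }

// Open exposes the endpoints for d; after that they close on their own.
func (g *gate) Open(d time.Duration) {
	atomic.StoreInt64(&g.deadline, time.Now().Add(d).UnixNano())
}

// Close hides the endpoints immediately.
func (g *gate) Close() { atomic.StoreInt64(&g.deadline, 0) }

// Wrap answers 404 unless the gate is currently open.
func (g *gate) Wrap(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if time.Now().UnixNano() >= atomic.LoadInt64(&g.deadline) {
			http.NotFound(w, r)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// newMux registers the pprof index on a private mux. Importing net/http/pprof
// still registers routes on http.DefaultServeMux, so this only helps if the
// server does not serve the default mux.
func newMux(g *gate) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/debug/pprof/", g.Wrap(http.HandlerFunc(pprof.Index)))
	return mux
}

// statusFor performs an in-memory GET and returns the status code.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("GET", path, nil))
	return rec.Code
}

// demoStatuses requests the allocs profile while the gate is closed, open,
// and closed again.
func demoStatuses() (closed, open, reclosed int) {
	g := &gate{}
	mux := newMux(g)
	const path = "/debug/pprof/allocs?debug=1"
	closed = statusFor(mux, path)
	g.Open(time.Minute)
	open = statusFor(mux, path)
	g.Close()
	reclosed = statusFor(mux, path)
	return
}

func main() {
	fmt.Println(demoStatuses())
}
```

A production version would also need some authenticated way to call Open, which is left out of this sketch.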

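Once the service had been deployed and was running stably (after about a day and a half), I downloaded the allocs data with curl and inspected it with the following command.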

$ go tool pprof allocs
File: xxx
Type: alloc_space
Time: Jan 25, 2019 at 3:02pm (CST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 604.62GB, 44.50% of 1358.61GB total
Dropped 776 nodes (cum <= 6.79GB)
Showing top 10 nodes out of 155
      flat  flat%   sum%        cum   cum%
  111.40GB  8.20%  8.20%   111.40GB  8.20%  bytes.makeSlice
  107.72GB  7.93% 16.13%   107.72GB  7.93%  github.com/sirupsen/logrus.(*Entry).WithFields
   65.94GB  4.85% 20.98%    65.94GB  4.85%  strings.Replace
   54.10GB  3.98% 24.96%    56.03GB  4.12%  github.com/json-iterator/go.(*frozenConfig).Marshal
   47.54GB  3.50% 28.46%    47.54GB  3.50%  net/url.unescape
   47.11GB  3.47% 31.93%    48.16GB  3.55%  github.com/json-iterator/go.(*Iterator).readStringSlowPath
   46.63GB  3.43% 35.36%   103.04GB  7.58%  handlers.(*AdserviceHandler).returnAd
   42.43GB  3.12% 38.49%    84.62GB  6.23%  models.LogItemsToBytes
   42.22GB  3.11% 41.59%    42.22GB  3.11%  strings.Join
   39.52GB  2.91% 44.50%    87.06GB  6.41%  net/url.parseQuery

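The output shows that 1358.61GB was allocated in total during the sampling period and that the top 10 account for 44.50% of it. bytes.makeSlice alone accounts for 8.20%, close to 1/10. So let's see who is calling bytes.makeSlice.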

(pprof) web bytes.makeSlice

[Image: pprof call graph for bytes.makeSlice]

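The graph above shows that the calls to bytes.makeSlice ultimately come from ioutil.ReadAll (for reasons of space, the callers above ioutil.ReadAll are not captured), and 90% of them are ioutil.ReadAll reading HTTP data. Having found the spot, don't rush into an optimization yet; first look at why ioutil.ReadAll causes so many memory allocations.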

func readAll(r io.Reader, capacity int64) (b []byte, err error) {
    var buf bytes.Buffer
    // If the buffer overflows, we will get bytes.ErrTooLarge.
    // Return that as an error. Any other panic remains.
    defer func() {
        e := recover()
        if e == nil {
            return
        }
        if panicErr, ok := e.(error); ok && panicErr == bytes.ErrTooLarge {
            err = panicErr
        } else {
            panic(e)
        }
    }()
    if int64(int(capacity)) == capacity {
        buf.Grow(int(capacity))
    }
    _, err = buf.ReadFrom(r)
    return buf.Bytes(), err
}

func ReadAll(r io.Reader) ([]byte, error) {
    return readAll(r, bytes.MinRead)
}

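The above is the standard library's code for ioutil.ReadAll. Every call creates a new var buf bytes.Buffer and sets its initial size with buf.Grow(int(capacity)) to bytes.MinRead, which is 512. With a buffer of that size, reading one of our payloads takes 2~16 memory allocations. That is hard to accept, so why not create the buffer myself?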
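One way to "create the buffer myself" is to size it up front so that bytes.Buffer does not have to grow step by step from 512 bytes. The following is a minimal sketch under stated assumptions, not necessarily the approach the original author ended up with: readBody, readString, defaultSize and maxHint are hypothetical names, and the 4K default is only a guess based on the 1K~8K payload sizes mentioned earlier.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// readBody reads r into a buffer whose capacity is chosen from sizeHint
// (for example http.Request.ContentLength, which is -1 when unknown).
func readBody(r io.Reader, sizeHint int64) ([]byte, error) {
	const defaultSize = 4 << 10 // assumption: typical bodies are 1K~8K
	const maxHint = 1 << 20     // assumption: never trust a huge hint blindly
	if sizeHint <= 0 || sizeHint > maxHint {
		sizeHint = defaultSize
	}
	// ReadFrom always wants bytes.MinRead of free space before each read,
	// including the final read that returns io.EOF, so reserve that much
	// extra to avoid a reallocation when the body fits the hint.
	buf := bytes.NewBuffer(make([]byte, 0, int(sizeHint)+bytes.MinRead))
	_, err := buf.ReadFrom(r)
	return buf.Bytes(), err
}

// readString is a small helper for trying readBody on an in-memory string.
func readString(s string, sizeHint int64) (string, error) {
	b, err := readBody(strings.NewReader(s), sizeHint)
	return string(b), err
}

func main() {
	body := strings.Repeat("x", 3000)
	got, err := readString(body, int64(len(body)))
	fmt.Println(len(got), err)
}
```

If the hint is too small the buffer still grows as usual, so the result stays correct and only the allocation count suffers.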

Take a look at the flame graph
