[Netty from the Source] Netty Channel
Where Channel Fits
Note: unless stated otherwise, "Channel" in this article means the Netty Channel (io.netty.channel).
A week spent studying the Channel family once had me questioning everything: is digging into this one method even useful? Is learning Netty a bit like leaving the highway to take a country road? Why am I reading source code at all?
These doubts arose partly because I have an overactive inner monologue, but the real cause was that I had not figured out where Channel fits in the Netty architecture. Not clearly understanding Netty's own positioning quietly made things worse.
Some first-principles thinking: Netty is an NIO framework, one grafted on top of java NIO.
At a high level it can be pictured as in the diagram below:
Before diving into Channel, let's review how IO has evolved, focusing on how the structure of IO frameworks changed. Once that is clear, the role Channel plays in the IO world becomes obvious!
The Evolution of IO
BIO
The diagram above already shows an optimized BIO setup: it uses a thread pool. Even so, every client costs the server one Thread. With the pool in place the thread count becomes a hard cap, so a surge of clients still leaves demand unmet.
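To make the per-connection cost concrete, here is a minimal thread-pool BIO echo server; the class name, port, and pool size are made up for illustration:

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Minimal BIO echo server: every client occupies one worker thread for its whole lifetime,
// so the pool size is a hard ceiling on concurrent clients.
public class BioEchoServer {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(50);
        try (ServerSocket server = new ServerSocket(8080)) {
            while (true) {
                Socket socket = server.accept();       // blocks until a client connects
                pool.execute(() -> {
                    try (InputStream in = socket.getInputStream();
                         OutputStream out = socket.getOutputStream()) {
                        byte[] buf = new byte[1024];
                        int n;
                        while ((n = in.read(buf)) != -1) { // blocking read ties up the thread
                            out.write(buf, 0, n);          // echo the bytes back
                        }
                    } catch (Exception ignored) {
                        // connection reset etc.; nothing to do in this sketch
                    }
                });
            }
        }
    }
}
```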
NIO
- The Acceptor registers with a Selector and listens for accept events
- When a client connects, an accept event fires
- The server builds the corresponding Channel and registers it with the Selector to listen for read/write events
- When a read/write event occurs, the corresponding read/write handling is performed (see the sketch below)
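These four steps map almost one-to-one onto the java.nio API. A stripped-down, illustrative selector loop (the port and buffer size are arbitrary):

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Single-threaded NIO echo server: one Selector multiplexes accept and read events for all channels.
public class NioEchoServer {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel serverChannel = ServerSocketChannel.open();
        serverChannel.configureBlocking(false);
        serverChannel.bind(new InetSocketAddress(8080));
        serverChannel.register(selector, SelectionKey.OP_ACCEPT);    // acceptor listens for accept events

        while (true) {
            selector.select();                                       // blocks until some channel is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {                            // a client connected
                    SocketChannel client = serverChannel.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ); // the new channel listens for read events
                } else if (key.isReadable()) {                       // a client sent data
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(1024);
                    int n = client.read(buf);
                    if (n == -1) {                                   // peer closed the connection
                        key.cancel();
                        client.close();
                    } else {
                        buf.flip();
                        client.write(buf);                           // echo the bytes back
                    }
                }
            }
        }
    }
}
```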
Single-threaded Reactor
This looks much like the NIO model, and it naturally shares the same problem: a single selector/reactor thread handles every operation on many channels, so if event handling on one channel stalls, the other channels suffer.
Multi-threaded Reactor
Keep the IO work (read/write) where it is, and give the non-IO work (business logic) its own thread pool; this evolves into the multi-threaded reactor model:
With this architecture the system bottleneck shifts to the Reactor, which so far has been dutifully doing two jobs:
1. Accepting client connection requests
2. Handling IO reads and writes
Main-Sub Reactor
Split the job of accepting client connections out once more:
Netty is exactly a practitioner of the main-sub reactor model; think of the code used to create a server:
```java
EventLoopGroup bossGroup = new NioEventLoopGroup(1);
EventLoopGroup workerGroup = new NioEventLoopGroup();
ServerBootstrap b = new ServerBootstrap();
b.group(bossGroup, workerGroup)
 ...
```
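For reference, a minimal runnable version of that bootstrap, with bossGroup as the main reactor (accepting connections) and workerGroup as the sub reactors (handling IO on the accepted channels); the handler and port are placeholders of mine, not part of the original snippet:

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.logging.LoggingHandler;

public class MainSubReactorServer {
    public static void main(String[] args) throws Exception {
        EventLoopGroup bossGroup = new NioEventLoopGroup(1);   // main reactor: accepts connections
        EventLoopGroup workerGroup = new NioEventLoopGroup();  // sub reactors: read/write on accepted channels
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(bossGroup, workerGroup)
             .channel(NioServerSocketChannel.class)
             .childHandler(new ChannelInitializer<SocketChannel>() {
                 @Override
                 protected void initChannel(SocketChannel ch) {
                     ch.pipeline().addLast(new LoggingHandler()); // placeholder handler
                 }
             });
            ChannelFuture f = b.bind(8080).sync();
            f.channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}
```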
The channel (the java channel) first appears in the model diagrams of the NIO era; its job is the low-level interactions such as connect, write, read and close. In short, the java channel sits below the selector and above the socket. The netty channel, in turn, treats the java channel as its underlying layer.
Source Code Analysis
Class Structure
With Channel's role clear, let's analyze its commonly used APIs.
First, the class diagram:
Channel actually contains another hierarchy of its own, the Unsafe family:
Unsafe is an inner interface of Channel; the heavy lifting of interacting with the java channel ultimately falls to Unsafe.
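To give a feel for how low-level those responsibilities are, here is an abridged view of the Channel.Unsafe contract as I recall it from Netty 4.x; treat the exact signatures as approximate and check the source:

```java
// Abridged sketch of io.netty.channel.Channel.Unsafe (Netty 4.x); not the complete interface.
interface Unsafe {
    void register(EventLoop eventLoop, ChannelPromise promise); // tie this channel to an EventLoop (selector thread)
    void bind(SocketAddress localAddress, ChannelPromise promise);
    void connect(SocketAddress remoteAddress, SocketAddress localAddress, ChannelPromise promise);
    void disconnect(ChannelPromise promise);
    void close(ChannelPromise promise);
    void beginRead();                                            // start reading, i.e. set the read interest op
    void write(Object msg, ChannelPromise promise);              // stage data in the ChannelOutboundBuffer
    void flush();                                                // push staged data down to the java channel
    ChannelOutboundBuffer outboundBuffer();
}
```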
The write method
write only puts the data into the ChannelOutboundBuffer; nothing is actually sent yet. The data is written to the java channel and sent to the peer only when flush is called. Below is the write method (defined in AbstractChannel's inner class AbstractUnsafe), with the noteworthy spots annotated:
```java
@Override
public final void write(Object msg, ChannelPromise promise) {
    assertEventLoop();

    ChannelOutboundBuffer outboundBuffer = this.outboundBuffer;
    if (outboundBuffer == null) {
        // If the outboundBuffer is null we know the channel was closed and so
        // need to fail the future right away. If it is not null the handling of the rest
        // will be done in flush0()
        // See https://github.com/netty/netty/issues/2362
        safeSetFailure(promise, WRITE_CLOSED_CHANNEL_EXCEPTION);
        // release message now to prevent resource-leak
        ReferenceCountUtil.release(msg);
        return;
    }

    int size;
    try {
        msg = filterOutboundMessage(msg); // wrap/convert the message, e.g. into a ByteBuf
        size = pipeline.estimatorHandle().size(msg);
        if (size < 0) {
            size = 0;
        }
    } catch (Throwable t) {
        safeSetFailure(promise, t);
        ReferenceCountUtil.release(msg);
        return;
    }

    outboundBuffer.addMessage(msg, size, promise); // write the msg into the ChannelOutboundBuffer
}
```
In the last line above, the msg is appended at the ChannelOutboundBuffer's tail node tailEntry, and unflushedEntry is assigned at the same time to stash it as pending data. Expanded, the code looks like this:
```java
public void addMessage(Object msg, int size, ChannelPromise promise) {
    Entry entry = Entry.newInstance(msg, size, total(msg), promise);
    if (tailEntry == null) {
        flushedEntry = null;
        tailEntry = entry;
    } else {
        Entry tail = tailEntry;
        tail.next = entry;
        tailEntry = entry;
    }
    if (unflushedEntry == null) { // note 1: mark the data as "not yet flushed"
        unflushedEntry = entry;
    }

    incrementPendingOutboundBytes(entry.pendingSize, false);
}
```
The ChannelOutboundBuffer class
A brief word on the ChannelOutboundBuffer class; as usual, start with the class-level javadoc.
```java
/**
 * (Transport implementors only) an internal data structure used by {@link AbstractChannel} to store its pending
 * outbound write requests.
 *
 * ... (remainder omitted)
 */
```
As mentioned earlier, the write method stores messages in the ChannelOutboundBuffer as a staging area; the later flush then pushes them to the java channel and on to the client.
A diagram to make this easier to picture:
The three fields shown in the diagram are the key players as data flows through write -> ChannelOutboundBuffer -> flush. What is Entry? A static inner class of ChannelOutboundBuffer, and a classic linked-list node:
```java
static final class Entry {
    Entry next;
    // ...
}
```
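To make the pointer movement concrete, here is my own text sketch of the three fields, first after a few write calls and then after addFlush:

```
After three write() calls (nothing flushed yet):

    flushedEntry = null
    unflushedEntry ──► E1 ──► E2 ──► E3 ◄── tailEntry

After addFlush():

    flushedEntry ──► E1 ──► E2 ──► E3 ◄── tailEntry
    unflushedEntry = null

flush0()/doWrite() then consumes entries starting from flushedEntry, removing each one
as it is fully written to the java channel.
```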
The final part of the write method (at note 1) calls outboundBuffer.addMessage(msg, size, promise), which assigns the Entry wrapping the msg to tailEntry and unflushedEntry; the flush method then calls outboundBuffer.addFlush() (at note 2, below), which indirectly hands unflushedEntry over to flushedEntry.
```java
public void addFlush() {
    Entry entry = unflushedEntry;
    if (entry != null) {
        if (flushedEntry == null) {
            // there is no flushedEntry yet, so start with the entry
            flushedEntry = entry;
        }
        do {
            flushed ++;
            if (!entry.promise.setUncancellable()) {
                // Was cancelled so make sure we free up memory and notify about the freed bytes
                int pending = entry.cancel();
                decrementPendingOutboundBytes(pending, false, true);
            }
            entry = entry.next;
        } while (entry != null);

        // All flushed so reset unflushedEntry
        unflushedEntry = null;
    }
}
```
The flush method
Let's start directly from the flush method in AbstractChannel (again in its inner class AbstractUnsafe); starting from Channel's flush would go through the pipeline and a much longer call chain, which is omitted here:
```java
public final void flush() {
    assertEventLoop();

    ChannelOutboundBuffer outboundBuffer = this.outboundBuffer;
    if (outboundBuffer == null) {
        return;
    }

    outboundBuffer.addFlush(); // note 2: mark the data as "flushed"
    flush0();                  // actually process the data
}
```
The outboundBuffer.addFlush() method has already been covered. Following the call chain flush0 -> doWrite, let's look at AbstractNioByteChannel's doWrite method:
```java
@Override
protected void doWrite(ChannelOutboundBuffer in) throws Exception {
    int writeSpinCount = config().getWriteSpinCount(); // spin counter limiting the loop, default 16
    do {
        Object msg = in.current(); // the msg held by flushedEntry
        if (msg == null) {
            // Wrote all messages.
            clearOpWrite();
            // Directly return here so incompleteWrite(...) is not called.
            return;
        }
        writeSpinCount -= doWriteInternal(in, msg);
    } while (writeSpinCount > 0);

    incompleteWrite(writeSpinCount < 0);
}
```
writeSpinCount is a spin counter, similar in spirit to a spin lock. It keeps the current IO thread from looping over the write operation indefinitely (because of network conditions, for example), which would leave the thread effectively hung and waste resources.
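The spin limit is configurable per channel; if the default of 16 does not suit your traffic pattern, it can be overridden through the standard channel option (the value below is only illustrative):

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelOption;

public class WriteSpinTuning {
    static ServerBootstrap tune(ServerBootstrap b) {
        // writeSpinCount defaults to 16; a smaller value gives up sooner and reschedules the flush.
        return b.childOption(ChannelOption.WRITE_SPIN_COUNT, 8);
    }
}
```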
Now look at doWriteInternal; as before, the key spots carry comments:
```java
private int doWriteInternal(ChannelOutboundBuffer in, Object msg) throws Exception {
    if (msg instanceof ByteBuf) {
        ByteBuf buf = (ByteBuf) msg;
        if (!buf.isReadable()) { // writerIndex - readerIndex > 0 ? true : false
            in.remove();
            return 0;
        }

        final int localFlushedAmount = doWriteBytes(buf); // number of bytes actually written to the java channel
        if (localFlushedAmount > 0) { // the write succeeded
            in.progress(localFlushedAmount);
            /*
             * 1. Everything was written: call in.remove()
             * 2. Partial-write ("half packet") case: simply return 1.
             *    The caller's spin counter writeSpinCount drops to 15 and this method runs again on the next iteration.
             */
            if (!buf.isReadable()) {
                in.remove();
            }
            return 1;
        }
    } else if (msg instanceof FileRegion) {
        // handling of FileRegion (file-type) messages omitted ..
    } else {
        // Should not reach here.
        throw new Error();
    }
    return WRITE_STATUS_SNDBUF_FULL; // the send buffer is full; value = Integer.MAX_VALUE
}
```
Back in doWrite, the final step executes incompleteWrite(writeSpinCount < 0):
```java
protected final void incompleteWrite(boolean setOpWrite) {
    // Did not write completely.
    if (setOpWrite) {
        setOpWrite();
    } else {
        // Schedule flush again later so other tasks can be picked up in the meantime
        Runnable flushTask = this.flushTask;
        if (flushTask == null) {
            flushTask = this.flushTask = new Runnable() {
                @Override
                public void run() {
                    flush();
                }
            };
        }
        eventLoop().execute(flushTask);
    }
}
```
The design here is rather interesting:
- If setOpWrite = (writeSpinCount < 0) = true, i.e. doWriteInternal returned WRITE_STATUS_SNDBUF_FULL (the send buffer is full), the write interest bit is set:
```java
protected final void setOpWrite() {
    final SelectionKey key = selectionKey();
    // Check first if the key is still valid as it may be canceled as part of the deregistration
    // from the EventLoop
    // See https://github.com/netty/netty/issues/2104
    if (!key.isValid()) {
        return;
    }
    final int interestOps = key.interestOps();
    if ((interestOps & SelectionKey.OP_WRITE) == 0) {
        key.interestOps(interestOps | SelectionKey.OP_WRITE);
    }
}
```
In effect this just sets the OP_WRITE bit on the SelectionKey, so the write will be attempted again on the selector/reactor's next poll.
- If setOpWrite = (writeSpinCount < 0) = false, i.e. doWriteInternal kept returning 1 and 16 partial writes still did not get the message out, the flush is re-submitted to the event loop as a task and executed again:
```java
public Channel flush() {
    pipeline.flush();
    return this;
}
```
Conclusion: in the first case the send buffer is full and no more data can be written, so we pin our hopes on the selector's next poll; in the second case the incomplete send was probably just due to the limited spin count, so the flush task is re-queued on the event loop and driven through the pipeline again, without waiting on the selector.
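For completeness, the selector-side counterpart of the first branch lives in NioEventLoop.processSelectedKey, which reacts to OP_WRITE readiness by forcing another flush; the excerpt below is paraphrased from Netty 4.x, so consult the source for the exact code:

```java
// Inside NioEventLoop.processSelectedKey(SelectionKey k, AbstractNioChannel ch), roughly:
if ((readyOps & SelectionKey.OP_WRITE) != 0) {
    // The socket became writable again: force a flush, which also clears OP_WRITE
    // once there is nothing left to write.
    ch.unsafe().forceFlush();
}
```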
This is clearly an optimization; the craftsmanship shows in the details!