Data structures in the Linux kernel: doubly linked list
Doubly linked list
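The Linux kernel provides its own implementation of a doubly linked list, which you can find in include/linux/list.h. We begin this look at kernel data structures with the doubly linked list. Why? Because it is used very widely in the kernel; a search on free-electrons.com will show you just how widely.

First, let's look at the main structure, defined in include/linux/types.h: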

    struct list_head {
        struct list_head *next, *prev;
    };

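You may notice that this differs from many doubly linked list implementations you have seen before. For example, the glib library implements it like this: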

    struct GList {
        gpointer data;
        GList *next;
        GList *prev;
    };

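Usually a linked list node contains a pointer to the item it holds. The kernel's implementation does not. So the question is: where does the list store its data? The kernel's list is an intrusive list. An intrusive list does not keep data in its nodes: a node holds only the pointers to the next and previous nodes, and the node itself is embedded in the data structure that is added to the list. This makes the list generic, because it does not need to know the type of its entries.

For example: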

    struct nmi_desc {
        spinlock_t lock;
        struct list_head head;
    };

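
To make the contrast with GList concrete, here is a minimal user-space sketch (not kernel code). struct list_head is copied from the definition above; struct job, its field names and describe_job_layout() are hypothetical and exist only for illustration. It shows that a node carries no data pointer and that one object can belong to two lists at once simply by embedding two nodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same layout as the kernel's struct list_head: links only, no payload. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical payload type: the list nodes live inside the object. */
struct job {
    int id;
    struct list_head pending;   /* membership in one list */
    struct list_head all;       /* membership in a second list */
};

/* Print where the two embedded nodes sit inside struct job. */
void describe_job_layout(void)
{
    /* A node is just two pointers; it knows nothing about struct job. */
    assert(sizeof(struct list_head) == 2 * sizeof(struct list_head *));

    printf("pending node at offset %zu, all node at offset %zu\n",
           offsetof(struct job, pending), offsetof(struct job, all));
}
```

Calling describe_job_layout() from main() prints the two offsets; the exact numbers depend on the platform's pointer size and padding.
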
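Let's look at an example to understand how list_head is used in the kernel. As noted above, linked lists are used in a great many places in the kernel. Take the miscellaneous character drivers. The misc character driver API in drivers/char/misc.c is used to write small drivers that handle simple hardware or virtual devices. These drivers share the same major number:

    #define MISC_MAJOR 10

but each has its own minor number. For example: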

    ls -l /dev | grep 10
    crw-------   1 root root     10, 235 Mar 21 12:01 autofs
    drwxr-xr-x  10 root root         200 Mar 21 12:01 cpu
    crw-------   1 root root     10,  62 Mar 21 12:01 cpu_dma_latency
    crw-------   1 root root     10, 203 Mar 21 12:01 cuse
    drwxr-xr-x   2 root root         100 Mar 21 12:01 dri
    crw-rw-rw-   1 root root     10, 229 Mar 21 12:01 fuse
    crw-------   1 root root     10, 228 Mar 21 12:01 hpet
    crw-------   1 root root     10, 183 Mar 21 12:01 hwrng
    crw-rw----+  1 root kvm      10, 232 Mar 21 12:01 kvm
    crw-rw----   1 root disk     10, 237 Mar 21 12:01 loop-control
    crw-------   1 root root     10, 227 Mar 21 12:01 mcelog
    crw-------   1 root root     10,  59 Mar 21 12:01 memory_bandwidth
    crw-------   1 root root     10,  61 Mar 21 12:01 network_latency
    crw-------   1 root root     10,  60 Mar 21 12:01 network_throughput
    crw-r-----   1 root kmem     10, 144 Mar 21 12:01 nvram
    brw-rw----   1 root disk      1,  10 Mar 21 12:01 ram10
    crw--w----   1 root tty       4,  10 Mar 21 12:01 tty10
    crw-rw----   1 root dialout   4,  74 Mar 21 12:01 ttyS10
    crw-------   1 root root     10,  63 Mar 21 12:01 vga_arbiter
    crw-------   1 root root     10, 137 Mar 21 12:01 vhci

Now let's see how this driver uses linked lists. First, look at the miscdevice structure:

    struct miscdevice
    {
        int minor;
        const char *name;
        const struct file_operations *fops;
        struct list_head list;
        struct device *parent;
        struct device *this_device;
        const char *nodename;
        mode_t mode;
    };

The fourth field of miscdevice, list, is the node that links the device into the list of registered devices. The list itself is defined near the top of the source file:

    static LIST_HEAD(misc_list);

This macro expands to the definition of a variable of type list_head:

    #define LIST_HEAD(name) \
        struct list_head name = LIST_HEAD_INIT(name)

and initializes it with the LIST_HEAD_INIT macro, which sets both the prev and next fields to the address of the variable name:

    #define LIST_HEAD_INIT(name) { &(name), &(name) }

Now look at misc_register, the function that registers a misc device. At the start it initializes miscdevice->list with the INIT_LIST_HEAD function:

    INIT_LIST_HEAD(&misc->list);

which does the same thing as the LIST_HEAD_INIT macro, only at run time:

    static inline void INIT_LIST_HEAD(struct list_head *list)
    {
        list->next = list;
        list->prev = list;
    }

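
The two initialization styles can be compared side by side in a small user-space sketch. The macros and INIT_LIST_HEAD are copied from the definitions quoted above; points_to_itself(), static_list and check_both_styles() are hypothetical names introduced for this example. The point is only that either style leaves an empty list whose head points back at itself.

```c
#include <assert.h>
#include <stdbool.h>

struct list_head {
    struct list_head *next, *prev;
};

/* User-space copies of the kernel definitions quoted above. */
#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

static inline void INIT_LIST_HEAD(struct list_head *list)
{
    list->next = list;
    list->prev = list;
}

/* Hypothetical helper: an empty list is a head that points at itself. */
static inline bool points_to_itself(const struct list_head *head)
{
    return head->next == head && head->prev == head;
}

/* Compile-time initialization of a file-scope list head. */
static LIST_HEAD(static_list);

/* Check that both styles produce the same empty list. */
void check_both_styles(void)
{
    struct list_head runtime_list;

    INIT_LIST_HEAD(&runtime_list);   /* run-time initialization */

    assert(points_to_itself(&static_list));
    assert(points_to_itself(&runtime_list));
}
```

The macro form suits list heads with static storage, while the function form is what you need for a node embedded in an object that is set up at run time, as in misc_register.
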
Next, after the device has been created with device_create, it is added to the list of misc devices with:

    list_add(&misc->list, &misc_list);

The kernel's list.h provides this API for adding a new entry to a list. Let's look at its implementation:

    static inline void list_add(struct list_head *new, struct list_head *head)
    {
        __list_add(new, head, head->next);
    }

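It simply calls the internal function __list_add with three arguments:
- new - the new entry;
- head - the entry after which the new entry will be inserted;
- head->next - the entry that followed head before the insertion.
The implementation of __list_add is very simple: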

    static inline void __list_add(struct list_head *new,
                                  struct list_head *prev,
                                  struct list_head *next)
    {
        next->prev = new;
        new->next = next;
        new->prev = prev;
        prev->next = new;
    }

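Here we insert the new entry between prev and next. So once the first device has been registered, the misc_list head that we defined at the start with the LIST_HEAD macro has both its next and prev pointers pointing at miscdevice->list.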
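
The pointer updates are easier to follow with a concrete run. The sketch below is a user-space re-implementation, not the kernel source: list_insert_between() is a hypothetical name standing in for __list_add (identifiers starting with a double underscore are reserved in ordinary C programs), and demo_list_add() is likewise invented for illustration.

```c
#include <assert.h>

struct list_head {
    struct list_head *next, *prev;
};

static inline void INIT_LIST_HEAD(struct list_head *list)
{
    list->next = list;
    list->prev = list;
}

/* Stand-in for the kernel's __list_add(): link new between prev and next. */
static inline void list_insert_between(struct list_head *new,
                                       struct list_head *prev,
                                       struct list_head *next)
{
    next->prev = new;
    new->next = next;
    new->prev = prev;
    prev->next = new;
}

/* Insert right after head, like the kernel's list_add(). */
static inline void list_add(struct list_head *new, struct list_head *head)
{
    list_insert_between(new, head, head->next);
}

/* Add two nodes to an empty list and check every link. */
void demo_list_add(void)
{
    struct list_head head, a, b;

    INIT_LIST_HEAD(&head);

    list_add(&a, &head);            /* ring: head <-> a */
    assert(head.next == &a && head.prev == &a);
    assert(a.next == &head && a.prev == &head);

    list_add(&b, &head);            /* ring: head <-> b <-> a */
    assert(head.next == &b && b.next == &a && a.next == &head);
    assert(head.prev == &a && a.prev == &b && b.prev == &head);
}
```

Because list_add always inserts directly after the head, the most recently added entry comes first when the list is walked forward.
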
One question remains: how do we get at the entries of the list? There is a special macro for this:

    #define list_entry(ptr, type, member) \
        container_of(ptr, type, member)

It takes three parameters:
- ptr - a pointer to the list_head structure;
- type - the type of the containing structure;
- member - the name of the list_head field within that structure.
For example:

    const struct miscdevice *p = list_entry(v, struct miscdevice, list);

After that we can access the fields of the miscdevice through p->minor, p->name, and so on. Let's look at the implementation of list_entry:

    #define list_entry(ptr, type, member) \
        container_of(ptr, type, member)

As we can see, it just calls the container_of macro with the same arguments. At first sight this macro looks strange:

    #define container_of(ptr, type, member) ({                      \
        const typeof( ((type *)0)->member ) *__mptr = (ptr);        \
        (type *)( (char *)__mptr - offsetof(type,member) );})

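First, note that the braces contain two expressions. This is a statement expression, a GCC extension: the compiler evaluates everything inside ({ ... }), and the whole construct takes the value of the last expression.

For example: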

    #include <stdio.h>

    int main() {
        int i = 0;
        printf("i = %d\n", ({++i; ++i;}));
        return 0;
    }

This prints i = 2.
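
One common reason macros are written this way is that a statement expression lets a macro evaluate each argument exactly once. The following sketch illustrates that under GNU C (GCC or Clang); MAX_ONCE, MAX_TWICE and demo_statement_expression() are hypothetical names, not kernel macros, and __typeof__ is the GNU spelling of the typeof operator used by container_of.

```c
#include <assert.h>

/* GNU C only: ({ ... }) is a statement expression, __typeof__ names a type. */
#define MAX_ONCE(a, b) ({          \
    __typeof__(a) _a = (a);        \
    __typeof__(b) _b = (b);        \
    _a > _b ? _a : _b; })

/* Plain macro for comparison: the larger argument is evaluated twice. */
#define MAX_TWICE(a, b) ((a) > (b) ? (a) : (b))

/* Compare how often each macro evaluates an argument with a side effect. */
void demo_statement_expression(void)
{
    int x = 5, y = 3;

    int m1 = MAX_ONCE(x++, y);     /* x++ runs once */
    assert(m1 == 5 && x == 6);

    x = 5;
    int m2 = MAX_TWICE(x++, y);    /* x++ runs in the test and in the result */
    assert(m2 == 6 && x == 7);
}
```
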
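The next piece is typeof, which is also simple. As the name suggests, it yields the type of the given expression. When I first saw the implementation of container_of, the strangest part for me was the zero in ((type *)0). The trick is that if a structure is imagined to start at address 0, the address of each of its fields equals that field's offset, so this null pointer gives a way to refer to a field, and to compute its offset, without having an actual object. Let's look at a simple example: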

    #include <stdio.h>

    struct s {
        int  field1;
        char field2;
        char field3;
    };

    int main() {
        /* %p expects a void *, so cast the member address explicitly */
        printf("%p\n", (void *)&((struct s*)0)->field3);
        return 0;
    }

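This prints 0x5.

The next macro, offsetof, calculates the offset of a given field from the beginning of the structure. Its implementation is very similar to the previous example: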

    #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)

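
In user-space code the same information is available from the standard offsetof macro in <stddef.h>, so there is no need to write the null-pointer expression by hand. A minimal sketch using the struct s from the example above follows; print_offsets() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>   /* standard offsetof */
#include <stdio.h>

struct s {
    int  field1;
    char field2;
    char field3;
};

/* Print the offset of every member of struct s. */
void print_offsets(void)
{
    printf("field1: %zu\n", offsetof(struct s, field1));
    printf("field2: %zu\n", offsetof(struct s, field2));
    printf("field3: %zu\n", offsetof(struct s, field3));

    /* char members need no alignment padding, so they follow field1 directly. */
    assert(offsetof(struct s, field2) == sizeof(int));
    assert(offsetof(struct s, field3) == sizeof(int) + 1);
}
```

On a platform where int is 4 bytes, field3 lands at offset 5, which matches the 0x5 printed by the earlier example.
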
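Let's sum up the container_of macro. Given the address of a list_head field inside a structure, the name of that field, and the type of the containing structure, it returns the address of the containing structure. The first line of the macro declares a pointer __mptr whose type is a pointer to the member's type and assigns ptr to it, so ptr and __mptr now point to the same address. Technically we do not need this line, but it provides type checking: it ensures that the structure (the type parameter) really has a member named member and that ptr has a compatible type. The second line uses the offsetof macro to compute the offset of the member from the start of the structure and subtracts that offset from the address of the member, which yields the address of the structure.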
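
The summary can be checked with a small user-space sketch, assuming GNU C for the statement expression and __typeof__. The macro body follows the one quoted above, except that the temporary is called member_ptr rather than __mptr; struct fake_miscdevice, device_from_node() and demo_container_of() are hypothetical names, with the struct being a cut-down stand-in for struct miscdevice.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* User-space copy of the macro discussed above (needs GNU C extensions). */
#define container_of(ptr, type, member) ({                         \
    const __typeof__(((type *)0)->member) *member_ptr = (ptr);     \
    (type *)((char *)member_ptr - offsetof(type, member)); })

/* Hypothetical stand-in for struct miscdevice, reduced to three fields. */
struct fake_miscdevice {
    int minor;
    const char *name;
    struct list_head list;
};

/* Given only the embedded node, recover the enclosing device. */
struct fake_miscdevice *device_from_node(struct list_head *node)
{
    return container_of(node, struct fake_miscdevice, list);
}

/* Round trip: object -> embedded node -> object. */
void demo_container_of(void)
{
    struct fake_miscdevice dev = { .minor = 42, .name = "demo" };
    struct list_head *node = &dev.list;   /* all that the list code sees */

    struct fake_miscdevice *p = device_from_node(node);
    assert(p == &dev);
    assert(p->minor == 42);
}
```
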
Of course, list_add and list_entry are not the only things <linux/list.h> provides. The doubly linked list implementation also offers the following API:
- list_add
- list_add_tail
- list_del
- list_replace
- list_move
- list_is_last
- list_empty
- list_cut_position
- list_splice
- list_for_each
- list_for_each_entry
and many more.
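
To give a feel for how a few of these fit together, here is a user-space sketch that re-implements simplified versions of list_add_tail, list_del and a list_for_each style loop. It is an illustration under stated assumptions, not the kernel source: the bodies are simplified (the kernel's list_del also poisons the removed entry's pointers), and struct number, sum_list() and demo_list_api() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

#define container_of(ptr, type, member) ({                         \
    const __typeof__(((type *)0)->member) *member_ptr = (ptr);     \
    (type *)((char *)member_ptr - offsetof(type, member)); })
#define list_entry(ptr, type, member) container_of(ptr, type, member)

static inline void INIT_LIST_HEAD(struct list_head *list)
{
    list->next = list;
    list->prev = list;
}

/* Insert just before head, i.e. at the tail of the list. */
static inline void list_add_tail(struct list_head *new, struct list_head *head)
{
    struct list_head *last = head->prev;

    new->next = head;
    new->prev = last;
    last->next = new;
    head->prev = new;
}

/* Unlink an entry by joining its two neighbours. */
static inline void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

/* Hypothetical payload type with an embedded node. */
struct number {
    int value;
    struct list_head node;
};

/* Walk from head->next until we are back at head, as list_for_each does. */
int sum_list(struct list_head *head)
{
    int sum = 0;

    for (struct list_head *pos = head->next; pos != head; pos = pos->next)
        sum += list_entry(pos, struct number, node)->value;
    return sum;
}

/* Build a three-entry list, delete the middle entry, return the new sum. */
int demo_list_api(void)
{
    struct list_head head;
    struct number a = { .value = 1 }, b = { .value = 2 }, c = { .value = 3 };

    INIT_LIST_HEAD(&head);
    list_add_tail(&a.node, &head);
    list_add_tail(&b.node, &head);
    list_add_tail(&c.node, &head);   /* order: a, b, c */
    assert(sum_list(&head) == 6);

    list_del(&b.node);               /* order: a, c */
    assert(head.next == &a.node && a.node.next == &c.node);
    return sum_list(&head);          /* 1 + 3 */
}
```
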
via: https://github.com/0xAX/linux-insides/blob/master/DataStructures/dlist.md