Why Linux hard links cannot point to directories

playlinuxxx

2013-07-22

This is just a bad idea, as there is no way to tell the difference between a hard link and the original name.

Allowing hard links to directories would break the directed acyclic graph structure of the filesystem, possibly creating directory loops and dangling directory subtrees, which would make fsck and any other file tree walkers error-prone.

First, to understand this, let's talk about inodes. The data in the filesystem is held in blocks on the disk, and those blocks are collected together by an inode. You can think of the inode as THE file. Inodes lack filenames though. That's where links come in.

A link is just a pointer to an inode. A directory is an inode that holds links. Each filename in a directory is just a link to an inode. Opening a file in UNIX also creates a link, but it's a different type of link (it's not a named link).

A hard link is just an extra directory entry pointing to that inode. When you run ls -l, the number after the permissions is the named link count. Most regular files will have one link. Creating a new hard link to a file will make both filenames point to the same inode. Note:

% ls -l test
ls: test: No such file or directory
% touch test
% ls -l test
-rw-r--r--  1 danny  staff  0 Oct 13 17:58 test
% ln test test2
% ls -l test*
-rw-r--r--  2 danny  staff  0 Oct 13 17:58 test
-rw-r--r--  2 danny  staff  0 Oct 13 17:58 test2
% touch test3
% ls -l test*
-rw-r--r--  2 danny  staff  0 Oct 13 17:58 test
-rw-r--r--  2 danny  staff  0 Oct 13 17:58 test2
-rw-r--r--  1 danny  staff  0 Oct 13 17:59 test3
            ^
            ^ this is the link count

Now, you can clearly see that there is no such thing as a hard link. A hard link is the same as a regular name. In the above example, test or test2, which is the original file and which is the hard link? By the end, you can't really tell (ignoring timestamps) because both names point to the same contents, the same inode:

% ls -li test*  
14445750 -rw-r--r--  2 danny  staff  0 Oct 13 17:58 test
14445750 -rw-r--r--  2 danny  staff  0 Oct 13 17:58 test2
14445892 -rw-r--r--  1 danny  staff  0 Oct 13 17:59 test3

The -i flag to ls shows you inode numbers at the beginning of the line. Note how test and test2 have the same inode number.
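The same experiment can be repeated from a program. Here is a minimal Python sketch, assuming a filesystem that supports hard links; the helper name hard_link_demo is made up for this illustration.

```python
import os
import tempfile

def hard_link_demo():
    """Create a file plus a second name for it; return both stat results."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "test")
        second = os.path.join(tmp, "test2")
        open(first, "w").close()   # like: touch test
        os.link(first, second)     # like: ln test test2
        return os.stat(first), os.stat(second)

a, b = hard_link_demo()
print(a.st_ino == b.st_ino)     # do both names resolve to the same inode?
print(a.st_nlink, b.st_nlink)   # named link count, seen through each name
```

Nothing in either stat result says which name came first, which is the whole point.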

Now, if you were allowed to do this for directories, two different directories in different points in the filesystem could point to the same thing. In fact, a subdir could point back to its grandparent, creating a loop.
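You can check that the system really does refuse this. The sketch below simply asks for a hard link to a directory and reports what happens. On a typical Linux system I would expect the call to fail with a permission error, but the exact error is an assumption about your platform, so the code catches any OSError. The helper name is hypothetical.

```python
import os
import tempfile

def try_directory_hard_link():
    """Attempt `ln dir dir2`; return the error, or None if it worked."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "dir")
        os.mkdir(target)
        try:
            os.link(target, os.path.join(tmp, "dir2"))
        except OSError as err:
            return err
        return None

err = try_directory_hard_link()
print(type(err).__name__ if err else "the link call succeeded")
```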

Why is this loop a concern? Because when you are traversing, there is no way to detect that you are looping without keeping track of inode numbers as you go. Imagine you are writing the du command, which needs to recurse through subdirs to find out about disk usage. How would du know when it hit a loop? It would have to do a lot of error-prone bookkeeping just to pull off this simple task.
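To give a feel for that bookkeeping, here is a rough Python sketch of a walker that remembers every (device, inode) pair it has visited. Since real directory hard links cannot be created, the demo fakes the loop with a symlink to a grandparent and deliberately follows symlinks. The function name and the tree layout are assumptions made for this example.

```python
import os
import tempfile

def walk_without_looping(root):
    """Yield each directory under root once, following symlinks to directories."""
    seen = set()                          # (st_dev, st_ino) pairs already visited
    stack = [root]
    while stack:
        path = stack.pop()
        st = os.stat(path)                # stat follows symlinks on purpose here
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue                      # same directory reached again: a loop
        seen.add(key)
        yield path
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=True):
                    stack.append(entry.path)

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b"))
    # a/b/back points at its grandparent a, which closes the loop
    os.symlink(os.path.join(tmp, "a"), os.path.join(tmp, "a", "b", "back"))
    visited = list(walk_without_looping(tmp))

print(len(visited))   # number of distinct directories found
```

Without the seen set, this walker would keep descending through back until the path became too long or the kernel gave up on the symlink chain.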

Symlinks are a whole different beast, in that they are a special type of "file" that many filesystem APIs tend to automatically follow. Note, a symlink can point to a nonexistent destination, because it points by name, and not directly to an inode. That concept doesn't make sense with hard links, because the mere existence of a "hard link" means the file exists.
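A dangling symlink is easy to produce. This short Python sketch creates one and shows that the link exists while its target does not; the target name no-such-file is made up for the example.

```python
import os
import tempfile

def dangling_symlink_demo():
    """Create a symlink whose target does not exist and inspect it."""
    with tempfile.TemporaryDirectory() as tmp:
        link = os.path.join(tmp, "dangling")
        os.symlink("no-such-file", link)   # the target is stored as text, never checked
        return {
            "readlink": os.readlink(link),     # the stored name
            "lexists": os.path.lexists(link),  # looks at the link itself
            "exists": os.path.exists(link),    # follows the link to the target
        }

result = dangling_symlink_demo()
print(result)
```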

So why can du deal with symlinks easily and not hard links? We saw above that hard links are indistinguishable from normal directory entries. Symlinks, however, are special, detectable, and skippable! du notices that the symlink is a symlink and does not follow it, counting only the small link itself:

% ls -l 
total 4
drwxr-xr-x  3 danny  staff  102 Oct 13 18:14 test1/
lrwxr-xr-x  1 danny  staff    5 Oct 13 18:13 test2@ -> test1
% du -ah
242M    ./test1/bigfile
242M    ./test1
4.0K    ./test2
242M    .
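The skipping can be sketched in a few lines of Python. This is not how du is actually implemented, only an illustration of the idea: call lstat() on every entry and never descend through a symlink. Unlike du, it counts apparent file sizes rather than disk blocks and counts nothing for the link itself. The function name and the 1000-byte file are assumptions for the demo.

```python
import os
import stat
import tempfile

def regular_file_bytes(path):
    """Sum the sizes of regular files under path, never following symlinks."""
    st = os.lstat(path)                   # lstat describes the link, not its target
    if stat.S_ISLNK(st.st_mode):
        return 0                          # detectable, so skip it
    if stat.S_ISDIR(st.st_mode):
        return sum(regular_file_bytes(os.path.join(path, name))
                   for name in os.listdir(path))
    return st.st_size if stat.S_ISREG(st.st_mode) else 0

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "test1"))
    with open(os.path.join(tmp, "test1", "bigfile"), "wb") as f:
        f.write(b"x" * 1000)
    os.symlink("test1", os.path.join(tmp, "test2"))   # test2 -> test1
    total = regular_file_bytes(tmp)

print(total)   # bigfile is reachable by two paths but counted once
```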

With the exception of mount points, each directory has one and only one parent: '..'.

One way to do pwd is to check the device:inode for '.' and '..'. If they are the same, you have reached the root of the file system. Otherwise, find the name of the current directory in the parent, push that on a stack, and start comparing '../.' with '../..', then '../../.' with '../../..', etc. Once you've hit the root, start popping and printing the names from the stack. This algorithm relies on the fact that each directory has one and only one parent.
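Here is a rough Python sketch of that algorithm, assuming every directory on the way up is readable. The function name pwd_by_inodes is made up, and a real pwd is more careful than this.

```python
import os
import tempfile

def pwd_by_inodes(start="."):
    """Rebuild an absolute path by walking '..' and matching device:inode pairs."""
    names = []                                    # the stack of directory names
    path = start
    while True:
        here = os.stat(path)
        parent_path = os.path.join(path, "..")
        parent = os.stat(parent_path)
        if (here.st_dev, here.st_ino) == (parent.st_dev, parent.st_ino):
            break                                 # '.' and '..' match: this is the root
        for name in os.listdir(parent_path):
            try:
                st = os.lstat(os.path.join(parent_path, name))
            except OSError:
                continue                          # entry vanished or is unreadable
            if (st.st_dev, st.st_ino) == (here.st_dev, here.st_ino):
                names.append(name)                # found our own name in the parent
                break
        else:
            raise FileNotFoundError("no entry for the current directory in its parent")
        path = parent_path
    return "/" + "/".join(reversed(names))

with tempfile.TemporaryDirectory() as tmp:
    nested = os.path.join(os.path.realpath(tmp), "a", "b")
    os.makedirs(nested)
    recovered = pwd_by_inodes(nested)

print(recovered == nested)
```

The inner loop stops at the first matching entry, which is only safe because a directory has exactly one name in exactly one parent.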

If hard links to directories were allowed, which one of the multiple parents should .. point to? That is one compelling reason why hard links to directories are not allowed.

Symlinks to directories don't cause that problem. If a program wants to, it could do an lstat() on each part of the pathname and detect when a symlink is encountered. The pwd algorithm will return the true absolute pathname for a target directory. The fact that there is a piece of text somewhere (the symlink) that points to the target directory is pretty much irrelevant. The existence of such a symlink does not create a loop in the graph.
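The lstat() check on each part of a pathname might look something like this in Python. It is a sketch that only handles absolute paths, and the function name symlink_components is invented for the example.

```python
import os
import stat
import tempfile

def symlink_components(path):
    """Return every prefix of an absolute path that lstat() reports as a symlink."""
    found = []
    current = os.sep
    for part in path.split(os.sep):
        if not part:
            continue                      # skip the empty piece before the leading '/'
        current = os.path.join(current, part)
        if stat.S_ISLNK(os.lstat(current).st_mode):
            found.append(current)
    return found

with tempfile.TemporaryDirectory() as tmp:
    base = os.path.realpath(tmp)          # make sure the prefix itself has no symlinks
    os.mkdir(os.path.join(base, "real"))
    open(os.path.join(base, "real", "file"), "w").close()
    link = os.path.join(base, "link")
    os.symlink("real", link)              # link -> real
    found = symlink_components(os.path.join(link, "file"))

print(found == [link])
```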
