In the linux_dirent64 structs written in the Linux syscall getdents64, why is d_off not the sum of the d_reclens of all earlier entries?

According to the man page of getdents:

d_off is the distance from the start of the directory to the start of the next linux_dirent. d_reclen is the size of this entire linux_dirent.

So I would expect that if the first entry has d_reclen n, its d_off would also be n (and for the i-th entry, d_off would be the sum of the d_reclens of all entries from 0 to i, inclusive).

However, in that same man page, a nicely printed table with the entries of an example directory looks like this:

--------------- nread=120 ---------------
inode#    file type  d_reclen  d_off   d_name
       2  directory    16         12  .
       2  directory    16         24  ..
      11  directory    24         44  lost+found
      12  regular      16         56  a
  228929  directory    16         68  sub
   16353  directory    16         80  sub2
  130817  directory    16       4096  sub3

The d_off fields of the entries do not seem to follow the rule I expected. If the first entry has size 16, surely the offset from the start to the second entry would be 16, but apparently it is actually 12.

So what don't I understand about the d_off field of linux_dirent64?

It's explained vaguely in the manual page, but as you can probably see by compiling and running the example program, your assumption does not hold.

The manual page for readdir(3) gives a bit more insight:

d_off  The value returned in d_off is the same as would be returned by
       calling telldir(3) at the current position in the directory
       stream.  Be aware that despite its type and name, the d_off field
       is seldom any kind of directory offset on modern filesystems.
       Applications should treat this field as an opaque value, making no
       assumptions about its contents; see also telldir(3).

The key part is "the d_off field is seldom any kind of directory offset on modern filesystems". The d_off field is a value for internal use by the underlying filesystem, and its meaning is implementation-specific. It does not necessarily have any correlation with d_reclen, nor does it need to represent an actual "offset" in memory. Whatever software you write, you should not rely on the value of d_off; treat it as an opaque identifier.

There may be filesystems where d_off corresponds to an actual offset in bytes between dirents, but this is in general not the case. The field is used more or less like a unique "counter" or "cookie" value to distinguish files inside a directory.

In fact, if you take a look at the values on a Btrfs filesystem, d_off seems to start at 1 for . and 2 for .., increasing by one for each following dirent, with the last one having d_off equal to INT32_MAX. This holds at least for a directory of freshly created files; the values change after deleting, moving, or creating more files.

$ mkdir test
$ cd test
$ touch a b c d e f
$ ls -l
total 0
-rw-r----- 1 marco marco 0 gen 15 01:20 a
-rw-r----- 1 marco marco 0 gen 15 01:20 b
-rw-r----- 1 marco marco 0 gen 15 01:20 c
-rw-r----- 1 marco marco 0 gen 15 01:20 d
-rw-r----- 1 marco marco 0 gen 15 01:20 e
-rw-r----- 1 marco marco 0 gen 15 01:20 f

$ ../test_program
--------------- nread=192 ---------------
inode#    file type  d_reclen  d_off   d_name
46206659  directory    24          1  .
  214242  directory    24          2  ..
46206662  regular      24          3  a
46206663  regular      24          4  b
46206664  regular      24          5  c
46206665  regular      24          6  d
46206666  regular      24          7  e
46206667  regular      24 2147483647  f

This 2004 Sourceware bug report for Glibc by Dan Tsafrir also contains some insightful explanations about d_off, such as:

  • In the implementation of getdents(), the d_off field (belonging to the linux kernel's dirent structure) is falsely assumed to contain the byte offset to the next dirent. Note that the linux manual of the readdir system-call states that d_off is the "offset to this dirent" while glibc's getdents treats it as the offset to the next dirent.

  • In practice, both of the above are wrong/misleading. The d_off field may contain illegal negative values, 0 (should also never happen as the "next" dirent's offset must always be bigger then 0), or positive values that are bigger than the size of the directory-file itself:

    • We're not sure what the Linux kernel intended to place in this field, but our experience shows that on "real" file systems (that actually reside on some disk) the offset seems to be a simple (not necessarily continuous) counter: eg first entry may have d_off=1, second: d_off=2, third: d_off=4096, fourth= d_off=4097 etc. We conjecture this is the serial of the dirent record within the directory (and so, this is indeed the "offset", but counted in records out of which some were already removed).

    • For file systems that are maintained by the amd automounter (automount, directories) the d_off seems to be arbitrary (and may be negative, zero or beyond the scope of a 32bit integer). We conjecture the amd doesn't assign this field and the received values are simply garbage.
