
Why doesn't the readdir() system call work the way it should (unexpected output)?

I am writing a C program like this:

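#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void printdir (char*);

int main () {
    printf ("Directory scan of /home: \n");
    printdir ("/home/fahad/");
    exit (0);
}

/* Print every entry of dir in the order readdir() returns them. */
void printdir (char *dir) {
    struct dirent *entry;
    DIR *dp = opendir (dir);

    if (dp == NULL) {
        fprintf (stderr, "Cannot open dir:%s\n", dir);
        return;
    }

    /* Not needed by readdir() itself; it only matters if relative paths are used later. */
    chdir (dir);
    while ((entry = readdir(dp)) != NULL)
        printf ("%s\n",entry -> d_name);
    closedir (dp);
}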

Interestingly, the output is not what I expected. Whenever a directory is created in UNIX, the first two entries created inside it are . and .. So basically their inode numbers should be less than those of the directory entries created later through mkdir () or open () (for directories and files respectively).

My question is: in what order does the readdir () system call read the directory entries? I don't get . and .. as the first two entries. Why is that so?

Try skipping the "." and ".." entries, as follows:

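/* Fragment: dname (directory path) and pathname (output buffer) are assumed to be declared by the caller. */
DIR* dirp;
struct dirent  *dp=NULL;
char* fname;
if( !(dirp=opendir(dname)) ) {
    int ec=errno;
    printf("completed:-1:cannot opendir %s (%d)\n",dname,ec);
    return(-1);
}
while ((dp = readdir(dirp)) != NULL) {
    /* skip the self and parent entries wherever they appear */
    if( strcmp(dp->d_name,".")==0 ) continue;
    if( strcmp(dp->d_name,"..")==0 ) continue;
    fname=dp->d_name;
    sprintf(pathname,"%s/%s",dname,fname);  /* full path of this entry */
}
closedir(dirp);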

See this answer, which notes that because the order is not specified to be predictable, one should not assume any order. The code above is a sample of how to handle (skip) these entries in the typical use case of traversing a directory hierarchy. The order is probably based on the order in which the entries are stored in the directory itself.

readdir() doesn't return entries in any particular order. As others mentioned, the order depends on the particular file system in question.

For example, the Berkeley UFS file system uses an unsorted linked list. See the description of the direct structure on page 744 of http://ptgmedia.pearsoncmg.com/images/0131482092/samplechapter/mcdougall_ch15.pdf . The binary content of a directory consists of a stream of variable-length records, each of which contains the inode number, the record length, the string length (of the filename) and the string data itself. readdir() works by walking the linked list (using the record length to find where each record begins relative to the previous one) and returning whatever it finds.

The list of records is not typically optimized, so filenames appear on the list (more or less) in the order the files were created. But not quite, because holes (left behind by deleted files) are reused for new filenames that are small enough to fit.

Now, not all file systems represent directories the way UFS does. A file system that keeps directory data in a binary tree may choose to implement readdir() as an in-order traversal of that tree, which would present files sorted by whatever attribute it uses as the key for the tree. Or it might use a pre-order traversal, which would not return the records in sorted order.

Because applications cannot know how the file system is implemented (and each mounted volume can potentially use a different file system), applications should never assume anything about the order of entries that readdir() returns. If they require the entries to be sorted, they must read the entire directory into memory and do their own sorting.

This is why, for example, the ls command can take a long time to display output when run against a large directory. It needs to sort the entire list of names (and determine the longest name, in order to compute the column width) before it can display any output. This is also why ls -1U (disable sorting and display in one column) produces output immediately on such directories.

"First two entries are created inside this directory, one is . and the other is .. So basically their inode numbers should be less than the directory entries created through mkdir () or open () (for directory and file respectively)."

Yes, your understanding of the inode numbers matches what I see on my system, though nothing guarantees it in general, since inode allocation is specific to each file system. To check this, we can write a simple C++ program that stores each inode/name pair in a std::map.

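// entry and dp are the variables from printdir() in the question
std::map<ino_t, std::string>  entries;   // std::map keeps its keys (inode numbers) sorted
std::pair<ino_t, std::string> en;
while ((entry = readdir(dp)) != NULL) {
    en.first = entry->d_ino;
    en.second = entry->d_name;
    entries.insert(en);
    printf ("%s\n",entry -> d_name);     // readdir() order, for comparison
}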


  "entries in GDB"
  ================
  [5114862] = "..",
  [5114987] = ".",
  [5115243] = "taop",
  [5115623] = "c++11_study",
  [5115651] = "volume-3",
  [5115884] = "gtkmm",
  [5116513] = "basic",
  [5116733] = "program",
  [5116794] = "bakwas",
  [5116813] = "a.out",
  [5116818] = "foo",

This confirms the ordering of the inode numbers here: "." and ".." have lower inode numbers than the other directory and file entries.

"My question is: in what order does the readdir () system call read the directory entries? I don't get . and .. as the first two entries. Why is that so?"

From the book "Advanced Programming in the UNIX® Environment" by W. Richard Stevens, we get the following:

"The opendir function initializes things so that the first readdir reads the first entry in the directory. The ordering of entries within the directory is implementation dependent and is usually not alphabetical."

So the order is not defined. For the program above, readdir() returned the entries in the following order:

Output from readdir()
=====================
c++11_study
taop
volume-3
basic
.
gtkmm
foo
program
a.out
..
bakwas

Note: technical posts on this site are published under the CC BY-SA 4.0 license. If you reproduce this post, please cite this site's URL or the original address.

 