What happens in C function readdir() from dirent.h?

I'm doing a school assignment, and the task at hand is to count files and folders recursively. I use the readdir() function, which seems to iterate through the directory I give it.

int listdir(const char *path) 
{
  struct dirent *entry;
  DIR *dp;

  dp = opendir(path);
  if (dp == NULL) 
  {
    perror("opendir");
    return -1;
  }

  while((entry = readdir(dp)))
    puts(entry->d_name);

  closedir(dp);
  return 0;
}

I want to see the "something++;" step of this function; there should be one, right? All I can find is this line in glibc's dirent/dirent.h:

extern struct dirent *readdir (DIR *__dirp) __nonnull ((1)); 

and


struct dirent *
__readdir (DIR *dirp)
{
  __set_errno (ENOSYS);
  return NULL;
}
weak_alias (__readdir, readdir)

in dirent/readdir.c.

Where does the iteration happen?

Maybe a duplicate of "How readdir function is working inside while loop in C?"

I tried to grep through the glibc source code for readdir but didn't find the implementation. Searching the Internet didn't help either, although some say there is also an obsolete Linux system call named readdir.

There is also this:

"The readdir() function returns a pointer to a dirent structure representing the next directory entry in the directory stream pointed to by dirp. It returns NULL on reaching the end of the directory stream or if an error occurred."

and this:

"The order in which filenames are read by successive calls to readdir() depends on the filesystem implementation; it is unlikely that the names will be sorted in any fashion."

in man readdir.

From this answer (https://stackoverflow.com/a/9344137/12847376) I assume the OS can hijack functions with LD_PRELOAD, but I see no such variable in my default shell, and there are too many hits in the Debian source search.

I also grepped through the Linux kernel for LD_PRELOAD and readdir, and got too many results for the syscall.

I'm not sure exactly what you are trying to accomplish. I have implemented something similar for another language's core library, so I can say there is no ++something. The reason is that the structures returned by the operating system do not have a consistent size. The structure is something like the following:

struct dirent {
    long           d_ino;
    off_t          d_off;
    unsigned short d_reclen;
    char           d_type;
    char           d_name[];
};

You pass a buffer to the system call (I used getdents64), and it fills the buffer with a bunch of these dirent structures. The d_name[] member does not have an officially known size. The size of each entire structure is given by its d_reclen member.

In memory, you could have many struct dirent records like this:

[0]                    [1]                                           [2]
44,0,24,DT_REG,"a.txt",41,0,47,DT_DIR,"a_really_long_directory_name",...

Here is a rough translation of how it works:这是它如何工作的粗略翻译:

uint8_t buf[BUFLEN];
long n = getdents64(dfd, buf, BUFLEN);
if (n < 0) {
    // error
}

// buf now holds n bytes of variable-length dirent records

struct dirent *d;
for (long i = 0; i < n; i += d->d_reclen) { // <<<< this is the trick
    d = (struct dirent *)&buf[i];
    // do something with d
}

Notice the way we increment i. Since the d_name member does not have an official size, we cannot just say struct dirent d[COUNT]; we don't know how big each struct will be.

Where does the iteration happen?

On Linux, it happens in glibc's Linux-specific readdir implementation, quoted below. As you can see, the code repeatedly calls getdents (a system call) to obtain a set of entries from the kernel, and "advances" the dp by updating dirp->offset, etc.

/* Read a directory entry from DIRP.  */
struct dirent *
__readdir_unlocked (DIR *dirp)
{
  struct dirent *dp;
  int saved_errno = errno;

  if (dirp->offset >= dirp->size)
    {
      /* We've emptied out our buffer.  Refill it.  */

      size_t maxread = dirp->allocation;
      ssize_t bytes;

      bytes = __getdents (dirp->fd, dirp->data, maxread);
      if (bytes <= 0)
        {
          /* Linux may fail with ENOENT on some file systems if the
             directory inode is marked as dead (deleted).  POSIX
             treats this as a regular end-of-directory condition, so
             do not set errno in that case, to indicate success.  */
          if (bytes == 0 || errno == ENOENT)
            __set_errno (saved_errno);
          return NULL;
        }
      dirp->size = (size_t) bytes;

      /* Reset the offset into the buffer.  */
      dirp->offset = 0;
    }

  dp = (struct dirent *) &dirp->data[dirp->offset];
  dirp->offset += dp->d_reclen;
  dirp->filepos = dp->d_off;

  return dp;
}
