
The fastest way to count the number of files in a directory (including subdirectories)

I'm running a script that looks at all the files in a directory and its subdirectories.

The script has been running for a day, and I'd like to estimate how much longer it will run. I know how many files it has processed so far (73,000,000), but I don't know the total number of files.

What is the fastest way to count the files?

I tried right-clicking on the directory and selecting "Properties", and it's slowly counting up. I tried redirecting ls into a file, and it just keeps churning...

Should I write a program in C?

The simplest way:

find <dir> -type f | wc -l

Slightly faster, perhaps (-printf is specific to GNU find):

find <dir> -type f -printf '\n' | wc -l
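Once a total is available, the remaining run time follows from the rate so far. The following is a minimal sketch, assuming the script processes files at a roughly constant rate. eta_minutes is a hypothetical helper, and the demo tree and the "1 file in 10 minutes" figure are invented for illustration only:

```shell
#!/bin/sh
# Hypothetical helper (not from the original answers).
# usage: eta_minutes TOTAL PROCESSED ELAPSED_MINUTES
# Prints the estimated minutes left, assuming a constant processing rate.
eta_minutes() {
    echo $(( ($1 - $2) * $3 / $2 ))
}

# Throwaway tree so the sketch runs anywhere
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
touch "$demo_dir/a" "$demo_dir/b" "$demo_dir/sub/c" "$demo_dir/sub/d"

# Count regular files; tr strips the padding some wc versions add
total=$(find "$demo_dir" -type f | wc -l | tr -d ' ')

# Invented figures: 1 file processed in 10 minutes
remaining=$(eta_minutes "$total" 1 10)
echo "$total files, about $remaining minutes left"

rm -rf "$demo_dir"
```

With the figures from the question, the call would be eta_minutes TOTAL 73000000 1440, where TOTAL is whatever the count returns.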

I did a quick test. Using a directory with 100,000 files, I compared the following commands:

ls -R <dir>
ls -lR <dir>
find <dir> -type f

I ran each one twice, once redirecting into a file (>file) and once piping into wc (|wc -l). Here are the run times in seconds:

        >file   |wc
ls -R     14     14
find      89     56
ls -lR    91     82

The difference between >file and |wc -l is smaller than the difference between ls and find.

It appears that ls -R is at least 4x faster than find.
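A comparison like this can be repeated on your own machine. The sketch below is not the original benchmark: it builds a small throwaway tree (200 files rather than 100,000), and timed is a hypothetical helper that reports whole seconds only, so use a much larger tree for meaningful numbers. Results may also depend on whether the filesystem cache is already warm, so whichever command runs first can look slower than it is.

```shell
#!/bin/sh
# Build a throwaway tree: 200 empty files spread over 10 subdirectories
bench_dir=$(mktemp -d)
i=1
while [ "$i" -le 200 ]; do
    mkdir -p "$bench_dir/d$((i % 10))"
    : > "$bench_dir/d$((i % 10))/f$i"
    i=$((i + 1))
done

# timed LABEL CMD...: run CMD, discard its output, print elapsed whole seconds
timed() {
    label=$1
    shift
    start=$(date +%s)
    "$@" > /dev/null
    end=$(date +%s)
    echo "$label: $((end - start))s"
}

timed "ls -R"  ls -R  "$bench_dir"
timed "ls -lR" ls -lR "$bench_dir"
timed "find"   find "$bench_dir" -type f

# Record how many files were created, then clean up
created=$(find "$bench_dir" -type f | wc -l | tr -d ' ')
rm -rf "$bench_dir"
```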

The fastest I know of:

ls | wc -l

Note: this counts every visible entry in the directory, subdirectories included, not just regular files. Plain ls skips names that begin with a dot, including the references to the current and parent directory (. and ..); those appear only with ls -a.

If you need a recursive count that covers all subdirectories (rather than just the entries directly inside the current directory), add the recursive flag to ls:

ls -R | wc -l

If you compare the speed of this against the find suggestion, you will see that it is much faster (a factor of 2 to 10), but keep in mind the note above. ls -R also prints a header line for each directory and blank lines between them, so the line count is higher than the number of files.
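To see how far the ls -R line count can drift from the number of regular files, here is a small sketch on a throwaway tree. The directory layout and file names are invented for illustration. The exact ls -R figure can differ between ls implementations, so the sketch only prints both numbers side by side.

```shell
#!/bin/sh
# Throwaway tree: three visible files, one hidden file, two subdirectories
tree=$(mktemp -d)
mkdir -p "$tree/a/b"
touch "$tree/f1" "$tree/a/f2" "$tree/a/b/f3" "$tree/.hidden"

# Regular files only, hidden ones included
find_count=$(find "$tree" -type f | wc -l | tr -d ' ')

# Every output line of ls -R: files, directories, headers, blank lines
ls_count=$(ls -R "$tree" | wc -l | tr -d ' ')

echo "find -type f: $find_count"
echo "ls -R:        $ls_count"

rm -rf "$tree"
```

For a rough progress estimate the inflated figure may be good enough; for an exact count, the find variants are the safer choice.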
