What is the fastest way to get the list of files recursively contained in a directory?
I have a directory that contains millions of files spread out in a hierarchy of folders. This directory is stored on a large remote NFS filesystem. I'd like to retrieve the list of these files as fast as possible.
Is it possible to go faster than

find . > list.txt

? What factors affect speed? I'm using Python, but any solution will do as long as it's fast.
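Since the question mentions Python: on a slow remote filesystem, an iterative traversal built on `os.scandir` tends to beat `os.listdir`-based approaches, because `scandir` exposes the file type already returned by `readdir()` and so can skip a separate `stat()` round trip per entry. A minimal sketch (the function name `list_files` is mine, not from the question):

```python
import os

def list_files(root):
    """Yield the path of every file under root, iteratively.

    os.scandir reuses the d_type information from readdir(),
    avoiding one stat() call per entry on most filesystems --
    a significant saving over NFS.
    """
    stack = [root]
    while stack:
        path = stack.pop()
        with os.scandir(path) as it:
            for entry in it:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                else:
                    yield entry.path

if __name__ == "__main__":
    for p in list_files("."):
        print(p)
```

The explicit stack avoids recursion-depth limits on very deep hierarchies; `follow_symlinks=False` keeps the walk from looping through symlinked directories.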
On Linux, this was the fastest for me. Use (bash) globbing and printf like this:
printf "%s\n" directory/**/file
printf "%s\x00" directory/**/filename-with-special-characters | xargs -0 command
Seems to be a lot faster than
find directory -name file
or
ls -1R directory | grep file
or even, surprisingly,
ls directory/**/file
This was a local filesystem, though: an x86_64 system with an ext4 filesystem on an SSD, in a directory structure of over 600,000 directories, each containing multiple files.
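Note that the `**` pattern above needs `shopt -s globstar` in bash. From Python, the equivalent recursive match can be written with the standard `glob` module (a sketch; `find_named` is an illustrative helper name, not from the answer):

```python
import glob
import os

def find_named(root, name):
    # recursive=True lets "**" match any number of directory
    # levels, including zero (Python 3.5+), mirroring bash's
    # globstar behaviour
    return glob.glob(os.path.join(root, "**", name), recursive=True)
```

Like the bash version, this expands the whole match list in memory, so for millions of files `glob.iglob` (the lazy iterator variant) may be preferable.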
Depending on what you want in the output, I recommend using

ls -R | grep ":$" | sed -e 's/:$//' -e 's/^/ /' -e 's/-/|/'

for the complete path to all files recursively in the current directory.
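If the goal is simply the complete path of every file under the current directory, the Python standard-library way is `os.walk` — a sketch, where `full_paths` is an illustrative name:

```python
import os

def full_paths(root):
    # os.walk visits every directory under root; joining
    # dirpath with each filename yields the complete path
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return paths
```

Unlike the `ls -R | sed` pipeline, this keeps each path as one ready-to-use string with no post-processing needed.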