
Recursively find all files that match a certain pattern

I need to find (or more specifically, count) all files that match this pattern:

*/foo/*.doc

Here the first wildcard asterisk stands for a variable number of subdirectories.

With GNU find you can use -regex, which (unlike -name) matches against the entire path:

find . -regex '.*/foo/[^/]*\.doc'

To just count the number of files:

find . -regex '.*/foo/[^/]*\.doc' -printf '%i\n' | wc -l

(The %i format code makes find print each file's inode number instead of its name; unlike a filename, an inode number cannot contain characters such as a newline, so the count is more reliable. Thanks to @tripleee for the suggestion.)

I don't know whether that will work on OS X, though.
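If -printf is not available, a possible workaround is to count NUL terminators instead of lines. This is only a sketch: the function name count_foo_docs is made up for illustration, and it assumes a find that supports both -regex and -print0 (GNU find and BSD find do).

```shell
# Hypothetical helper: count regular files named *.doc that sit directly
# inside any directory called "foo" below the given base (default: .).
count_foo_docs() {
    # -print0 ends every path with a NUL byte. A NUL cannot occur inside a
    # path, so the number of NULs equals the number of files, even when a
    # name contains a newline.
    find "${1:-.}" -type f -regex '.*/foo/[^/]*\.doc' -print0 |
        tr -cd '\0' |
        wc -c
}
```

Calling count_foo_docs . prints the count for the current directory.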

How about:

find BASE_OF_SEARCH/*/foo -name '*.doc' -type f | wc -l

What this is doing:

  • start at directory BASE_OF_SEARCH/
  • look in every foo directory that sits exactly one level below it (the shell expands */foo before find runs)
  • look for files named like *.doc
  • count the lines of the result (one per file)

The benefit of this method:

  • no explicit recursion or iteration in the script (no loops)
  • it's easy to read, and if you include it in a script it's fairly easy to decipher (a regex sometimes is not).
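As a rough illustration of what the glob in this command does, the sketch below wraps it in a function (count_docs_one_level is a hypothetical name, not part of the answer). It finds foo directories only at a fixed depth of one level below the base, while find itself still descends into anything beneath each foo.

```shell
# Hypothetical helper: count *.doc files under every "foo" directory that
# sits exactly one level below the base directory given as $1.
count_docs_one_level() {
    # The shell expands "$1"/*/foo before find starts. If nothing matches,
    # the pattern is passed through literally and find complains, so the
    # error message is discarded and the count is 0.
    find "$1"/*/foo -type f -name '*.doc' 2>/dev/null | wc -l
}
```

With a tree containing a/foo/x.doc, a/foo/sub/y.doc and a/b/foo/z.doc, this counts the first two and misses the third, because a/b/foo is two levels down.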

UPDATE: you want variable depth? OK:

find BASE_OF_SEARCH -name '*.doc' -type f | grep foo | wc -l

  • start at directory BASE_OF_SEARCH
  • look for files named like *.doc
  • only show the lines of this result that include "foo"
  • count the lines of the result (one per file)

Note that grep foo matches "foo" anywhere in the path, so files that merely have "foo" in their own name are shown too. Optionally, you could filter those results out.
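One way to do that filtering, as a sketch, is to anchor the grep pattern so that foo has to be the name of the directory immediately containing the file. The function name count_docs_in_foo is an assumption for illustration. Like any line-based count, it can miscount if a path contains a newline.

```shell
# Hypothetical helper: count *.doc files whose parent directory is named
# exactly "foo", at any depth below the base (default: .).
count_docs_in_foo() {
    # '/foo/[^/]*$' requires "/foo/" followed only by the file name, so
    # paths such as ./food/a.doc or ./bar/foo.doc are rejected.
    # grep -c prints the number of matching lines.
    find "${1:-.}" -type f -name '*.doc' | grep -c '/foo/[^/]*$'
}
```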

Untested, but try:

find . -type d -name foo -print | while IFS= read -r d; do for f in "$d"/*.doc; do [ -f "$f" ] && echo "$f"; done; done | wc -l

Find all the "foo" directories (at varying depths); this ignores symlinks, which you can add if they are part of the problem. Then use shell globbing to find all the ".doc" files in each one, and count them.
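The same idea (find the foo directories, then glob inside each one) can also be written without passing directory names through a pipe, which avoids trouble with unusual characters in paths. This is a sketch under that assumption; count_docs_by_glob is a hypothetical name.

```shell
# Hypothetical helper: for every directory named "foo" below the base
# (default: .), glob for *.doc inside it and count the regular files found.
count_docs_by_glob() {
    find "${1:-.}" -type d -name foo -exec sh -c '
        for d in "$@"; do
            for f in "$d"/*.doc; do
                # An unmatched glob stays literal, so test each candidate.
                # One character is printed per file; names never reach the pipe.
                [ -f "$f" ] && printf x
            done
        done
    ' sh {} + | wc -c
}
```

Because only a single character per file is written, the count does not depend on what the file names contain. Hidden files such as .draft.doc are skipped, since a shell glob starting with * does not match a leading dot.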

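Based on the answers on this page and on other pages, I managed to put together the following. It searches the current folder and everything under it for all files with the pdf extension, then filters for those that contain test_text in their names.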

find . -name "*.pdf" | grep test_text | wc -l
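The grep in this pipeline matches test_text anywhere in the path, including directory names. If only the file name itself should count, a sketch like the following lets find do the filtering. The function name count_pdfs_named and its two parameters are assumptions for illustration.

```shell
# Hypothetical helper: count PDF files under $1 whose own name (not the
# directories above it) contains the text given as $2.
count_pdfs_named() {
    # -name is matched against the file name only. Glob characters in $2
    # (*, ?, [) are treated as pattern characters, not literally.
    find "${1:-.}" -type f -name "*$2*.pdf" | wc -l
}
```

For example, count_pdfs_named . test_text counts ./test_text_a.pdf but not ./test_text/other.pdf.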
