
Use find command but exclude files in two directories

I want to find files that end with _peaks.bed, but exclude files in the tmp and scripts folders.

My command is like this:

 find . -type f \( -name "*_peaks.bed" ! -name "*tmp*" ! -name "*scripts*" \)

But it didn't work. The files in the tmp and scripts folders are still displayed.

Does anyone have ideas about this?

Here's how you can specify that with find:

find . -type f -name "*_peaks.bed" ! -path "./tmp/*" ! -path "./scripts/*"

Explanation:

  • find . - Start find from the current working directory (recursive by default)
  • -type f - Tell find that you only want files in the results
  • -name "*_peaks.bed" - Look for files with names ending in _peaks.bed
  • ! -path "./tmp/*" - Exclude all results whose path starts with ./tmp/
  • ! -path "./scripts/*" - Also exclude all results whose path starts with ./scripts/

Testing the Solution:

$ mkdir a b c d e
$ touch a/1 b/2 c/3 d/4 e/5 e/a e/b
$ find . -type f ! -path "./a/*" ! -path "./b/*"

./d/4
./c/3
./e/a
./e/b
./e/5

You were pretty close: the -name option only considers the basename, whereas -path considers the entire path =)
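
The difference between -name and -path can be checked in a scratch directory. The following is a minimal sketch, not part of the original answer; the directory layout and the file names (a_peaks.bed and so on) are made up for illustration.

```shell
# Build a throwaway tree (hypothetical file names, for illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/tmp" "$demo/scripts" "$demo/data"
touch "$demo/tmp/a_peaks.bed" "$demo/scripts/b_peaks.bed" "$demo/data/c_peaks.bed"

# -name is matched against the basename only, so files inside tmp/ and
# scripts/ still pass the test and are listed.
by_name=$(cd "$demo" && find . -type f -name "*_peaks.bed" ! -name "*tmp*" ! -name "*scripts*" | sort)

# -path is matched against the whole path, so the two directories drop out.
by_path=$(cd "$demo" && find . -type f -name "*_peaks.bed" ! -path "./tmp/*" ! -path "./scripts/*" | sort)

printf 'with -name:\n%s\n\nwith -path:\n%s\n' "$by_name" "$by_path"
rm -rf "$demo"
```

With -name, all three files are listed; with -path, only the file under data/ remains.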

Here is one way you can do it...

find . -type f -name "*_peaks.bed" | grep -Ev '^\./(tmp|scripts)/'

Use

find \( -path "./tmp" -o -path "./scripts" \) -prune -o  -name "*_peaks.bed" -print

or

find \( -path "./tmp" -o -path "./scripts" \) -prune -false -o  -name "*_peaks.bed"

or

find \( -path "./tmp" -o -path "./scripts" \) ! -prune -o  -name "*_peaks.bed"

The order is important. It evaluates from left to right. Always begin with the path exclusion.

Explanation

Do not use -not (or !) to exclude a whole directory. Use -prune. As explained in the manual:

−prune    The primary shall always evaluate as  true;  it
          shall  cause  find  not  to descend the current
          pathname if it is a directory.  If  the  −depth
          primary  is specified, the −prune primary shall
          have no effect.

and in the GNU find manual:

-path pattern
              [...]
              To ignore  a  whole
              directory  tree,  use  -prune rather than checking
              every file in the tree.

Indeed, if you use -not -path "./pathname", find will evaluate the expression for each node under "./pathname".
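
A rough way to see this is to put -print first in the expression, so that it fires for every node find evaluates. This is only an illustrative sketch with a made-up directory layout, not something taken from the manual.

```shell
# Throwaway tree with hypothetical names.
demo=$(mktemp -d)
mkdir -p "$demo/tmp/deep" "$demo/data"
touch "$demo/tmp/deep/x" "$demo/data/y"

# A leading -print acts as a tracer: it runs for every node find evaluates.
# With ! -path, find still walks everything under ./tmp.
walked_not_path=$(cd "$demo" && find . -print ! -path "./tmp/*" | grep -c .)

# With -prune, ./tmp itself is evaluated once and its contents are never visited.
walked_prune=$(cd "$demo" && find . -print -path "./tmp" -prune | grep -c .)

printf 'nodes evaluated with ! -path: %s\n' "$walked_not_path"
printf 'nodes evaluated with -prune:  %s\n' "$walked_prune"
rm -rf "$demo"
```

On a large excluded tree, the gap between the two counts is the work that -prune saves.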

find expressions are just condition evaluation.

  • \( \) - grouping operator (you can use -path "./tmp" -prune -o -path "./scripts" -prune -o instead, but it is more verbose).
  • -path "./scripts" -prune - if -path returns true and the node is a directory, return true for that directory and do not descend into it.
  • -path "./scripts" ! -prune - it evaluates as (-path "./scripts") AND (! -prune). It reverts the "always true" of -prune to always false. It avoids printing "./scripts" as a match.
  • -path "./scripts" -prune -false - since -prune always returns true, you can follow it with -false to do the same as !.
  • -o - OR operator. If no operator is specified between two expressions, it defaults to the AND operator.

Hence, \( -path "./tmp" -o -path "./scripts" \) -prune -o -name "*_peaks.bed" -print is expanded to:

[ (-path "./tmp" OR -path "./scripts") AND -prune ] OR ( -name "*_peaks.bed" AND -print )

The -print is important here because without it, the expression is expanded to:

{ [ (-path "./tmp" OR -path "./scripts") AND -prune ] OR (-name "*_peaks.bed") } AND -print

-print is added by find by default, which is why most of the time you do not need to add it to your expression. And since -prune returns true, it will print "./scripts" and "./tmp".

It is not necessary in the other commands because we switched -prune to always return false.
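
The effect of the explicit -print can be reproduced in a scratch directory. This is a small sketch under assumed file names; only the find expressions themselves come from the answer.

```shell
# Throwaway tree with hypothetical file names.
demo=$(mktemp -d)
mkdir -p "$demo/tmp" "$demo/scripts" "$demo/data"
touch "$demo/tmp/a_peaks.bed" "$demo/data/c_peaks.bed"

# Explicit -print: only the right-hand side of -o prints anything.
with_print=$(cd "$demo" && find . \( -path "./tmp" -o -path "./scripts" \) -prune -o -name "*_peaks.bed" -print | sort)

# No action given: find applies an implicit -print to the whole expression,
# so the pruned directories (for which -prune returned true) are listed too.
without_print=$(cd "$demo" && find . \( -path "./tmp" -o -path "./scripts" \) -prune -o -name "*_peaks.bed" | sort)

printf 'explicit -print:\n%s\n\nimplicit -print:\n%s\n' "$with_print" "$without_print"
rm -rf "$demo"
```

The second listing contains ./tmp and ./scripts in addition to the matching file, which is the behaviour described above.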

Hint: You can use find -D opt expr 2>&1 1>/dev/null to see how the expression is optimized and expanded,
and find -D search expr 2>&1 1>/dev/null to see which paths are checked.

For me, this solution didn't work when running a command with find's -exec. I don't really know why, so my solution is

find . -type f -path "./a/*" -prune -o -path "./b/*" -prune -o -exec gzip -f -v {} \;

Explanation: same as sampson-chen's, with the addition of

-prune - ignore the preceding path of ...

-o - then, if there is no match, print the results (prune the directories and print the remaining results)

18:12 $ mkdir a b c d e
18:13 $ touch a/1 b/2 c/3 d/4 e/5 e/a e/b
18:13 $ find . -type f -path "./a/*" -prune -o -path "./b/*" -prune -o -exec gzip -f -v {} \;

gzip: . is a directory -- ignored
gzip: ./a is a directory -- ignored
gzip: ./b is a directory -- ignored
gzip: ./c is a directory -- ignored
./c/3:    0.0% -- replaced with ./c/3.gz
gzip: ./d is a directory -- ignored
./d/4:    0.0% -- replaced with ./d/4.gz
gzip: ./e is a directory -- ignored
./e/5:    0.0% -- replaced with ./e/5.gz
./e/a:    0.0% -- replaced with ./e/a.gz
./e/b:    0.0% -- replaced with ./e/b.gz
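
The "is a directory" messages appear because the last branch of that expression has no -type test of its own, so directories reach gzip as well. One possible variant, sketched here with echo standing in for gzip -f -v so that nothing is modified, prunes ./a and ./b as directories and restricts the final branch to regular files. The directory names follow the test above; the rest is an assumption about what was intended.

```shell
# Recreate the same layout in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b" "$demo/c" "$demo/d" "$demo/e"
touch "$demo/a/1" "$demo/b/2" "$demo/c/3" "$demo/d/4" "$demo/e/5" "$demo/e/a" "$demo/e/b"

# Prune the two directories themselves, then hand only regular files to the
# command. echo is a harmless stand-in for gzip -f -v.
handled=$(cd "$demo" && find . \( -path "./a" -o -path "./b" \) -prune -o -type f -exec echo {} \; | LC_ALL=C sort)

printf '%s\n' "$handled"
rm -rf "$demo"
```

Only the files under c, d and e are passed to the command, and no directory is.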

You can try the following:

find ./ ! \( -path ./tmp -prune \) ! \( -path ./scripts -prune \) -type f -name '*_peaks.bed'

Try something like

find . \( -type f -name \*_peaks.bed -print \) -or \( -type d -and \( -name tmp -or -name scripts \) -and -prune \)

and don't be too surprised if I got it a bit wrong. If the goal is an exec (instead of print), just substitute it in place.

With these explanations you can meet your objective and many others. Just join the parts as you need.

MODEL

find ./\
 -iname "some_arg" -type f\ # File(s) that you want to find at any hierarchical level.
 ! -iname "some_arg" -type f\ # File(s) NOT to be found at any hierarchical level (exclude).
 ! -path "./file_name"\ # File(s) NOT to be found at this hierarchical level (exclude).
 ! -path "./folder_name/*"\ # Folder(s) NOT to be found at this hierarchical level (exclude).
 -exec grep -IiFl 'text_content' -- {} \; # Text search in the content of the found file(s), case-insensitive ("-i") and excluding binaries ("-I").
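
The MODEL is an annotated template: a backslash only continues a line when it is the last character on it, so the trailing comments have to be removed before running it. One way to keep a comment next to each clause, sketched here for a POSIX shell, is to build the argument list step by step with set --. All file names and the search text below are hypothetical placeholders.

```shell
# Throwaway tree; every name here is a made-up placeholder.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/build"
printf 'needle\n' > "$demo/src/keep.txt"
printf 'needle\n' > "$demo/src/skip.pyc"
printf 'needle\n' > "$demo/build/out.txt"
printf 'needle\n' > "$demo/notes.txt"

# Build the find arguments one clause at a time so each can carry a comment.
set -- -type f                                   # regular files only
set -- "$@" ! -iname "*.pyc"                     # exclude by basename, at any depth
set -- "$@" ! -path "./notes.txt"                # exclude one specific file
set -- "$@" ! -path "./build/*"                  # exclude everything under one folder
set -- "$@" -exec grep -IiFl 'needle' -- {} \;   # list files whose content matches

matches=$(cd "$demo" && find . "$@")
printf '%s\n' "$matches"
rm -rf "$demo"
```

Note that set -- replaces the script's own positional parameters, so in a larger script this is best done inside a function.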

EXAMPLE

find ./\
 -iname "*" -type f\
 ! -iname "*pyc" -type f\
 ! -path "./.gitignore"\
 ! -path "./build/*"\
 ! -path "./__pycache__/*"\
 ! -path "./.vscode/*"\
 ! -path "./.git/*"\
 -exec grep -IiFl 'title="Brazil - Country of the Future",' -- {} \;

Thanks! 🤗🇧🇷

[ Ref(s).: https://unix.stackexchange.com/q/73938/61742 ]


EXTRA:

You can use the commands above together with your favorite editor and analyze the contents of the files found, for example...

vim -p $(find ./\
 -iname "*" -type f\
 ! -iname "*pyc" -type f\
 ! -path "./.gitignore"\
 ! -path "./build/*"\
 ! -path "./__pycache__/*"\
 ! -path "./.vscode/*"\
 ! -path "./.git/*"\
 -exec grep -IiFl 'title="Brazil - Country of the Future",' -- {} \;)
