
Perform an operation for *each* item listed by grep

How can I perform an operation on each item listed by grep individually?

Background:

I use grep to list all files containing a certain pattern:

grep -l '<pattern>' directory/*.extension1

I want to delete all listed files, as well as every file with the same base name but a different extension, .extension2.

I tried using a pipe, but it seems to take the output of grep as a whole.

find has the -exec option, but grep has nothing like that.

If I understand your specification, you want:

grep --null -l '<pattern>' directory/*.extension1 | \
    xargs -0 -I{} bash -c 'rm "$1" "${1%.*}.extension2"' -- {}

This is essentially the same as what @triplee's comment describes, except that it's newline-safe.

What's going on here?

grep with --null delimits its output with null characters instead of newlines. Because file names can contain newlines, newline-delimited output from grep cannot be parsed safely; a null character, however, can never appear in a file name, so it makes a reliable delimiter.
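The difference can be seen with a small experiment. This is only a sketch: the temporary directory, the file names, and the pattern needle are made up for illustration, and it assumes a grep that supports --null (GNU and BSD grep do).

```shell
# Throwaway directory with two matching files; one name contains a newline.
tmp=$(mktemp -d)
printf 'needle\n' > "$tmp/plain.txt"
printf 'needle\n' > "$tmp/$(printf 'odd\nname').txt"

# Newline-delimited output: the odd file name spans two lines,
# so two files look like three entries.
newline_count=$(grep -l 'needle' "$tmp"/*.txt | wc -l)

# Null-delimited output: exactly one null character per file.
null_count=$(grep --null -l 'needle' "$tmp"/*.txt | tr -cd '\0' | wc -c)

echo "newline-delimited entries: $newline_count"
echo "null-delimited entries:    $null_count"

rm -rf "$tmp"
```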

xargs takes a stream of items delimited by whitespace (blanks or newlines, by default) and executes a given command, passing as many of those items as will fit, one per argument (or runs echo if no command is given). Thus if you said:

printf 'one\ntwo three \nfour\n' | xargs echo

xargs would execute echo one two three four: the line "two three" is split into two separate arguments. This is not safe for file names because, again, file names might contain embedded spaces or newlines.

The -0 switch makes xargs split its input on null characters only, instead of on its default delimiters. This matches the output we got from grep --null and makes it safe for processing a list of file names.
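To see what -0 changes, one can count how many arguments the command actually receives. This is a minimal sketch; the helper script echo "$#" and the sample items are only for illustration.

```shell
# Default splitting: blanks and newlines both separate items,
# so "two three" arrives as two arguments.
default_args=$(printf 'one\ntwo three \nfour\n' | xargs sh -c 'echo "$#"' sh)

# With -0 only the null character separates items,
# so "two three" stays a single argument.
null_args=$(printf 'one\0two three\0four\0' | xargs -0 sh -c 'echo "$#"' sh)

echo "default: $default_args arguments"
echo "with -0: $null_args arguments"
```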

Normally xargs simply appends the input to the end of the command. The -I switch changes this: xargs runs the command once per input item and substitutes the item wherever the specified replacement string appears. To get the idea, try this experiment:

printf 'one\ntwo three \nfour\n' | xargs -I{} echo foo {} bar

And note the difference from the earlier printf | xargs command.

In my solution the command I execute is bash, to which I pass -c. The -c switch causes bash to execute the commands in the following argument (and then terminate) instead of starting an interactive shell. The next block, 'rm "$1" "${1%.*}.extension2"', is the argument to -c: the script that bash will execute. Any arguments following the script are assigned to the script's positional parameters, starting with $0. Thus, if I were to say:

bash -c 'echo $0' "Hello, world"

then Hello, world would be assigned to $0 (the first argument after the script), and inside the script I could echo it back.

Since $0 is normally reserved for the script name, I pass a dummy value (in this case --) as the first argument and then, as the second argument, I write {}, the replacement string I specified for xargs. Before bash is executed, xargs replaces it with each file name parsed from grep's output.
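The way the dummy value and the file name land in $0 and $1 can be checked without xargs at all. In this sketch the file name my file.extension1 is a made-up example standing in for what xargs would substitute for {}.

```shell
# "--" becomes $0 and the file name becomes $1 inside the inline script.
out=$(bash -c 'printf "%s|%s|%s\n" "$0" "$1" "${1%.*}.extension2"' \
    -- 'my file.extension1')
echo "$out"
```

The third field shows the alternate file name that the inline script derives from $1.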

The mini shell script might look complicated, but it's rather trivial. First, the entire script is single-quoted to prevent the calling shell from interpreting it. Inside the script I invoke rm and pass it two file names to remove: the $1 argument, which is the file name passed when the replacement string was substituted above, and ${1%.*}.extension2. The latter is a parameter expansion on the $1 variable. The important part is %.*, which says:

  • % means "match from the end of the variable's value and remove the shortest string matching the pattern."
  • .* is the pattern: a single period followed by anything.

This effectively strips the extension, if any, from the file name. You can observe the effect yourself:

foo='my file.txt'
bar='this.is.a.file.txt'
baz='no extension'
printf '%s\n' "${foo%.*}" "${bar%.*}" "${baz%.*}"

Since the extension has been stripped, I concatenate the desired alternate extension, .extension2, to the stripped file name to obtain the alternate file name.
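Before pointing this at real data, it may be worth trying the whole pipeline in a scratch directory. The sketch below assumes made-up file names and the pattern match; only the pipeline itself comes from the answer above.

```shell
# Scratch directory: one pair of files that should go, one pair that should stay.
tmp=$(mktemp -d)
printf 'match me\n' > "$tmp/a b.extension1"
: > "$tmp/a b.extension2"
printf 'nothing\n' > "$tmp/keep.extension1"
: > "$tmp/keep.extension2"

# Delete every matching .extension1 file together with its .extension2 sibling.
grep --null -l 'match' "$tmp"/*.extension1 |
    xargs -0 -I{} bash -c 'rm "$1" "${1%.*}.extension2"' -- {}

# Only the "keep" pair should be left.
remaining=$(ls "$tmp")
printf '%s\n' "$remaining"

rm -rf "$tmp"
```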

If this prints the commands you want, pipe the output through /bin/sh.

grep -l 'RE' folder/*.ext1 | sed 's/\(.*\)\.ext1$/rm "&" "\1.ext2"/'

Or if sed makes you itchy:

grep -l 'RE' folder/*.ext1 | while IFS= read -r file; do
  echo rm "$file" "${file%.ext1}.ext2"
done

Remove echo if the output looks like the commands you want to run.

But you can do this with find as well:

find /path/to/start -name \*.ext1 -exec grep -q 'RE' {} \; -print | ...

where ... is either the sed script or the three lines from while to done.

The idea here is that find will ... well, "find" things based on the qualifiers you give it: namely, that things match the file glob "*.ext1", AND that the result of the -exec is successful. The -q tells grep to look for RE in {} (the file supplied by find) and exit with TRUE or FALSE without generating any output of its own.

The only real difference between doing this with find and doing it with grep is that you get to use find's awesome collection of conditions to narrow down your search further if required. See man find for details. By default, find will recurse into subdirectories.
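Since find can chain several -exec actions, the pipe can also be dropped entirely. The following is a hedged sketch rather than a drop-in command: the scratch directory and file names are invented, and the second -exec only echoes the rm command as a dry run.

```shell
# Scratch tree with one matching file in a subdirectory and one non-matching file.
tmp=$(mktemp -d)
mkdir "$tmp/sub"
printf 'RE here\n' > "$tmp/sub/hit file.ext1"
: > "$tmp/sub/hit file.ext2"
printf 'nope\n' > "$tmp/miss.ext1"

# The second -exec runs only for files where grep -q succeeded.
# Drop the echo once the printed commands look right.
planned=$(find "$tmp" -name '*.ext1' -exec grep -q 'RE' {} \; \
    -exec sh -c 'echo rm "$1" "${1%.ext1}.ext2"' sh {} \;)
printf '%s\n' "$planned"

rm -rf "$tmp"
```

Because each file name is handed to sh as a single argument, names with spaces or newlines need no special delimiter handling here.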

You can pipe the list to xargs:

grep -l '<pattern>' directory/*.extension1 | xargs rm

As for the second set of files with a different extension, I'd do this (as usual, use xargs echo rm when testing to make a dry run; I haven't tested it, and it may not work correctly with file names with spaces in them):

filelist=$(grep -l '<pattern>' directory/*.extension1)
echo $filelist | xargs rm
echo ${filelist//.extension1/.extension2} | xargs rm

Pipe the result to xargs, which lets you run a command for each match.
