grep - limit the number of files read

Question

I have a directory with over 100,000 files. I want to know whether the string "str1" appears in the content of any of these files.

The command grep -l 'str1' * takes too long because it reads all of the files.

How can I ask grep to stop reading any further files once it finds a match? Is there a one-liner?

Note: I have tried grep -l 'str1' * | head but the command takes just as much time as the previous one.

Answer 1

Naming 100,000 filenames in your command args is going to cause a problem. It probably exceeds the maximum size of a command line, in which case the command fails with an "Argument list too long" error.

But you don't have to name all the files if you use the recursive option with just the name of the directory the files are in (which is . if you want to search files in the current directory):

grep -l -r 'str1' . | head -n 1
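
A small sketch of what this pipeline does, using a hypothetical scratch directory created with mktemp (the directory and file names are made up for the demo). All three files contain the string, yet only one name comes out, because head exits after the first line. Note that -r also descends into any subdirectories.

```shell
# Hypothetical scratch directory; the file names are made up for the demo.
demo_dir=$(mktemp -d)
for name in a b c; do
  printf 'some text\nstr1\n' > "$demo_dir/$name.txt"
done

# All three files match, but only the first name grep reports is kept.
first=$(grep -l -r 'str1' "$demo_dir" | head -n 1)
printf '%s\n' "$first"

rm -r "$demo_dir"
```

How early grep actually stops depends on output buffering: when writing to a pipe, grep's output is typically buffered, so it may read more files before head sees anything. If GNU grep is in use, its --line-buffered option is one way to flush each line as it is produced.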

Answer 2

Use grep -m 1 so that grep stops after finding the first match in a file. It is extremely efficient for large text files.

grep -m 1 str1 * /dev/null | head -n 1

If there is only a single file, the /dev/null above ensures that grep still prints the file name in the output.
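
The effect of the extra /dev/null operand can be seen in a minimal sketch like the following. The scratch directory and the file name only.txt are hypothetical and exist only for this demo.

```shell
# Hypothetical scratch directory and file name, used only for this demo.
demo_dir=$(mktemp -d)
printf 'foo\nstr1 here\n' > "$demo_dir/only.txt"

# One file operand: grep prints just the matching line.
without_null=$(grep -m 1 str1 "$demo_dir/only.txt")

# Two operands (the file plus /dev/null): grep prefixes the file name.
with_null=$(grep -m 1 str1 "$demo_dir/only.txt" /dev/null)

printf '%s\n' "$without_null"
printf '%s\n' "$with_null"

rm -r "$demo_dir"
```

The first line printed is the bare match, the second is the same match prefixed with the path of only.txt and a colon.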

If you want to stop after finding the first match in any file:

for file in *; do
  if grep -q -m 1 str1 "$file"; then
    echo "$file"
    break
  fi
done

The for loop also saves you from the "too many arguments" issue when you have a directory with a large number of files.
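
The loop can also be wrapped in a small function so that it reports success or failure through its exit status. This is only a sketch: first_match is a hypothetical helper name, and the scratch directory and files are made up for the demo. Since -q already makes grep exit at the first match, -m 1 is left out here.

```shell
# first_match is a hypothetical helper name, not a standard command.
# Usage: first_match PATTERN DIR
# Prints the first regular file in DIR whose content matches PATTERN and
# returns 0; returns 1 if no file matches.
first_match() {
  pattern=$1
  dir=$2
  for file in "$dir"/*; do
    # Skip subdirectories and the literal pattern left by an unmatched glob.
    [ -f "$file" ] || continue
    if grep -q -e "$pattern" "$file"; then
      printf '%s\n' "$file"
      return 0
    fi
  done
  return 1
}

# Demo data: a.txt does not match, b.txt and c.txt do.
demo_dir=$(mktemp -d)
printf 'nothing here\n' > "$demo_dir/a.txt"
printf 'has str1 inside\n' > "$demo_dir/b.txt"
printf 'str1 again\n' > "$demo_dir/c.txt"

found=$(first_match str1 "$demo_dir")
printf '%s\n' "$found"

rm -r "$demo_dir"
```

Because the glob expands in sorted order, the loop stops at b.txt and never opens c.txt.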

Answer 1: score 5, accepted, 2016-12-30 20:38:31

Answer 2: score 3, 2016-12-30 20:37:53
