Show only the matched strings - grep

Question

I have two files. File1 is as follows:

Apple
Cat
Bat

File2 is as follows:

I have an Apple
Batman returns
This is a test file.

Now I want to check which strings in the first file are not present in the second file. I can run grep -f file1 file2, but that gives me the matching lines from the second file.

Answer 1

To get the strings that are in the first file and also in the second file:

grep -of file1 file2

The result (using the given example) will be:

Apple
Bat

To get the strings that are in the first file but not in the second file, you could use:

grep -of file1 file2 | cat - file1 | sort | uniq -u

Or even simpler (thanks to @triplee's comment):

grep -of file1 file2 | grep -vxFf - file1

The result (using the given example) will be:

Cat
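
To try the second pipeline on the sample data, here is a minimal sketch. It assumes a throwaway temporary directory, and the function name not_in_second is hypothetical. It also adds -F to the first grep so that the lines of file1 are taken as fixed strings rather than regular expressions; that flag is an addition, not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of $1 that occur nowhere
# (as substrings) in $2.
not_in_second() {
    # First grep:  -o prints only the matched parts, -F treats patterns as
    #              fixed strings, -f reads the patterns from file $1.
    # Second grep: -v inverts the match, -x requires whole-line matches,
    #              -f - reads the patterns (the strings found) from stdin.
    grep -oFf "$1" "$2" | grep -vxFf - "$1"
}

# Recreate the example files from the question.
tmp=$(mktemp -d)
printf '%s\n' Apple Cat Bat > "$tmp/file1"
printf '%s\n' 'I have an Apple' 'Batman returns' 'This is a test file.' > "$tmp/file2"

not_in_second "$tmp/file1" "$tmp/file2"   # prints: Cat
rm -rf "$tmp"
```

Note that Bat counts as present here because it is a substring of Batman.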

From the grep man page:

-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.

From the uniq man page:

-u, --unique
Only print unique lines

Answer 2

If you want to show the words from file1 that are not in file2 (named f1 and f2 in the commands below), a quick and dirty way is to loop through the words and grep for each one silently. If there is no match, print the word:

while IFS= read -r word
do
    # print the word only if grep finds no line in f2 containing it
    grep -q "$word" f2 || echo "$word"
done < f1

To match exact words, add -w: grep -wq ...

Test

$ while read word; do grep -q "$word" f2 || echo "$word"; done < f1
Cat
$ while read word; do grep -wq "$word" f2 || echo "$word"; done < f1
Cat
Bat
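
The loop above passes each word to grep as a regular expression, so a word such as a.b would also match axb. A slightly more defensive sketch of the same idea follows. The function name missing_words and its optional third argument are hypothetical, and -F, -- and the blank-line check are additions that are not in the original answer.

```shell
#!/bin/sh
# Hypothetical helper: print each line of $1 that is not a literal
# substring of any line in $2. Pass -w as a third argument to require
# whole-word matches instead.
missing_words() {
    while IFS= read -r word; do
        # Skip blank lines: an empty pattern would match every line.
        [ -n "$word" ] || continue
        # -q: quiet, -F: fixed string, --: end of options, so a word
        # starting with "-" is not mistaken for a flag.
        grep -qF ${3-} -- "$word" "$2" || printf '%s\n' "$word"
    done < "$1"
}
```

With the example files, missing_words f1 f2 prints Cat, and missing_words f1 f2 -w prints Cat and Bat, matching the test above.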

A better approach is to use awk:

$ awk 'FNR==NR {a[$1]; next} {for (i=1;i<=NF;i++) {if ($i in a) delete a[$i]}} END {for (i in a) print i}' f1 f2
Cat
Bat

This stores the values in file1 as keys of the array a[]. Then it loops through all lines of file2, checking each field. If a field matches a key in a[], that element is removed from the array. Finally, the END{} block prints the values that were not found. Like grep -w, this compares whole whitespace-separated fields, so Batman does not count as a match for Bat.
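
The one-liner can be easier to follow when spread over several lines. The following is a sketch of the same logic wrapped in a shell function; the name words_not_in_second is hypothetical. Two things are added that the original does not have: blank lines in the first file are skipped, and the result is piped through sort, because awk does not guarantee any particular order for "for (w in a)".

```shell
#!/bin/sh
# Hypothetical wrapper: print the first field of each line of $1 that
# never appears as a whole field in $2.
words_not_in_second() {
    awk '
        # First file: remember the first field of every non-blank line
        # as an array key.
        FNR == NR { if (NF) a[$1]; next }

        # Second file: forget every field that is a remembered key.
        {
            for (i = 1; i <= NF; i++)
                if ($i in a) delete a[$i]
        }

        # Whatever is left was never seen in the second file.
        END { for (w in a) print w }
    ' "$1" "$2" | sort
}
```

With the example files, words_not_in_second f1 f2 prints Bat and then Cat. As with the original one-liner, an empty first file would make FNR == NR hold for the second file as well, so this sketch assumes the first file is not empty.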

Answer 1: score 5, accepted, 2014-08-28 10:13:25
Answer 2: score 0, 2014-08-28 10:19:18
