Show only the matching strings - grep

Question

I have two files. File1 looks like this:

Apple
Cat
Bat

File2 looks like this:

I have an Apple
Batman returns
This is a test file.

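Now I want to check which strings from the first file are not in the second file. I can run grep -f file1 file2, but that gives me the matching lines from the second file.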

Answer 1

To get the strings that are in both the first file and the second file:

grep -of file1 file2

The result (using the given example) will be:

Apple
Bat

To get the strings that are in the first file but not in the second file, you can use:

grep -of file1 file2 | cat - file1 | sort | uniq -u

Or even simpler (thanks to @triplee's comment):

grep -of file1 file2 | grep -vxFf - file1

The result (using the given example) will be:

Cat
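
For anyone who wants to reproduce this, here is a minimal sketch that recreates the two sample files in a temporary directory and runs both steps. The -F flag on the first grep is an addition of this sketch, not part of the original answer: it treats the entries of file1 as fixed strings rather than regular expressions, which is usually what you want for a plain word list. The variable names found and missing are illustrative only.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'Apple' 'Cat' 'Bat' > "$tmpdir/file1"
printf '%s\n' 'I have an Apple' 'Batman returns' 'This is a test file.' > "$tmpdir/file2"

# Strings from file1 that occur somewhere in file2, one per line.
# -F is an assumption of this sketch (literal strings instead of regexes).
found=$(grep -oFf "$tmpdir/file1" "$tmpdir/file2")

# Strings from file1 that never occur in file2.
# -f - reads the strings to exclude from standard input,
# -x compares whole lines only, -v inverts the match.
missing=$(grep -oFf "$tmpdir/file1" "$tmpdir/file2" | grep -vxFf - "$tmpdir/file1")

printf 'found:\n%s\n' "$found"
printf 'missing:\n%s\n' "$missing"
rm -rf "$tmpdir"
```

Note that Bat counts as found here because -o reports substring matches, so Batman is enough to match it.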

From the grep man page:

-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.

From the uniq man page:

-u, --unique
Only print unique lines
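
To see why the sort | uniq -u variant works: every string found in file2 appears at least twice in the combined stream (once from grep -o, once from file1), so uniq -u drops it and keeps only the strings that occur exactly once. The sketch below writes the intermediate stream out by hand for the sample data. It assumes file1 contains no duplicate entries, since a duplicated entry that is missing from file2 would be dropped as well.

```shell
#!/bin/sh
# What `grep -of file1 file2 | cat - file1` produces for the sample data:
# the two matches (Apple, Bat) followed by the contents of file1.
combined=$(printf '%s\n' 'Apple' 'Bat' 'Apple' 'Cat' 'Bat')

# After sorting, repeated lines are adjacent: Apple Apple Bat Bat Cat.
# uniq -u keeps only the lines that are not repeated.
only_once=$(printf '%s\n' "$combined" | sort | uniq -u)

printf '%s\n' "$only_once"
```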

Answer 2

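If you want to show the words in file1 that are not in file2, a quick and dirty way is to loop over the words and grep for each one (the files are called f1 and f2 in the examples below). If there is no match, print the word: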

while read word
do
    grep -q "$word" f2 || echo "$word"
done < f1

To match whole words only, add -w: grep -wq ...

$ while read word; do grep -q "$word" f2 || echo "$word"; done < f1
Cat
$ while read word; do grep -wq "$word" f2 || echo "$word"; done < f1
Cat
Bat
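
The loop can be hardened a little. The sketch below is a variant, not the answerer's code: IFS= read -r keeps backslashes and leading whitespace intact, -F treats each word as a literal string instead of a regular expression, and -- protects against words that begin with a dash. The helper name not_in_f2 is hypothetical, and the sample files are created in a temporary directory.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
printf '%s\n' 'Apple' 'Cat' 'Bat' > "$tmpdir/f1"
printf '%s\n' 'I have an Apple' 'Batman returns' 'This is a test file.' > "$tmpdir/f2"

# Print every line of f1 that does not occur in f2.
# An optional first argument (for example -w) is passed on to grep.
not_in_f2() {
    while IFS= read -r word; do
        grep -qF ${1:+"$1"} -- "$word" "$tmpdir/f2" || printf '%s\n' "$word"
    done < "$tmpdir/f1"
}

substring=$(not_in_f2)      # substring match: Bat is found inside Batman
whole_word=$(not_in_f2 -w)  # whole-word match: Bat is no longer found

printf 'substring match:\n%s\n' "$substring"
printf 'whole-word match:\n%s\n' "$whole_word"
rm -rf "$tmpdir"
```

Keep in mind that this starts one grep process per word, so it gets slow for long word lists.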

A better approach is to use awk:

$ awk 'FNR==NR {a[$1]; next} {for (i=1;i<=NF;i++) {if ($i in a) delete a[$i]}} END {for (i in a) print i}' f1 f2
Cat
Bat

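This stores the values from file1 as keys of the array a[]. It then loops over all lines of file2 and checks every field. If a field matches a key in a[], that element is deleted from the array. Finally, the END{} block prints the values that were never found. Because whole fields are compared, this behaves like the -w variant above: Bat is not matched by Batman.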
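
The same one-liner, laid out as a commented script, may be easier to follow. This is only a reformatting of the command above, run on the sample data. Since the iteration order of for (... in a) is unspecified in awk, the sketch pipes the result through sort to get stable output; that sort, the loop variable w, and the variable name unmatched are additions of this sketch.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
printf '%s\n' 'Apple' 'Cat' 'Bat' > "$tmpdir/f1"
printf '%s\n' 'I have an Apple' 'Batman returns' 'This is a test file.' > "$tmpdir/f2"

unmatched=$(awk '
    # First file (FNR == NR): remember each word as an array key.
    FNR == NR { a[$1]; next }

    # Second file: delete every key that shows up as a whole field.
    {
        for (i = 1; i <= NF; i++)
            if ($i in a)
                delete a[$i]
    }

    # Keys that survive were never seen as a field in the second file.
    END { for (w in a) print w }
' "$tmpdir/f1" "$tmpdir/f2" | sort)

printf '%s\n' "$unmatched"
rm -rf "$tmpdir"
```

Unlike the shell loop, this reads each file only once, which is the main reason it scales better.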