
Linux grep: how can I display lines that contain neither word 1 nor word 2, but still display the lines that contain both words?

I need some help displaying all lines that contain neither word1 nor word2, while lines that contain both of them must still be shown.

Example:

aaaa bbbb cccc
bbbb bbbb bbbb
cccc cccc cccc
dddd dddd aaaa

If word1 = aaaa and word2 = bbbb, then the output should be:

aaaa bbbb cccc
cccc cccc cccc

I tried:

grep -Ewv "word1|word2" file.txt

but this shows only the lines that contain neither word; it does not show the lines that contain both.

I need to do this with the grep command; I forgot to mention that.

Grep version of both or none of each:

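grep -v -P '^((?=.*aaaa)(?!.*bbbb)|(?=.*bbbb)(?!.*aaaa))'

The leading ^ pins both lookaheads to the start of the line. Without it, the pattern can also match at a later position (for example one character into aaaa, where no complete aaaa remains ahead) and wrongly exclude lines that contain both words.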

But please do not use grep in this case. Negative and positive lookahead can easily lead to catastrophic backtracking.

GNU grep supports Perl-compatible regular expression (PCRE) syntax (option -P). Such a pattern is still called a "regular" expression, although it is no longer regular in the formal sense. Some people are more explicit and call backtracking patterns irregular expressions.

How it works:

(?=.*aaaa) is a positive lookahead: it succeeds when aaaa appears anywhere in the rest of the line, but it does not move the cursor. After it succeeds, the next assertion is evaluated from the same position, initially the beginning of the line.

(?!.*bbbb) is a negative lookahead: it succeeds when no bbbb appears in the rest of the line, and it does not move the cursor either.

Together, the two assertions match lines that include aaaa but do not include bbbb.

This is one of the cases you want to exclude from your search results. The second alternative, after the or operator (|), is the other case you want to exclude: any bbbb without an aaaa.

With the above, you have defined what you do not want. Next, use -v to invert the search and get what you want.
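
The two steps above can be tried on the sample lines from the question. This is only a sketch, assuming GNU grep built with -P support; the variable names (sample, pattern, excluded, kept) are made up for illustration, and the pattern is anchored with ^ so that both lookaheads are tested from the start of each line.

```shell
# The four sample lines from the question.
sample='aaaa bbbb cccc
bbbb bbbb bbbb
cccc cccc cccc
dddd dddd aaaa'

# Lines with exactly one of the two words; ^ pins both lookaheads to the line start.
pattern='^((?=.*aaaa)(?!.*bbbb)|(?=.*bbbb)(?!.*aaaa))'

# Step 1: what the pattern itself selects, i.e. the lines we do NOT want.
excluded=$(printf '%s\n' "$sample" | grep -P "$pattern")

# Step 2: -v inverts the selection, leaving lines with both words or with neither.
kept=$(printf '%s\n' "$sample" | grep -v -P "$pattern")

printf 'excluded:\n%s\nkept:\n%s\n' "$excluded" "$kept"
```

Printing the excluded set next to the kept set makes it easier to see that -v is doing nothing more than flipping the selection.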


Bash version of both or none of each:

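#!/bin/bash

# Words to look for; default to the values from the example.
word1=${1:-aaaa}
word2=${2:-bbbb}

# Read standard input line by line (IFS= keeps leading whitespace intact).
while IFS= read -r line; do
  if [[ $line =~ $word1 ]]; then
    if [[ $line =~ $word2 ]]; then
      printf "%s\n" "$line"   # both words present
    fi
  else
    if [[ $line =~ $word2 ]]; then
      :                       # only word2 present: skip
    else
      printf "%s\n" "$line"   # neither word present
    fi
  fi
done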
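
The nested if/else in the loop above boils down to one question: do the two "contains" answers agree? A possible condensed sketch follows. The function name both_or_neither and the flags has1/has2 are hypothetical, and it swaps the [[ =~ ]] regex test for plain case substring matching so that it also runs under a POSIX sh.

```shell
# both_or_neither WORD1 WORD2
# Filters standard input, keeping lines that contain both words or neither.
# Note: the words are matched as fixed substrings, not as regexes.
both_or_neither() {
  while IFS= read -r line; do
    has1=0
    has2=0
    case $line in *"$1"*) has1=1 ;; esac
    case $line in *"$2"*) has2=1 ;; esac
    # Equal flags mean "both present" (1,1) or "both absent" (0,0).
    if [ "$has1" -eq "$has2" ]; then
      printf '%s\n' "$line"
    fi
  done
}

result=$(printf '%s\n' 'aaaa bbbb cccc' 'bbbb bbbb bbbb' \
                       'cccc cccc cccc' 'dddd dddd aaaa' |
         both_or_neither aaaa bbbb)
printf '%s\n' "$result"
```

Like the script above, this matches word fragments as well as whole words.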

In my opinion, the simplest way (though possibly not the fastest) is to find the lines that contain neither word and the lines that contain both words separately, and then concatenate the results. For example (assuming file.txt is a text file in the directory test; I pass the input values as shell variables for generality, and we look only for full words, not word fragments):

[mathguy@localhost test]$ more file.txt
aaaa bbbb cccc
bbbb bbbb bbbb
cccc cccc cccc
dddd dddd aaaa



[mathguy@localhost test]$ word1=aaaa
[mathguy@localhost test]$ word2=bbbb

[mathguy@localhost test]$ ( grep "\b$word1\b" file.txt | grep "\b$word2\b" ; \
>  grep -v "\b$word1\b" file.txt | grep -v "\b$word2\b" ) | cat
aaaa bbbb cccc
cccc cccc cccc
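
One thing to keep in mind with the concatenation approach: all "both" lines are printed before all "neither" lines, so the original file order survives only by coincidence in this example. If order matters, one option is to carry line numbers through the pipes and sort on them. The sketch below makes several assumptions: the input is a made-up variable rather than file.txt, grep -w stands in for the \b...\b anchors, and the words are not purely numeric (a numeric word could match the line-number prefix).

```shell
word1=aaaa
word2=bbbb

# Hypothetical input in which a "neither" line comes before a "both" line.
input='cccc cccc cccc
bbbb bbbb bbbb
aaaa bbbb cccc
dddd dddd aaaa'

# The first grep in each pipe adds a "LINENO:" prefix (-n); the second grep
# passes that prefix through. sort restores file order, cut strips the prefix.
ordered=$(
  {
    printf '%s\n' "$input" | grep -nw "$word1" | grep -w "$word2"
    printf '%s\n' "$input" | grep -vnw "$word1" | grep -vw "$word2"
  } | sort -t: -k1,1n | cut -d: -f2-
)
printf '%s\n' "$ordered"
```

Plain concatenation would print aaaa bbbb cccc first for this input; with the line-number prefix, cccc cccc cccc stays ahead of it, as in the input.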


 