
Regex in grep for files containing A,B,C… but not Z

I spent a few hours trying to answer this on my own using partial answers to similar questions, so I apologize if this has already been answered. Combining the partial solutions I could find into a working search seems to be beyond me.

What I'm trying to do: search through a directory for files that contain several unique strings, in any order and anywhere in the file, but that do not contain another particular string anywhere in the file.

Here's the search I have so far:

pcregrep -riM '^(?=.*uniquestringA)(?=.*uniquestringB)(?=.*uniquestringC)(?=.*uniquestringD)(?=.*uniquestringE).*$' . \
  | xargs grep -Li 'uniquestringZ'

I realize that this is horribly, horribly wrong: I can't even get the multi-line search to work while ignoring the order in which the strings appear.

Any help is greatly appreciated.

If your grep supports lookaheads (GNU grep does with -P, for example), you should be able to use:

^(?!.*Z)(?=.*A)(?=.*B)(?=.*C)(.*)$


With this file:

$ cat /tmp/grep_tgt.txt
A,B,C      # should match
A,B,C,D    # should match
A,C,D      # no match, lacking upper b
A,B,C,Z    # no match, has upper z

You can use a Perl one-liner:

$ perl -ne 'print if /^(?!.*Z)(?=.*A)(?=.*B)(?=.*C)(.*)$/' /tmp/grep_tgt.txt
A,B,C      # should match
A,B,C,D    # should match
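
The same pattern can also be tried with grep directly. This is a minimal sketch that assumes GNU grep built with PCRE support (the -P option); other grep implementations may not accept it. The temporary file and the variable names are only for illustration, and the sample lines are reduced to their letters.

```shell
# Recreate a reduced version of the sample file in a temporary location.
tmp=$(mktemp)
printf '%s\n' 'A,B,C' 'A,B,C,D' 'A,C,D' 'A,B,C,Z' > "$tmp"

# -P enables Perl-compatible regexes (assumption: GNU grep with PCRE).
# The trailing (.*)$ is dropped because grep prints the whole line anyway.
matches=$(grep -P '^(?!.*Z)(?=.*A)(?=.*B)(?=.*C)' "$tmp")
printf '%s\n' "$matches"

rm -f "$tmp"
```

Like the Perl one-liner, this tests one line at a time, so all of the strings have to appear on the same line.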

With file names:

$ find . -type f
./.DS_Store
./A-B-C
./A-B-C-Z
./A-C-D
./sub/A-B-C-D

You can filter the file names with perl:

$ find . -type f | perl -ne 'print if /^(?!.*Z)(?=.*A)(?=.*B)(?=.*C)(.*)$/'
./A-B-C
./sub/A-B-C-D

If you want to read the file contents to test for a pattern (like grep), you can do:

$ find . -type f | xargs perl -ne 'print "$ARGV: $&\n" if /^(?!.*Z)(?=.*A)(?=.*B)(?=.*C)(.*)$/'
./1.txt: A B C     # should match
./2.txt: A,B,C,D    # should match

Here I put four files in a directory (1.txt .. 4.txt); only 1.txt and 2.txt contain text that matches.

While it requires a lot of grep invocations, you can just write it out with find and grep in a simple and POSIX-compliant way:

find . -type f \
  -exec grep -q "stringA" {} \; \
  -exec grep -q "stringB" {} \; \
  -exec grep -q "stringC" {} \; \
  -exec grep -q "stringD" {} \; \
  ! -exec grep -q "stringZ" {} \; \
  -print  # or whatever to do with matches
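
If the list of strings varies, the same idea can be wrapped in a small shell function. This is only a sketch: find_all_but is a hypothetical name, the strings are treated as fixed text rather than regexes, and file names are assumed not to contain newlines.

```shell
# Hypothetical helper: list files under DIR that contain every REQUIRED
# fixed string and do not contain EXCLUDED.
# Usage: find_all_but DIR EXCLUDED REQUIRED...
find_all_but() {
  dir=$1
  excluded=$2
  shift 2
  # Assumes file names contain no newlines.
  find "$dir" -type f | while IFS= read -r file; do
    ok=1
    for s in "$@"; do
      # -q: report via exit status only; -F: treat the string literally.
      if ! grep -qF -e "$s" "$file"; then
        ok=0
        break
      fi
    done
    # Print the file only if all required strings were found
    # and the excluded string is absent.
    if [ "$ok" -eq 1 ] && ! grep -qF -e "$excluded" "$file"; then
      printf '%s\n' "$file"
    fi
  done
}
```

For the search in the question this would be called as find_all_but . stringZ stringA stringB stringC stringD. It still runs one grep per string per file, like the find command above, but stops at the first missing string.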
