
grep/awk: exclude words conditionally

I have data formatted like this:

a cat
a dog
brown cat
brown dog
brown cow
brown sheep
brown fish

I want to filter out all of the lines starting with "brown", except "brown dog". Is there an easy way to do this with grep or awk? I tried to use caret negation like so:

grep -v "brown ^\(dog\)" corpus.txt

... but that didn't work. Any ideas would be greatly appreciated.

Eventually I want the output to be like this:

a cat
a dog
brown dog
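
For what it's worth, the grep attempt in the question fails because a caret only negates inside a bracket expression such as [^d]. Elsewhere it is a start-of-line anchor or, in the middle of a basic regular expression, an ordinary character; either way it never means "not the following word". A minimal sketch, assuming the seven sample lines from the question (the variable names are only for illustration):

```shell
# Sample data from the question, held in a variable instead of corpus.txt.
sample=$(printf '%s\n' 'a cat' 'a dog' 'brown cat' 'brown dog' 'brown cow' 'brown sheep' 'brown fish')

# The pattern never matches any sample line, so -v (invert match)
# passes every line through unchanged.
attempt=$(printf '%s\n' "$sample" | grep -v "brown ^\(dog\)")

printf '%s\n' "$attempt"
```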

Using awk:

awk '/^brown dog/ || !/^brown/' file
a cat
a dog
brown dog
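
The condition reads: print a line if it starts with "brown dog", or if it does not start with "brown" at all. If the words need to vary, one possible sketch passes them in as awk variables and compares prefixes with index() instead of regular expressions. The names keep and drop are made up for this illustration:

```shell
sample=$(printf '%s\n' 'a cat' 'a dog' 'brown cat' 'brown dog' 'brown cow' 'brown sheep' 'brown fish')

# index(s, t) returns the position of t in s (1 means "s starts with t").
# Print the line if it starts with keep, or does not start with drop.
filtered=$(printf '%s\n' "$sample" |
  awk -v keep='brown dog' -v drop='brown' 'index($0, keep) == 1 || index($0, drop) != 1')

printf '%s\n' "$filtered"
```

Because the comparison is on plain strings, characters that are special in a regex (dots, brackets) would need no escaping in keep or drop.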

Just as an academic exercise, here is a grep command that does not need the experimental PCRE option:

grep -vE '^brown($|[^ ]| ([^d]|$)| d([^o]|$)| do([^g]|$))' file
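
Each alternative in that pattern rejects one way a line can start with "brown" and then fail to continue with " dog". A rough check, assuming a few made-up edge-case lines beyond the question's data, compares it with the awk filter /^brown dog/ || !/^brown/ from this thread:

```shell
# Question data plus a few hypothetical edge cases.
sample=$(printf '%s\n' 'a cat' 'a dog' 'brown cat' 'brown dog' 'brown cow' \
  'brown' 'brown do' 'brown dogs' 'brownie')

# ERE-only filter: drop lines that start with "brown" but not with "brown dog".
ere_out=$(printf '%s\n' "$sample" |
  grep -vE '^brown($|[^ ]| ([^d]|$)| d([^o]|$)| do([^g]|$))')

# The awk filter, for comparison.
awk_out=$(printf '%s\n' "$sample" | awk '/^brown dog/ || !/^brown/')

printf '%s\n' "$ere_out"
```

Both filters also keep "brown dogs", since they only test the prefix of the line.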

Yes sir:

grep -vP '^brown (?!dog)' file
a cat
a dog
brown dog

-P tells grep to use the PCRE (Perl-compatible) regex engine, which supports the negative lookahead (?!dog).

awk '/^brown/ && !/dog$/{next} 1' file

Ok, it's past midnight over here. I'm going to post this awk:

 $ awk '!(/brown/ && !/dog/)' file

... and think it through in the morning. :D Good night.

Nope, couldn't sleep, had to solve it:

$ awk '!/^brown/ || /dog/' file

Output:

a cat
a dog
brown dog
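
The two one-liners differ only in the ^ anchor, which matters once "brown" can appear later in a line. A small sketch with one made-up extra line, "a brown cow", that is not in the question's data:

```shell
sample=$(printf '%s\n' 'a cat' 'a brown cow' 'brown cat' 'brown dog')

# Unanchored: drops every line containing "brown" unless it also contains "dog".
unanchored=$(printf '%s\n' "$sample" | awk '!(/brown/ && !/dog/)')

# Anchored: only lines that start with "brown" are candidates for removal.
anchored=$(printf '%s\n' "$sample" | awk '!/^brown/ || /dog/')

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$unanchored" "$anchored"
```

On the question's own data both give the same result; only the anchored form keeps "a brown cow".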

It's not clear if you specifically want to accept "brown dog" only, but perhaps you just want something like:

sed -e '/^brown/{/dog/!d;}'

This will delete all lines that start with "brown" unless they contain the string "dog". Or maybe you want to be stricter and do:

awk '!/^brown/ || $2 == "dog"'
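
To make the difference concrete, here is a small sketch that runs both commands on a sample containing one made-up line, "brown dogfish", which is not in the question's data:

```shell
sample=$(printf '%s\n' 'a cat' 'brown dog' 'brown dogfish' 'brown cow')

# Looser: keep any "brown" line that contains "dog" anywhere.
loose=$(printf '%s\n' "$sample" | sed -e '/^brown/{/dog/!d;}')

# Stricter: keep a "brown" line only if its second field is exactly "dog".
strict=$(printf '%s\n' "$sample" | awk '!/^brown/ || $2 == "dog"')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

The sed version lets "brown dogfish" through, while the awk version drops it because the second whitespace-separated field has to equal "dog".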

Another awk:

$ awk '!(/^brown/ && $2!="dog")' file
a cat
a dog
brown dog
