
Bash, Using grep, sed or awk to extract section of text and then match

I have a text file and want to extract all interfaces matching "blue":


random text random text random text 
random text random text 

int 1
    random text
    blue
    random text
    random text
int 2
    random text
    random text
    red
    random text
int 3
    random text
    random text
    random text
    blue
    random text
    random text
int 4
    blue
    random text
int n
    random text
    value
    random text

random text random text random text 
random text random text

Wanted output:

int 1
    blue
int 3
    blue
int 4
    blue
int n
    blue

(Notice that int 2 is "red" and is therefore not displayed.)

I've tried grep "int " -A n file.txt | grep "blue", but that only displays the lines matching "blue". I also want to show the lines matching "int ". In addition, the section length can vary, so using -A n hasn't been useful.

An awk solution could be the following:

awk '/^int/{interface = $0} /blue/{print interface; print $0}' input.txt

It always saves the most recently seen interface line. Whenever a line containing blue is found, it prints the stored interface line followed by the line containing blue.
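Because the command above prints the stored interface line on every match, a section with several blue lines would repeat its header. A minimal sketch of a variant that prints each header only once is shown here; the function name extract_blue and the variables header and printed are hypothetical, chosen only for illustration, and blue lines that appear before the first int line are deliberately skipped.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each "int" header once, followed by every
# line of that section that contains "blue".
extract_blue() {
  awk '
    # Remember the latest header and reset the per-section flag.
    /^int/ { header = $0; printed = 0; next }
    # Ignore matches that occur before any header has been seen.
    /blue/ && header != "" {
      if (!printed) { print header; printed = 1 }
      print
    }
  ' "$@"
}
```

It can be called as extract_blue input.txt, or used at the end of a pipeline, since awk reads standard input when no file is given.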

Another sed solution

It also works when a section contains several blue lines:

sed -n '/^int/{x;/blue/{p;d}};/blue/H' file

Input

random text random text random text
random text random text

int 1
    random text
    blue
    blue
    random text
    random text
int 2
    random text
    random text
    red
    random text
int 3
    random text
    random text
    random text
    blue
    random text
    random text
int 4
    blue
    blue
    blue
    blue
    blue
    random text
int n
    random text
    value
    random text

random text random text random text
random text random text

Output

int 1
    blue
    blue
int 3
    blue
int 4
    blue
    blue
    blue
    blue
    blue
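One thing to be aware of: the sed command above prints a collected section only when the next int line is read, so a final section that contains blue would not be printed. A sketch of a variant that also flushes the hold space on the last line could look like this. The function name extract_blue_sed is hypothetical, and the sketch assumes that no line before the first int line contains blue.

```shell
#!/usr/bin/env bash
# Hypothetical helper: same idea as the one-liner above, plus a flush
# of the hold space when the last input line is reached.
extract_blue_sed() {
  sed -n '
    # New header: swap it into the hold space and print the finished
    # section if it collected at least one blue line.
    /^int/ { x; /blue/p; d; }
    # Collect matching lines behind the current header.
    /blue/ H
    # Last line: print the final section if it contains blue.
    $ { x; /blue/p; }
  ' "$@"
}
```

It can be called as extract_blue_sed file.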

One possible GNU sed solution:

sed -n '/^int\|blue/p' file | sed -r ':a; N; $! ba; s/int \w*\n(int)/\1/g; s/int \w*$//' 

Output

int 1
    blue
int 3
    blue
int 4
    blue

sed '/^int/ h
     /^[[:space:]]*blue/ {x;G;p;}
     d
     ' YourFile
  • This assumes there is one blue line per section and that no random-text line starts with int or matches the blue pattern.
  • A one-liner is possible (but less explicit).
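As an illustration of the one-line form, the same three commands can be passed as separate -e expressions. This is only a sketch under the same assumptions as the script above; the function name extract_blue_oneliner is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: the multi-line sed script written on one line.
# h stores the header, {x;G;p;} prints header + blue line, d drops the rest.
extract_blue_oneliner() {
  sed -e '/^int/h' -e '/^[[:space:]]*blue/{x;G;p;}' -e 'd' "$@"
}
```

It can be called as extract_blue_oneliner YourFile.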

Constraint added after posting:

  • All sections start with int; there are no other section headers (such as ext 1, ...).

Explanation:

  • When an int line occurs, keep it in the hold buffer (h).
  • When a blue line occurs, exchange the pattern and hold buffers and then append the hold buffer to the pattern buffer, so that the header comes first and the blue line second, and print the result: {x;G;p;}. Other command sequences such as H;x;p or H;g;p give the same output, depending on what else you need. In this version the stored header is lost after printing (the hold buffer ends up containing the blue line), but it could be preserved by using s/// instead.
  • Delete the pattern buffer (d): nothing is printed automatically and sed moves on to the next line.
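The header-preserving alternative mentioned above could be sketched as follows. This is an assumption about what such a variant might look like, not the answer's own code: it relies on GNU sed accepting \n in the replacement of s///, and the function name extract_blue_keep_header is hypothetical. Because the header stays in the hold space, every blue line in a section is printed together with its header.

```shell
#!/usr/bin/env bash
# Hypothetical helper (GNU sed assumed): print header + blue line while
# leaving the header untouched in the hold space.
extract_blue_keep_header() {
  sed -n '
    # Keep the most recent int line in the hold space.
    /^int/ h
    /^[[:space:]]*blue/ {
      # Append the header after the blue line, then swap the two halves.
      G
      s/^\(.*\)\n\(.*\)$/\2\n\1/
      p
    }
  ' "$@"
}
```

It can be called as extract_blue_keep_header YourFile.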
