
Bash: using grep, sed, or awk to extract a section of text and then match

I have a text file and want to extract every interface whose section contains "blue":


random text random text random text 
random text random text 

int 1
    random text
    blue
    random text
    random text
int 2
    random text
    random text
    red
    random text
int 3
    random text
    random text
    random text
    blue
    random text
    random text
int 4
    blue
    random text
int n
    random text
    value
    random text

random text random text random text 
random text random text

Wanted output:

int 1
    blue
int 3
    blue
int 4
    blue
int n
    blue

(Note that int 2 contains "red" and is therefore not displayed.)

I've tried grep "int " -A n file.txt | grep "blue", but that only displays the lines matching "blue". I also want to show the lines matching "int ". In addition, the section length varies, so using -A n hasn't been useful.

An awk solution could be the following:

awk '/^int/{interface = $0} /blue/{print interface; print $0}' input.txt

It always stores the most recently seen interface line. Whenever a line containing blue is found, it prints the stored interface followed by that line.
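Because the one-liner above prints the stored interface every time blue matches, a section with several blue lines repeats its header. The following is a minimal sketch of one way to print each header only once; the function name extract_blue, the variable names, and the sample file are hypothetical and not part of the original answer.

```shell
# Hypothetical helper: print each "int" header once, followed by every
# "blue" line found in that section.
extract_blue() {
    awk '
    # New section: remember the header and reset the "already printed" flag.
    /^int/ { iface = $0; printed = 0; next }
    # First blue of a section prints the header, then every blue is printed.
    /blue/ { if (!printed) { print iface; printed = 1 } print }
    ' "$1"
}

# Small sample input (assumed data, modelled on the question).
cat > sample_awk.txt <<'EOF'
int 1
    random text
    blue
    blue
int 2
    red
int 3
    blue
EOF

extract_blue sample_awk.txt
```

This sketch assumes every blue line comes after some int line; a blue line before the first header would print an empty header line.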

Another solution, this time with sed. It also works when a section contains several blue lines:

sed -n '/^int/{x;/blue/{p;d}};/blue/H' file

Input

random text random text random text
random text random text

int 1
    random text
    blue
    blue
    random text
    random text
int 2
    random text
    random text
    red
    random text
int 3
    random text
    random text
    random text
    blue
    random text
    random text
int 4
    blue
    blue
    blue
    blue
    blue
    random text
int n
    random text
    value
    random text

random text random text random text
random text random text

Output

int 1
    blue
    blue
int 3
    blue
int 4
    blue
    blue
    blue
    blue
    blue
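One caveat with the hold-space command above: a section is printed only when the next int line arrives, so a final section that contains blue is never flushed. In the sample input this does not show, because the last section (int n) has no blue line. The following is a sketch of one way to add an end-of-input flush; the function name blue_sections and the sample file are hypothetical.

```shell
# Hypothetical helper: print each "int" section header followed by its
# "blue" lines, including the last section of the file.
blue_sections() {
    sed -n '
# New header: swap in the section collected so far, print it if it
# gathered a blue line, then start the next cycle with the new header
# stored in the hold space.
/^int/ {
    x
    /blue/ p
    d
}
# Append each blue line to the section kept in the hold space.
/blue/ H
# Last input line: flush the section that is still being held.
$ {
    x
    /blue/ p
}
' "$1"
}

# Small sample input (assumed data) whose last section contains blue.
cat > sample_sed.txt <<'EOF'
int 1
    blue
    blue
int 2
    red
int 3
    random
    blue
    random text
EOF

blue_sections sample_sed.txt
```

As with the original command, this assumes that no blue line appears before the first int line.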

One possible GNU sed solution:

sed -n '/^int\|blue/p' file | sed -r ':a; N; $! ba; s/int \w*\n(int)/\1/g; s/int \w*$//' 

Output:

int 1
    blue
int 3
    blue
int 4
    blue

Another sed solution, using the hold buffer:

sed '/^int/ h
     /^[[:space:]]*blue/ {x;G;p;}
     d
     ' YourFile
  • Assumes there is one blue line per paragraph, and that no random text line looks like an int or blue line.
  • A one-liner is possible (but less explicit).

Constraint added after posting:

  • Every paragraph starts with an int line; there are no other headers (such as ext 1, ...).

Explanation:

  • When an int line occurs, keep it in the hold buffer (h).
  • When a blue line occurs, exchange the two buffers and append the hold buffer to the pattern space, so that the header comes before the blue line, then print the result: {x;G;p;}. Other sequences give the same output, such as H;x;p or H;g;p, depending on what else you need. This version destroys the stored header, but the header could be preserved by using an s///.
  • Delete the pattern space (d), so that nothing else is printed and the cycle moves on to the next line.
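The header-preserving variant hinted at above could look like the following sketch. It is one interpretation, not the answer's own code: after printing, s/// strips the blue line again and h puts the header back into the hold buffer, so every blue line in a paragraph is printed with its header. The function name blue_with_header and the sample file are hypothetical.

```shell
# Hypothetical helper: print the "int" header before every "blue" line,
# keeping the header in the hold buffer between matches.
blue_with_header() {
    sed '
# Header line: copy it into the hold buffer.
/^int/ h
/^[[:space:]]*blue/ {
    # Swap buffers, then append the blue line after the header and print.
    x
    G
    p
    # Strip the blue line again and store the bare header back.
    s/\n.*//
    h
}
# Delete the pattern space so nothing else is printed.
d
' "$1"
}

# Small sample input (assumed data) with two blue lines in one paragraph.
cat > sample_hold.txt <<'EOF'
int 1
    blue
    blue
int 2
    red
int 3
    blue
EOF

blue_with_header sample_hold.txt
```

With this sketch, a paragraph containing two blue lines prints its header twice, once before each blue line, which matches the behaviour of the awk one-liner.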
