
regex command line linux - select all lines between two strings

I have a text file with contents like this:

here is some super text:
  this is text that should
  be selected with a cool match
And this is how it all ends
blah blah...

I am trying to get the two lines (there could be more or fewer) between:

some super text:

and

And this is how

I am using grep on an Ubuntu machine, and a lot of the patterns I've found seem to be specific to particular regex engines.

So I should end up with something like this:

grep "my regex goes here" myFileNameHere

I'm not sure whether egrep is needed, but I could use that just as easily.

You can use addresses in sed:

sed -e '/some super text/,/And this is how/!d' file

!d deletes every line that is not in the range, so only the lines from the start match through the end match (inclusive) are printed.
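
As a quick check of the range address, here is a minimal sketch that recreates the sample file from the question in a temporary file (the variable names are only illustrative) and compares the !d form with the equivalent -n ... p form. Both should keep the two marker lines and everything between them.

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
here is some super text:
  this is text that should
  be selected with a cool match
And this is how it all ends
blah blah...
EOF

# Form used above: delete every line that is NOT inside the range.
with_not_d=$(sed -e '/some super text/,/And this is how/!d' "$sample")

# Equivalent form: suppress automatic printing (-n), print only the range.
with_n_p=$(sed -n -e '/some super text/,/And this is how/p' "$sample")

printf '%s\n' "$with_not_d"
rm -f "$sample"
```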

To exclude the border lines, you must be more clever:

sed -n -e '/some super text/ {n;b c}; d;:c {/And this is how/ {d};p;n;b c}' file
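
If the branching version is hard to read, one possible alternative (a sketch, not the answer's original command) keeps the range address and simply skips the two marker lines inside it. It assumes the marker patterns do not match any line you want to keep.

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
here is some super text:
  this is text that should
  be selected with a cool match
And this is how it all ends
blah blah...
EOF

# Inside the range, print a line only if it matches neither marker.
inner=$(sed -n '/some super text/,/And this is how/{/some super text/!{/And this is how/!p;};}' "$sample")

printf '%s\n' "$inner"
rm -f "$sample"
```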

Or, similarly, in Perl:

perl -ne 'print if /some super text/ .. /And this is how/' file

To exclude the border lines again, change it to

perl -ne '$in = /some super text/ .. /And this is how/; print if $in > 1 and $in !~ /E/' file

I don't see how it could be done in grep. Using awk:

awk '/^And this is how/ {p=0}; p; /some super text:$/ {p=1}' file
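
The order of the three rules is what drops the marker lines: the end rule clears the flag before the print rule runs, and the start rule sets it only after. Below is a small sketch that wraps the same idea in a shell function; the function name between and the awk variable names are hypothetical, chosen just for this example.

```shell
# between START_REGEX END_REGEX FILE
# Print the lines strictly between a line matching START_REGEX and the
# next line matching END_REGEX (both marker lines are excluded).
between() {
    awk -v start_re="$1" -v end_re="$2" '
        $0 ~ end_re   { p = 0 }  # clear the flag first, so the end marker is not printed
        p                        # print the line while the flag is set
        $0 ~ start_re { p = 1 }  # set the flag last, so the start marker is not printed
    ' "$3"
}

# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
here is some super text:
  this is text that should
  be selected with a cool match
And this is how it all ends
blah blah...
EOF

inner=$(between 'some super text:$' '^And this is how' "$sample")
printf '%s\n' "$inner"
rm -f "$sample"
```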

Try pcregrep instead of plain grep. By default, plain grep matches one line at a time, so it cannot return a match that spans several lines.

$ pcregrep -M -o '(?s)some super text:[^\n]*\n\K.*?(?=\n[^\n]*And this is how)' file
  this is text that should
  be selected with a cool match
  • (?s) is the dotall modifier; it lets the dot match newline characters as well.
  • \K discards everything matched so far, so only the text after it is reported.

From pcregrep --help

-M, --multiline              run in multiline mode
-o, --only-matching=n        show only the part of the line that matched
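
If pcregrep is not installed, the same pattern may also work with GNU grep, assuming it was built with PCRE support (-P). This is a sketch under that assumption: -z makes grep treat the whole file as a single NUL-terminated record, so the match can span newlines, and tr turns the NUL that -o emits after the match back into a newline.

```shell
# Recreate the sample file from the question in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
here is some super text:
  this is text that should
  be selected with a cool match
And this is how it all ends
blah blah...
EOF

# -P: Perl-compatible regex, -z: NUL-separated records, -o: only the match.
inner=$(grep -Pzo '(?s)some super text:[^\n]*\n\K.*?(?=\n[^\n]*And this is how)' "$sample" | tr '\0' '\n')

printf '%s\n' "$inner"
rm -f "$sample"
```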

TL;DR

With your corpus, another way to solve the problem is by matching lines with leading whitespace, rather than using a flip-flop operator of some sort to match start and end lines. The following solutions work with your posted example.

GNU Grep with PCRE Compiled In

$ grep -Po '^\s+\K.*' /tmp/corpus 
this is text that should
be selected with a cool match

Alternative: Use pcregrep Instead

$ pcregrep -o '^\s+\K.*' /tmp/corpus 
this is text that should
be selected with a cool match
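
Where neither PCRE-enabled grep nor pcregrep is available, a roughly equivalent sketch with sed strips the leading whitespace and prints only the lines that had some. The last line of the sample below is a hypothetical addition, not part of the posted corpus; it shows the main caveat of this approach, namely that any indented line is selected whether or not it lies between the two markers.

```shell
# Sample file from the question plus one extra (hypothetical) indented line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
here is some super text:
  this is text that should
  be selected with a cool match
And this is how it all ends
blah blah...
  an indented line outside the markers
EOF

# Remove leading whitespace; p prints only lines where the substitution happened.
stripped=$(sed -n 's/^[[:space:]]\{1,\}//p' "$sample")

printf '%s\n' "$stripped"
rm -f "$sample"
```

Unlike grep -o, this also prints an empty line for input lines that contain only whitespace.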


 