
Return substring of all lines that start with [+] between two specific lines

I have a sample multi-line string that looks like this:

[+] x: somerandomstuff
[!] blah
[+] x: somemorerandomstuff
[-] blah
[+] START
[+] x: 1st group to match
[!] blah
[-] blah
[+] x: 2nd group to match
[+] END

I want to match the strings after the x: in lines that look like [+] x: (...), but only those that are between [+] START and [+] END. The expected result would be two groups (there could be more):

1st group to match
2nd group to match

Note that there will only be one instance of START/END.

I've only managed to come up with something that matches the first group:

\[\+\] START.*?\[\+\] x: (.*?)\n.*\[\+\] END

I currently lack the knowledge to extend this regex to match the other lines. I'm not sure how to match multiple lines that fit one pattern when they sit between two other patterns ([+] START and [+] END).

REGEX101 Link: https://regex101.com/r/kCgwhr/2

Note: I know that a regex-only solution may not be the best approach here, but I would like to solve this with regex alone.

I assume you are using a PCRE-compatible engine, since your regex101 link is in PCRE mode.

You can use \G for continuous matching, together with a couple of negative lookaheads, to match what you want:

(?:\[\+\] START|\G(?!\A))\R(?:(?!\[\+\] x:)(?!\[\+\] END).*\R)*\[\+\] x:\s*\K.*

This matches:

  • (?:\[\+\] START|\G(?!\A)) - either the start sequence or the position right after the previous match. On the first attempt \G matches at the start of the string, so (?!\A) rules that case out and \G can only apply once a match has been found.
  • \R - any newline sequence
  • (?:(?!\[\+\] x:)(?!\[\+\] END).*\R)* - any number of lines that start with neither the sequence we want to match nor the end sequence (this skips over them)
  • \[\+\] x:\s* - the prefix of the line we want to match
  • \K - discards everything matched so far from the reported match (so we only get what we really want)
  • .* - the content of the wanted line

See it working on regex101.
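If your engine has no \G, \K or \R (Python's built-in re module is one example), the same walk can be emulated by hand. The following is only a rough sketch under that assumption, not a drop-in replacement for the PCRE pattern: Pattern.match(text, pos) anchors each attempt at the end of the previous one, which plays the role of \G, a capture group replaces \K, and \n replaces \R. The names TEXT, START, STEP and extract are made up for this illustration.

```python
import re

# Sample input from the question.
TEXT = """\
[+] x: somerandomstuff
[!] blah
[+] x: somemorerandomstuff
[-] blah
[+] START
[+] x: 1st group to match
[!] blah
[-] blah
[+] x: 2nd group to match
[+] END
"""

# The start sequence (first alternative of the PCRE pattern).
START = re.compile(r"\[\+\] START")

# One "step" of the PCRE pattern: a newline, any lines that start with
# neither "[+] x:" nor "[+] END", then a "[+] x:" line whose content is
# captured. \n stands in for \R, the capture group stands in for \K.
STEP = re.compile(r"\n(?:(?!\[\+\] x:)(?!\[\+\] END).*\n)*\[\+\] x:\s*(.*)")


def extract(text):
    """Return the x: values found between [+] START and [+] END."""
    start = START.search(text)
    if start is None:
        return []
    results = []
    pos = start.end()
    while True:
        # match() anchored at pos behaves like \G: resume exactly where
        # the previous match ended, or give up.
        m = STEP.match(text, pos)
        if m is None:
            break
        results.append(m.group(1))
        pos = m.end()
    return results


print(extract(TEXT))
```

The loop stops as soon as the next line is [+] END, because the skip group refuses to consume that line and the [+] x: prefix cannot match it. Like the pattern above, this relies on there being a single START/END pair.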

