A multi-line, variedly greedy, regular expression

Question

Given the following text, what PCRE regular expression would you use to extract the parts marked in bold?



00:02314 quux
  padding
  dont want this

00:03124 foo
     neither this



00:02134 tralala
     not this

 but not this(!)

00:02134 foo bar
     and not this either

00:01234 dolor sit amet
     EOF

IOW, we want to extract sections that start, in regex terms, with "^0" and end with "(kryptonite|stalagmite)".

Been chomping on this for a bit, finding it a hard nut to crack. TIA!

Answer 1

One way to do this is a negative lookahead combined with the inline (?sm) modifiers (dotall and multi-line). The lookahead (?!^0) keeps the lazy match from running past the start of the next section.

(?sm)^0(?:(?!^0).)*?(?:kryptonite|stalagmite)
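
As a quick sanity check, the pattern can be exercised from Python, whose re module accepts the same syntax for this particular expression. The sample text is a hypothetical stand-in: the kryptonite and stalagmite lines are assumed for illustration and are not taken from the question, so read this as a sketch rather than a reference test.

```python
import re

# Hypothetical sample: the "kryptonite"/"stalagmite" lines are assumptions
# added for illustration only.
SAMPLE = (
    "00:02314 quux\n"
    "  padding\n"
    "  kryptonite\n"
    "  dont want this\n"
    "\n"
    "00:03124 foo\n"
    "     neither this\n"
    "\n"
    "00:02134 tralala\n"
    "     stalagmite\n"
    "     not this\n"
    "\n"
    "00:01234 dolor sit amet\n"
    "     EOF\n"
)

# (?s): "." also matches newlines; (?m): "^" matches at every line start.
PATTERN = re.compile(r"(?sm)^0(?:(?!^0).)*?(?:kryptonite|stalagmite)")

# No capturing groups, so findall returns the full text of each match.
matches = PATTERN.findall(SAMPLE)
for m in matches:
    print(repr(m))
```

On this sample only the first and third sections match; the sections without a keyword are skipped because the lookahead refuses to cross into the next line that starts with 0.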


Answer 2

This looks like it works.

 # (?ms)^0(?:(?!(?:^0|kryptonite|stalagmite)).)*(kryptonite|stalagmite)

 (?ms)
 ^ 0
 (?:
      (?!
           (?: ^ 0 | kryptonite | stalagmite )
      )
      . 
 )*
 ( kryptonite | stalagmite )
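
The expanded layout above maps directly onto a verbose-mode pattern. The following is a minimal sketch in Python (re.VERBOSE standing in for PCRE's x modifier), again using a hypothetical sample whose keyword lines are assumed for illustration.

```python
import re

# Hypothetical sample: the keyword lines are assumptions for illustration.
SAMPLE = (
    "00:02314 quux\n"
    "  padding\n"
    "  kryptonite\n"
    "  dont want this\n"
    "\n"
    "00:03124 foo\n"
    "     neither this\n"
    "\n"
    "00:02134 tralala\n"
    "     stalagmite\n"
    "     not this\n"
)

TEMPERED = re.compile(
    r"""
    ^ 0                       # a section starts with 0 at a line start
    (?:
        (?!                   # stop before the next section or a keyword
            (?: ^ 0 | kryptonite | stalagmite )
        )
        .                     # otherwise consume one character (newlines too)
    )*
    ( kryptonite | stalagmite )   # group 1: the terminating keyword
    """,
    re.MULTILINE | re.DOTALL | re.VERBOSE,
)

# Pair each full match with the keyword that ended it.
results = [(m.group(0), m.group(1)) for m in TEMPERED.finditer(SAMPLE)]
for full, keyword in results:
    print(repr(full), "->", keyword)
```

Because the lookahead also excludes the keywords themselves, the repetition can be greedy here: it cannot run past the first keyword in a section.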

Answer 3

I believe this will be the most efficient:

^0(?:\R(?!\R)|.)*?\b(?:kryptonite|stalagmite)\b


Obviously we start with ^0 (which needs the m modifier so that ^ matches at the start of each line) and then end with either kryptonite or stalagmite (in a non-capturing group, for the heck of it) surrounded by \b word boundaries.

(?:\R(?!\R)|.)*? is the interesting part though, so let's break it down. One key concept first is PCRE's \R newline sequence.

(?:      (?# start non-capturing group for repetition)
  \R     (?# match a newline sequence)
  (?!\R) (?# not followed by another newline)
 |       (?# OR)
  .      (?# match any character, except newline)
)*?      (?# lazily repeat this group)
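
Python's re module has no \R, so the sketch below substitutes a plain \n for it; that substitution is an assumption made only to keep the example runnable and it ignores \r\n and other newline sequences. It illustrates that this approach relies on sections being separated by a blank line. Both sample strings are hypothetical.

```python
import re

# Approximation of the pattern above with \R replaced by \n (assumption).
LAZY_NO_BLANK = re.compile(
    r"^0(?:\n(?!\n)|.)*?\b(?:kryptonite|stalagmite)\b",
    re.MULTILINE,
)

# Hypothetical input with a blank line between sections.
blank_separated = (
    "00:02314 quux\n  padding\n  kryptonite\n  dont want this\n"
    "\n"
    "00:03124 foo\n     neither this\n"
    "\n"
    "00:02134 tralala\n     stalagmite\n"
)

# Hypothetical input where two sections follow each other directly.
adjacent = "00:1 a\n  nothing here\n00:2 b\n  kryptonite\n"

separated_matches = LAZY_NO_BLANK.findall(blank_separated)
adjacent_matches = LAZY_NO_BLANK.findall(adjacent)

print(separated_matches)
print(adjacent_matches)
```

With blank-line separators the keyword-free section is skipped. Without them, the match starting at the first section runs on into the second one, so this variant fits only input laid out like the first string.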

Answer 4

^(00:.*?(kryptonite|stalagmite)) with the s modifier
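
To see how this shorter pattern behaves, here is a small Python sketch on a hypothetical sample (the keyword lines are assumed for illustration). It runs the pattern once with dotall only, as the answer describes, and once with multi-line added, which is an extra assumption and not part of the answer.

```python
import re

# Hypothetical sample: the keyword lines are assumptions for illustration.
SAMPLE = (
    "00:02314 quux\n"
    "  padding\n"
    "  kryptonite\n"
    "  dont want this\n"
    "\n"
    "00:03124 foo\n"
    "     neither this\n"
    "\n"
    "00:02134 tralala\n"
    "     stalagmite\n"
    "     not this\n"
    "\n"
    "00:01234 dolor sit amet\n"
    "     EOF\n"
)

NAIVE = r"^(00:.*?(kryptonite|stalagmite))"

# s modifier only: "^" matches solely at the very start of the text.
s_only = [m.group(1) for m in re.finditer(NAIVE, SAMPLE, re.DOTALL)]

# s and m modifiers: "^" matches at each line start, but nothing stops
# the lazy ".*?" from crossing into the following section.
s_and_m = [
    m.group(1)
    for m in re.finditer(NAIVE, SAMPLE, re.DOTALL | re.MULTILINE)
]

print(len(s_only), len(s_and_m))
```

On this sample the dotall-only run finds just the first section, and the run with multi-line added returns a second match that starts in the keyword-free section and ends at the keyword of the section after it.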

4 answers

solution1
4 ACCEPTED 2014-09-26 20:16:01

solution2
3 2014-09-26 20:23:58

solution3
2 2014-09-26 20:36:45

solution4
-1 2014-09-26 20:13:04
