Regular expression match in Python

Question

I'm trying to extract a certain part of the text from a file. I'm having trouble making the regular expression match as few characters as possible.

Here is an example text file.

UNIQUE
sdkjbskdfb....
UNIQUE
lnasdljnkjn......
UNIQUE
*Text from here is needed*
UNIQUE2
*Text from here is needed*
UNIQUE

My best effort was this: "UNIQUE(.*?)UNIQUE2(.*?)UNIQUE"

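Unfortunately this matches the whole thing, because the match starts at the first UNIQUE instead of the third one. The lazy quantifier only makes the match end as early as possible; it does not move the starting point.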
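To make the problem concrete, here is a minimal sketch that reproduces it. The variable names are illustrative, and it assumes the file has been read into a single string and that re.DOTALL is used so that "." can cross line breaks (the sample spans several lines).

```python
import re

# The sample from the question, as one string (assumed to be read from the file).
text = (
    "UNIQUE\n"
    "sdkjbskdfb....\n"
    "UNIQUE\n"
    "lnasdljnkjn......\n"
    "UNIQUE\n"
    "*Text from here is needed*\n"
    "UNIQUE2\n"
    "*Text from here is needed*\n"
    "UNIQUE\n"
)

# re.DOTALL lets "." match newlines; without it this pattern finds nothing here.
lazy = re.search(r"UNIQUE(.*?)UNIQUE2(.*?)UNIQUE", text, re.DOTALL)

# The regex engine takes the leftmost possible start, i.e. the first UNIQUE,
# so group 1 also contains the two blocks that were not wanted.
print(lazy.start())
print(repr(lazy.group(1)))
```

Group 1 here runs from the first UNIQUE all the way to UNIQUE2, which is the behaviour described above.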

Answer 1

You need a negative lookahead:

UNIQUE((?:(?!UNIQUE).)*?)UNIQUE2(.*?)UNIQUE

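This says: find UNIQUE, followed by a string that doesn't contain UNIQUE again before you hit UNIQUE2, and so on. The (?:(?!UNIQUE).) part matches a single character only if UNIQUE does not start at that position, so the first group cannot run past another UNIQUE marker.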
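A short sketch of how this pattern might be used from Python on the sample in the question. The variable names are illustrative, and re.DOTALL is an assumption added here (the pattern above does not include it): the sample text spans several lines, and "." does not match a newline by default.

```python
import re

# The sample from the question, as one string (assumed to be read from the file).
text = (
    "UNIQUE\n"
    "sdkjbskdfb....\n"
    "UNIQUE\n"
    "lnasdljnkjn......\n"
    "UNIQUE\n"
    "*Text from here is needed*\n"
    "UNIQUE2\n"
    "*Text from here is needed*\n"
    "UNIQUE\n"
)

# Each "." in group 1 is guarded by (?!UNIQUE), so group 1 can never contain
# another UNIQUE marker. UNIQUE2 also begins with UNIQUE, so the guarded part
# stops right in front of it and the literal UNIQUE2 can then match.
pattern = re.compile(
    r"UNIQUE((?:(?!UNIQUE).)*?)UNIQUE2(.*?)UNIQUE",
    re.DOTALL,  # assumption: needed because the wanted text sits on its own lines
)

m = pattern.search(text)
first, second = (g.strip() for g in m.groups())
print(first)
print(second)
```

The match now starts at the third UNIQUE, and both groups hold only the wanted text. strip() just removes the surrounding newlines.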

Let me know if you need clarification.

Answer 1 (accepted, 2014-03-11 05:41:09)