
Regular expression match in Python

I'm trying to extract a certain part of the text from a file. I'm having trouble making the regular expression match as few characters as possible.

Here is an example text file.

UNIQUE
sdkjbskdfb....
UNIQUE
lnasdljnkjn......
UNIQUE
*Text from here is needed*
UNIQUE2
*Text from here is needed*
UNIQUE

My best effort was this: "UNIQUE(.*?)UNIQUE2(.*?)UNIQUE"

Unfortunately this matches the whole thing because it uses the first UNIQUE value instead of the third one.
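
The behaviour can be reproduced with a minimal Python sketch. The variable names and the use of re.DOTALL are assumptions here, since the post does not show the surrounding code; re.DOTALL is needed for . to cross line breaks in the sample text.

```python
import re

# Sample text from the question.
text = """UNIQUE
sdkjbskdfb....
UNIQUE
lnasdljnkjn......
UNIQUE
*Text from here is needed*
UNIQUE2
*Text from here is needed*
UNIQUE
"""

# The lazy attempt: .*? is lazy, but the regex engine still starts at the
# leftmost position where a match is possible, which is the first UNIQUE.
lazy = re.compile(r"UNIQUE(.*?)UNIQUE2(.*?)UNIQUE", re.DOTALL)
m = lazy.search(text)

lazy_start = m.start()      # 0: the match begins at the very first UNIQUE
lazy_group1 = m.group(1)    # includes the unwanted lines and both extra UNIQUE markers
```

Laziness only shortens the match from its right-hand end; it does not move the starting point forward.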

You need a negative lookahead:

UNIQUE((?:(?!UNIQUE).)*?)UNIQUE2(.*?)UNIQUE

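This says: find UNIQUE followed by some string that doesn't contain UNIQUE again before you hit UNIQUE2, and so on. The lookahead (?!UNIQUE) is checked before every character consumed by the first group, so that group can never run across another UNIQUE marker. Because the sample text spans several lines, the pattern also needs re.DOTALL in Python so that . matches newlines.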
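
As a rough sketch of how this could be used from Python, assuming the file contents are already in a string (the variable names below are illustrative, not from the original post):

```python
import re

# Sample text from the question.
text = """UNIQUE
sdkjbskdfb....
UNIQUE
lnasdljnkjn......
UNIQUE
*Text from here is needed*
UNIQUE2
*Text from here is needed*
UNIQUE
"""

# (?:(?!UNIQUE).)*? consumes one character at a time, but only where the
# text at that position does not start with UNIQUE. re.DOTALL lets . match
# newlines, since the wanted text sits on its own lines.
pattern = re.compile(
    r"UNIQUE((?:(?!UNIQUE).)*?)UNIQUE2(.*?)UNIQUE",
    re.DOTALL,
)

m = pattern.search(text)
first, second = (g.strip() for g in m.groups())

print(first)   # *Text from here is needed*
print(second)  # *Text from here is needed*
```

The match now starts at the third UNIQUE, because the first two are followed by another UNIQUE before any UNIQUE2 appears.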

Let me know if you need clarification.
