Regular expression match in Python

I'm trying to extract a certain part of the text from a file. I'm having trouble making the regular expression match as few characters as possible.

Here is an example text file.

UNIQUE
sdkjbskdfb....
UNIQUE
lnasdljnkjn......
UNIQUE
*Text from here is needed*
UNIQUE2
*Text from here is needed*
UNIQUE

My best effort was this: "UNIQUE(.*?)UNIQUE2(.*?)UNIQUE"

Unfortunately this matches the whole thing, because the match starts at the first UNIQUE instead of the third one.

You need a negative lookahead:
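To make the problem concrete, here is a minimal sketch that runs the pattern from the question against the sample text. The use of `re.search` and the `re.DOTALL` flag are assumptions (the question does not show how the pattern was applied); `re.DOTALL` is needed for `.` to cross line breaks at all.

```python
import re

# Sample input copied from the question.
TEXT = """UNIQUE
sdkjbskdfb....
UNIQUE
lnasdljnkjn......
UNIQUE
*Text from here is needed*
UNIQUE2
*Text from here is needed*
UNIQUE
"""

# The pattern from the question. re.DOTALL is an assumption so that "."
# also matches newlines.
LAZY = re.compile(r"UNIQUE(.*?)UNIQUE2(.*?)UNIQUE", re.DOTALL)

match = LAZY.search(TEXT)

# The regex engine tries the leftmost starting point first. The lazy group
# then grows one character at a time until UNIQUE2 can match, so it swallows
# the two unwanted blocks along the way.
print(match.start())                    # 0
print("sdkjbskdfb" in match.group(1))   # True
```

Laziness only shortens the match from a given starting point; it does not make the engine prefer a later starting point.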

UNIQUE((?:(?!UNIQUE).)*?)UNIQUE2(.*?)UNIQUE

This says: find UNIQUE, followed by some string that does not contain UNIQUE again before you hit UNIQUE2, and so on.

Let me know if you need clarification.
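As a rough sketch of how this pattern could be used from Python: the helper name `extract_sections`, the `re.DOTALL` flag, and the `strip()` calls are assumptions added for illustration, not part of the answer. `re.DOTALL` is assumed because the sample text spans several lines.

```python
import re

# Sample input copied from the question.
TEXT = """UNIQUE
sdkjbskdfb....
UNIQUE
lnasdljnkjn......
UNIQUE
*Text from here is needed*
UNIQUE2
*Text from here is needed*
UNIQUE
"""

# (?:(?!UNIQUE).)*? consumes one character at a time, but only while the
# text at the current position does not start with "UNIQUE". The first
# group therefore cannot run across another UNIQUE marker.
PATTERN = re.compile(
    r"UNIQUE((?:(?!UNIQUE).)*?)UNIQUE2(.*?)UNIQUE",
    re.DOTALL,  # assumption: let "." match newlines
)


def extract_sections(text):
    """Hypothetical helper: return the two wanted sections, or None."""
    match = PATTERN.search(text)
    if match is None:
        return None
    # Trim the line breaks that surround each section.
    return match.group(1).strip(), match.group(2).strip()


first, second = extract_sections(TEXT)
print(first)
print(second)
```

Attempts that start at the first or second UNIQUE fail, because the first group stops at the next UNIQUE and the text there is not UNIQUE2. Only the attempt starting at the third UNIQUE succeeds.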
