Regular expression matching in Python

Question

I'm trying to extract a certain part of the text from a file. I'm having trouble making the regular expression match as few characters as possible.

Here is an example text file.

UNIQUE
sdkjbskdfb....
UNIQUE
lnasdljnkjn......
UNIQUE
*Text from here is needed*
UNIQUE2
*Text from here is needed*
UNIQUE

My best effort was this: "UNIQUE(.*?)UNIQUE2(.*?)UNIQUE"

Unfortunately this matches the whole thing, because the match starts at the first UNIQUE instead of the third one.
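
A minimal sketch that reproduces the behaviour described above. It assumes the file contents are already in a string (called text here, a name chosen for illustration) and that re.DOTALL is used so that "." can cross line breaks, which the original attempt does not state:

```python
import re

# Sample input copied from the example file, held in a string for illustration.
text = """UNIQUE
sdkjbskdfb....
UNIQUE
lnasdljnkjn......
UNIQUE
*Text from here is needed*
UNIQUE2
*Text from here is needed*
UNIQUE
"""

# Assumption: re.DOTALL, because the markers sit on separate lines.
lazy = re.compile(r"UNIQUE(.*?)UNIQUE2(.*?)UNIQUE", re.DOTALL)

match = lazy.search(text)
first_group = match.group(1)

# The regex engine tries the leftmost start position first, so the match
# begins at the first UNIQUE and group 1 also contains the unwanted lines.
print(match.start())
print(repr(first_group))
```

The lazy quantifier only makes each group as short as possible from a given starting point; it does not move the starting point forward to the third UNIQUE.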

Answer 1

You need a negative lookahead:

UNIQUE((?:(?!UNIQUE).)*?)UNIQUE2(.*?)UNIQUE

This says: find UNIQUE, followed by a string that does not contain UNIQUE again before you hit UNIQUE2, and so on. The (?:(?!UNIQUE).)*? part consumes one character at a time, and only at positions where UNIQUE does not begin.

Let me know if you need clarification.
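
As a rough check of the pattern above, here is a short sketch run against the sample text from the question. The variable names (text, pattern, before, after) are illustrative, and re.DOTALL is an assumption added so that "." also matches the newlines between the marker lines:

```python
import re

# Sample input copied from the question, held in a string for illustration.
text = """UNIQUE
sdkjbskdfb....
UNIQUE
lnasdljnkjn......
UNIQUE
*Text from here is needed*
UNIQUE2
*Text from here is needed*
UNIQUE
"""

# (?:(?!UNIQUE).)*? advances one character at a time, refusing to step onto
# a position where "UNIQUE" begins, so group 1 cannot span an earlier marker.
# Assumption: re.DOTALL, because the wanted text sits on its own lines.
pattern = re.compile(
    r"UNIQUE((?:(?!UNIQUE).)*?)UNIQUE2(.*?)UNIQUE",
    re.DOTALL,
)

match = pattern.search(text)

# Strip the surrounding newlines captured along with each piece of text.
before, after = (part.strip() for part in match.groups())
print(before)
print(after)
```

With this input the match begins at the third UNIQUE, since the attempts starting at the first two run into another UNIQUE before reaching UNIQUE2 and fail.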

Answer 1: 1 accepted, 2014-03-11 05:41:09
