Python regex: find a substring that doesn't contain a substring

Question

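Here is an example:

import re
a = "one two three four five six one three four seven two"
m = re.search("one.*four", a)

What I want is to find the substring from "one" to "four" that doesn't contain the substring "two" in between. The answer should be: m.group(0) = "one three four", m.start() = 28, m.end() = 41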

Is there any way to do this with a single search call?
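
For reference, here is a quick check of what the greedy pattern from the example returns on this string. The names `greedy_text` and `greedy_span` are only illustrative and are not part of the original question.

```python
import re

a = "one two three four five six one three four seven two"

# Greedy .* runs to the last "four", so the match starts at the first "one"
# and swallows the "two" in between.
m = re.search("one.*four", a)
greedy_text = m.group(0)
greedy_span = m.span()
print(greedy_text)
print(greedy_span)
```

The match begins at index 0 rather than 28, which is why a plain `one.*four` is not enough here.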

Answer 1

You can use this pattern:

one(?:(?!two).)*four

Before matching each additional character, we check that we are not at the start of "two".

Working example: http://regex101.com/r/yY2gG8
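
A minimal sketch of how this pattern might be used from Python, assuming the string from the question plus the longer string that appears in Answer 2. The names `pattern`, `harder` and `results` are illustrative only.

```python
import re

a = "one two three four five six one three four seven two"
harder = "one two three four five six one three four seven two four"

# (?:(?!two).)* consumes one character at a time, but only while the text
# at the current position does not start with "two".
pattern = re.compile(r"one(?:(?!two).)*four")

results = []
for text in (a, harder):
    m = pattern.search(text)
    results.append((m.group(0), m.span()))

print(results)
```

On both inputs the match should be "one three four" with span (28, 42), because the repeated group can never step across a "two".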

Answer 2

With the harder string Satoru added, this works:

>>> import re
>>> a = "one two three four five six one three four seven two"
>>> re.findall("one(?!.*two.*four).*four", a)
['one three four']

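But someday you will really regret writing tricky regexps. If this were a problem I needed to solve, I'd do it like this:

for m in re.finditer("one.*?four", a):
    if "two" not in m.group():
        break  # m is the first candidate without "two"
else:
    m = None  # no candidate qualified

Even this is tricky enough that I had to use a minimal match ( .*? ) there. Regexps can be a real pain :-(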
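
The loop above could be wrapped in a small helper, sketched here under the assumption that a `None` result is wanted when no candidate qualifies. The function name `first_match_without` and the `nested` test string are hypothetical, introduced only for illustration.

```python
import re

def first_match_without(pattern, text, forbidden):
    """Return the first match of `pattern` whose text lacks `forbidden`, else None."""
    for m in re.finditer(pattern, text):
        if forbidden not in m.group():
            return m
    return None

a = "one two three four five six one three four seven two"
m = first_match_without("one.*?four", a, "two")
print(m.group(), m.span())

# Caveat: finditer yields non-overlapping matches, so a candidate nested
# inside a rejected match is never examined.
nested = "one two one three four"
print(first_match_without("one.*?four", nested, "two"))
```

The second call returns `None` even though "one three four" is present, since the only match reported starts at the first "one". For inputs like that, a single lookahead-based pattern is the safer choice.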

EDIT: Haha! But if you make the string harder, the messy regexp at the top fails yet again:

a = "one two three four five six one three four seven two four"
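
A quick check of that failure, assuming "the regexp at the top" refers to the `one(?!.*two.*four).*four` pattern from the start of this answer. The name `lookahead_result` is illustrative.

```python
import re

a = "one two three four five six one three four seven two four"

# From both "one" positions a later "two ... four" exists, so the
# negative lookahead rejects every starting point.
lookahead_result = re.findall("one(?!.*two.*four).*four", a)
print(lookahead_result)
```

This prints an empty list: the trailing "two four" is enough to veto the second "one" as well.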

FINALLY: here is a correct solution:

>>> a = 'one two three four five six one three four seven two four'
>>> m = re.search("one([^t]|t(?!wo))*four", a)
>>> m.group()
'one three four'
>>> m.span()
(28, 42)

I know you said you wanted m.end() to be 41, but that was incorrect.
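
The reason is that `m.end()` is an exclusive index, in the same way as a slice bound. A small check using the span reported above (the names `start` and `end` are illustrative):

```python
a = "one two three four five six one three four seven two four"

start, end = 28, 42
print(repr(a[start:end]))      # the full match
print(repr(a[start:end - 1]))  # what an end of 41 would give
```

With an end of 41 the final "r" of "four" would be cut off.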

Answer 3

You can use a negative lookahead assertion (?!...):

re.findall("one(?!.*two).*four", a)
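
A quick check suggests this pattern only helps when no "two" occurs anywhere after the chosen "one". In the sketch below, `a` is the string from the question and `b` is a hypothetical variant with the trailing " seven two" removed.

```python
import re

pattern = "one(?!.*two).*four"

a = "one two three four five six one three four seven two"
b = "one two three four five six one three four"  # hypothetical: no "two" after the second "one"

result_a = re.findall(pattern, a)
result_b = re.findall(pattern, b)
print(result_a)
print(result_b)
```

On `a` the result is an empty list, because the "two" at the end of the string also rules out the second "one". On `b` the result is `['one three four']`.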

Answer 4

Another one-liner with a very simple pattern:

import re
line = "one two three four five six one three four seven two"

# Keep the words between "one" and "four" for each candidate without "two"
print([X for X in [a.split()[1:-1] for a in
                   re.findall('one.*?four', line, re.DOTALL)] if 'two' not in X])

which gives me:

>>> 
[['three']]

Answer 1  5  2013-11-03 06:10:19

Answer 2  1  2013-11-03 06:05:17

Answer 3  0  2013-11-03 05:26:17

Answer 4  0  2013-11-03 08:50:31
