Getting only the full match when checking multiple regex patterns in Python

Question

I want to match several expressions or words in a text, as shown below:

patterns = [r'(\bbmw\w*\b)', # bmw
            r'(\bopel\w?\b)', # opel
            r'(\btoyota\w?\b\s+(\w+\s+){0,2}(\bcorolla\w?\b\s+\bdiesel\w?\b))' # toyota corolla
            ]

# assume here that I am dealing with hundreds of regex coming from different coders.

text = 'there is a bmw and also an opel and also this span with toyota the nice corolla diesel'

import re

def checkPatternInText(text, patterns):

    total_matches = []

    for pattern in patterns:
        matches = re.findall(pattern, text)
        if len(matches) > 0:
            print(type(matches))
            # findall yields strings for a single group, tuples for several groups
            if isinstance(matches[0], str):
                total_matches.append(matches[0])
            else:
                total_matches.append(matches[0][0])
        print(matches)

    return total_matches

result = checkPatternInText(text, patterns)

The result of this approach is:

['bmw', 'opel', 'toyota the nice corolla diesel']

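I check the type of the match because re.findall returns a string when the pattern has a single group, and a tuple containing all the groups when the pattern has several groups. From that tuple I want the longest element, which is the first one, hence matches[0][0].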
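
The behaviour described above can be reproduced in isolation. The following is a minimal sketch (the sample text and the three patterns are made up for illustration) showing how the shape of the `re.findall` result depends only on the number of capturing groups in the pattern:

```python
import re

text = 'a bmw and an opel'

# No capturing group: list of full-match strings
no_group = re.findall(r'\bbmw\b', text)

# Exactly one capturing group: list of strings holding that group
one_group = re.findall(r'\b(bmw)\b', text)

# Two or more capturing groups: list of tuples, one item per group
two_groups = re.findall(r'\b((b)mw)\b', text)

print(no_group)    # ['bmw']
print(one_group)   # ['bmw']
print(two_groups)  # [('bmw', 'b')]
```

This is why the type check in the function above is needed as long as `re.findall` is used on patterns with varying numbers of groups.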

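Is there a more elegant way to do this without checking the variable type of the match?

As a second question: I had to add () around every pattern to be able to access group 0, meaning THE WHOLE MATCH. How would you proceed if a pattern has no () around it?

It has been suggested that this question already has an answer here: re.findall behaves weird

The situation is not quite the same, because here I have a collection of patterns; some may be wrapped in () and some not, some may contain groups and some not. I am looking for a more robust solution than the one I came up with. When you deal with a single pattern you can always resort to modifying the pattern (as a last resort); when you deal with a collection of patterns, a more general solution may be needed.

A solution that builds one regex out of the three cases does not apply. The real case has about 100 different regexes, and more are being added all the time.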

Answer 1

You can do this with a single regex in re.findall by using alternation:

\b(?:bmw|opel|toyota\s+(?:\w+\s+){0,2}corolla\s+diesel)\b

RegEx Demo

Code:

>>> import re
>>> text = 'there is a bmw and also an opel and also this span with toyota the nice corolla diesel'
>>> print (re.findall(r'\b(?:bmw|opel|toyota\s+(?:\w+\s+){0,2}corolla\s+diesel)\b', text))
['bmw', 'opel', 'toyota the nice corolla diesel']
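
If the alternatives are kept in a list rather than written out by hand, the same alternation could be assembled programmatically. The sketch below is an assumption-laden variant, not part of the original answer: the list name `alternatives` and the choice of `re.finditer` with `group(0)` are illustrative. Reading `group(0)` means the result is the entire match even when an alternative contains capturing groups.

```python
import re

# Hypothetical list of alternatives; the third one deliberately
# contains a capturing group to show that group(0) is unaffected by it.
alternatives = [
    r'bmw',
    r'opel',
    r'toyota\s+(\w+\s+){0,2}corolla\s+diesel',
]

# Wrap each alternative in its own non-capturing group so that '|'
# inside one alternative cannot leak into its neighbours.
combined = re.compile(
    r'\b(?:' + '|'.join(f'(?:{alt})' for alt in alternatives) + r')\b'
)

text = 'there is a bmw and also an opel and also this span with toyota the nice corolla diesel'

# group(0) is always the whole match, whatever groups the pattern has
matches = [m.group(0) for m in combined.finditer(text)]
print(matches)  # ['bmw', 'opel', 'toyota the nice corolla diesel']
```

One caveat with this kind of joining: numbered backreferences such as `\1` inside an individual alternative would refer to different groups once the alternatives are combined.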

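RegEx Details:

\b : word boundary
(?: : start of non-capturing group
- bmw : match bmw
- | : OR
- opel : match opel
- | : OR
- toyota\s+(?:\w+\s+){0,2}corolla\s+diesel : match toyota, then up to two intervening words, then corolla diesel
) : end of non-capturing group
\b : word boundary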
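
When the patterns come from many authors and cannot easily be merged, as the question describes, another option is to keep them in a list and read group 0 of each match object, which is always the entire match regardless of how many capturing groups a pattern has. This is a minimal sketch rather than the answer's own method; the helper name `first_full_matches` is hypothetical, and the patterns are the question's with the outer parentheses removed from two of them.

```python
import re

patterns = [
    r'\bbmw\w*\b',     # no capturing group
    r'(\bopel\w?\b)',  # one capturing group
    # two capturing groups, no parentheses around the whole pattern
    r'\btoyota\w?\b\s+(\w+\s+){0,2}(\bcorolla\w?\b\s+\bdiesel\w?\b)',
]

text = 'there is a bmw and also an opel and also this span with toyota the nice corolla diesel'

def first_full_matches(text, patterns):
    """Return the first full match of every pattern that matches (hypothetical helper)."""
    results = []
    for pattern in patterns:
        m = re.search(pattern, text)
        if m is not None:
            # group(0) is the whole match; no type check is needed
            results.append(m.group(0))
    return results

print(first_full_matches(text, patterns))
# ['bmw', 'opel', 'toyota the nice corolla diesel']
```

To collect every occurrence instead of only the first, `re.search` could be swapped for `re.finditer`, taking `group(0)` of each match in the same way.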

Solution 1
1 2020-10-29 09:39:17
