Find all occurrences of a word + substrings

Question

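I have a "main" word, "LAUNCHER", and two other words, "LAUNCH" and "LAUNCHER". I want to find out (using a regular expression) which of these words occur in the "main" word. I am using re.findall with the regex "(LAUNCH)|(LAUNCHER)", but this only returns LAUNCH and not both. How do I fix this?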

import re
mainword = "launcher"
words = "(launch|launcher)"
matches = re.findall(words,mainword)
for match in matches:
  print(match)

Answer 1

You can try something like this:

import re
mainword = "launcher"
words = "(launch|launcher)"
for x in (re.findall(r"[A-Za-z@#]+|\S", words)):
    if x in mainword:
        print (x)

Result:

launch

launcher
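
If the matching step itself should stay regex-based, one option is to run a separate search for each word, so a match for one word cannot hide an overlapping match for another. This is only a minimal sketch, assuming the words are plain literals (hence `re.escape`); `find_contained` is a hypothetical helper name introduced for illustration, not part of the `re` module:

```python
import re

def find_contained(words, main_word, ignore_case=False):
    """Return the words that occur somewhere inside main_word."""
    flags = re.IGNORECASE if ignore_case else 0
    # One independent search per word, so a match for "launcher"
    # cannot consume the text needed to match "launch".
    return [w for w in words if re.search(re.escape(w), main_word, flags)]

print(find_contained(["launch", "launcher", "lunch"], "launcher"))
```

With the sample input this should keep "launch" and "launcher" and drop "lunch", since only the first two are substrings of the main word.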

Answer 2

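If you are not required to use a regular expression, this can be done more efficiently with the `in` operator and a simple loop or list comprehension: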

mainWord = "launcher"
words    = ["launch","launcher"]
matches  = [ word for word in words if word in mainWord ] 

# case insensitive...
matchWord = mainWord.lower()
matches   = [ word for word in words if word.lower() in matchWord ]

Even if you do require a regular expression, a loop will still be needed, because re.findall() never returns overlapping matches:

import re
pattern   = re.compile("launcher|launch")
mainWord  = "launcher"
matches   = []
startPos  = 0
lastMatch = None
while startPos < len(mainWord):
    if lastMatch:
        # Retry at the same start, cutting the end short to expose shorter alternatives.
        match = pattern.match(mainWord, lastMatch.start(), lastMatch.end() - 1)
    else:
        match = pattern.match(mainWord, startPos)
    if not match:
        # Nothing (more) matches at this position: advance to the next start.
        startPos += 1
        lastMatch = None
        continue
    matches.append(match.group())
    lastMatch = match

print(matches)

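Note that even with this loop, the longer words need to appear before the shorter ones if you use the | operator in the regular expression. This is because | is never greedy: it matches the first alternative that succeeds, not the longest one.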
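
The ordering effect can be checked directly. The sketch below is an illustration only; `build_pattern` is a hypothetical helper (not part of `re`) that sorts the words longest-first before joining them with `|`, assuming the words are plain literals:

```python
import re

main_word = "launcher"

# Alternation tries its branches left to right and stops at the first that succeeds.
short_first = re.match("launch|launcher", main_word).group()
long_first = re.match("launcher|launch", main_word).group()
print(short_first)  # launch
print(long_first)   # launcher

def build_pattern(words):
    """Join literal words into one alternation, longest word first."""
    ordered = sorted(words, key=len, reverse=True)
    return re.compile("|".join(re.escape(w) for w in ordered))

print(build_pattern(["launch", "launcher"]).pattern)  # launcher|launch
```

Sorting by length is one simple way to avoid having to order the alternatives by hand when the word list changes.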
