Python: check if any word in a list of words matches any pattern in a list of regular expression patterns
Check if a pattern is in a list of words
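I need an output containing only the words that match the pattern exactly: the same letters in the same positions (and the pattern's letters must not appear anywhere else in the word), and the same length. For example: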
words = ['hatch','catch','match','chat','mates']
pattern = '_atc_'
Desired output:
['hatch','match']
I tried using nested for loops, but this does not work for patterns that start and end with '_':
def filter_words_list(words, pattern):
    relevant_words = []
    for word in words:
        if len(word) == len(pattern):
            for i in range(len(word)):
                for j in range(len(pattern)):
                    if word[i] != pattern[i]:
                        break
                    if word[i] == pattern[i]:
                        relevant_words.append(word)
Thanks!
You can use a regular expression:
import re
words = ['hatch','catch','match','chat','mates']
pattern = re.compile('[^atc]atc[^atc]')
result = list(filter(pattern.fullmatch, words))
print(result)
Output
['hatch', 'match']
The pattern '[^atc]atc[^atc]' matches one character that is not a, t, or c ([^atc]), followed by 'atc', followed by one more character that is not a, t, or c.
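The pattern above is written by hand for '_atc_'. As a rough sketch, the same kind of regular expression could be built from any underscore pattern. The helper name `pattern_to_regex` and its `wildcard` parameter are hypothetical, not part of the original answer, and the sketch assumes '_' is the only wildcard character.

```python
import re

def pattern_to_regex(pattern, wildcard="_"):
    """Hypothetical helper: turn a pattern like '_atc_' into a compiled regex."""
    # Letters fixed by the pattern; a wildcard position must not contain any of them.
    letters = sorted(set(pattern) - {wildcard})
    if letters:
        negated = "[^" + "".join(re.escape(ch) for ch in letters) + "]"
    else:
        negated = "."  # the pattern has no fixed letters, so any character is allowed
    parts = [negated if ch == wildcard else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))

words = ['hatch', 'catch', 'match', 'chat', 'mates']
regex = pattern_to_regex('_atc_')
result = list(filter(regex.fullmatch, words))
print(regex.pattern)  # [^act]atc[^act]
print(result)         # ['hatch', 'match']
```

Using `fullmatch` also enforces the length requirement, because the whole word has to be consumed by the pattern.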
Alternatively, you can write your own matching function, which works with any given pattern:
from collections import Counter
def full_match(word, pattern='_atc_'):
    if len(pattern) != len(word):
        return False
    pattern_letter_counts = Counter(e for e in pattern if e != '_')  # count characters that are not wild card
    word_letter_counts = Counter(word)  # count letters
    if any(count != word_letter_counts.get(ch, 0) for ch, count in pattern_letter_counts.items()):
        return False
    return all(p == w for p, w in zip(pattern, word) if p != '_')  # the word must match in all characters that are not wild card
words = ['hatch', 'catch', 'match', 'chat', 'mates']
result = list(filter(full_match, words))
print(result)
Output
['hatch', 'match']
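The same rule can also be checked in a single pass without `Counter`, and wrapped in the `filter_words_list(words, pattern)` signature from the question so that the pattern is passed explicitly. This is one possible variant, not the answerer's code; the helper name `matches` and the `wildcard` parameter are assumptions made for illustration.

```python
def matches(word, pattern, wildcard="_"):
    """Hypothetical helper: True if word fits pattern under the question's rules."""
    if len(word) != len(pattern):
        return False
    fixed = set(pattern) - {wildcard}  # letters that the pattern pins to a position
    for p, w in zip(pattern, word):
        if p == wildcard:
            # A wildcard position must not hold one of the pattern's own letters.
            if w in fixed:
                return False
        elif p != w:
            # A fixed position must hold exactly the pattern's letter.
            return False
    return True

def filter_words_list(words, pattern):
    return [word for word in words if matches(word, pattern)]

words = ['hatch', 'catch', 'match', 'chat', 'mates']
print(filter_words_list(words, '_atc_'))  # ['hatch', 'match']
print(filter_words_list(words, '_at__'))  # ['hatch', 'catch', 'match', 'mates']
```

With '_at__' the letter c is no longer part of the pattern, so 'catch' is accepted.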
You should use a regular expression, replacing each underscore with '.', which matches any single character. The input then looks like this:
words = ['hatch','catch','match','chat','mates']
pattern = '.atc.'
The code is:
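import re
def filter_words_list(words, pattern):
    ret = []
    for word in words:
        # fullmatch requires the whole word to match, not just a prefix
        # note: '.' matches any character, so 'catch' also passes '.atc.'
        if re.fullmatch(pattern, word):
            ret.append(word)
    return ret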
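To see how the '.' wildcard and the choice between `re.match` and `re.fullmatch` affect the result, here is a small comparison. The extra word 'hatchet' is added purely for illustration and is not in the original word list.

```python
import re

# 'hatchet' is an illustrative addition; the other words come from the question.
words = ['hatch', 'catch', 'match', 'chat', 'mates', 'hatchet']

# re.match only anchors at the start, so a longer word with a matching prefix passes.
prefix_hits = [w for w in words if re.match('.atc.', w)]

# re.fullmatch requires the entire word to match, which also enforces the length.
dot_hits = [w for w in words if re.fullmatch('.atc.', w)]

# A negated class additionally rejects a, t, or c in the wildcard positions.
strict_hits = [w for w in words if re.fullmatch('[^atc]atc[^atc]', w)]

print(prefix_hits)  # ['hatch', 'catch', 'match', 'hatchet']
print(dot_hits)     # ['hatch', 'catch', 'match']
print(strict_hits)  # ['hatch', 'match']
```

Only the last variant reproduces the output requested in the question, because '.' on its own still accepts 'catch'.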
Hope this helps.
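Disclaimer: technical posts on this site are published under the CC BY-SA 4.0 license. If you repost, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.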