Iterating through a string in Python and matching it against a wildcard pattern

Question

string1="abc"
string2="abdabcdfg"

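I want to find out whether string1 is a substring of string2. However, there are wildcards: "." can be any letter, y can be "a" or "d", and x can be "b" or "c". As a result, ".yx" would count as a substring of string2.

How can I code this using only one loop? I want to iterate over string2 and compare at each index. I tried a dictionary, but I would like to use a loop. My code: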

def wildcard(string,substring):
    sum=""
    table={'A': '.', 'C': '.', 'G': '.', 'T': '.','A': 'x', 'T': 'x', 'C': 'y', 'G': 'y'}
    for c in strand:
        if (c in table) and table[c] not in sum:
            sum+=table[c]
        elif c not in table:
            sum+=c
    if sum==substring:
        return True
    else:
        return False

print wildcard("TTAGTTA","xyT.")#should be true

Answer 1

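I know you specifically asked for a solution that uses a loop. However, I'd suggest a different approach: you can easily translate your pattern into a regular expression. This is a language similar to your string patterns, only much more powerful. You can then use the re module to check whether that regular expression (and thus your substring pattern) can be found in the string.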

import re

def to_regex(pattern, table):
    # join substitutions from table, using c itself as default
    return ''.join(table.get(c, c) for c in pattern)

# '#' and '+' stand in for the question's y and x
symbols = {'.': '[a-z]', '#': '[ad]', '+': '[bc]'}
print(re.findall(to_regex('.+#', symbols), 'abdabcdfg'))
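If all you need is a True/False answer, as in the question's wildcard function, the same idea can be wrapped around re.search. The following is only a sketch: the DNA_SYMBOLS table is my reading of the question's example (x is A or T, y is C or G, "." is any base), and the re.escape call is an addition so that literal characters in the pattern are not interpreted as regex syntax.

```python
import re

# Assumed symbol table, inferred from the question's DNA example:
# '.' is any base, 'x' is A or T, 'y' is C or G.
DNA_SYMBOLS = {'.': '[ACGT]', 'x': '[AT]', 'y': '[CG]'}

def wildcard(string, pattern, table=DNA_SYMBOLS):
    """Return True if the wildcard pattern matches somewhere in string."""
    # Symbols found in the table become character classes; everything
    # else is escaped so that it only matches itself.
    regex = ''.join(table.get(c, re.escape(c)) for c in pattern)
    return re.search(regex, string) is not None

print(wildcard("TTAGTTA", "xyT."))
```

Here "xyT." matches the substring "AGTT", so the call returns True. A different table can be passed as the third argument for other alphabets.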

If you prefer a more "hands-on" solution, you can use this one, which uses loops.

def find_matches(pattern, table, string):
    for i in range(len(string) - len(pattern) + 1):
        # for each possible starting position, check the pattern
        for j, c in enumerate(pattern):
            if string[i+j] not in table.get(c, c):
                break # character does not match
        else:
            # loop completed without triggering the break
            yield string[i : i + len(pattern)]

symbols = {'.': 'abcdefghijklmnopqrstuvwxyz', '#': 'ad', '+': 'bc'}
print(list(find_matches('.+#', symbols, 'abdabcdfg')))

In both cases the output is ['abd', 'bcd'], i.e. the pattern can be found twice using those substitutions.
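One caveat: the two approaches only agree here because the matches do not overlap. re.findall resumes searching after the end of each match, whereas the loop version tries every starting position. If overlapping matches matter, a common workaround is to wrap the regex in a lookahead with a capturing group. The sketch below assumes that is the behaviour you want; find_overlapping is just a name made up for illustration.

```python
import re

def find_overlapping(regex, string):
    # The lookahead (?=...) consumes no characters, so the search moves on
    # by one position after each hit; the inner group captures the text.
    return re.findall('(?=(%s))' % regex, string)

print(re.findall('[a-z][a-z]', 'abc'))        # non-overlapping: ['ab']
print(find_overlapping('[a-z][a-z]', 'abc'))  # overlapping: ['ab', 'bc']
```

For the example string above, both variants still give ['abd', 'bcd'].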
