Python: Compare two strings and return the longest segment they have in common

Question

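As a Python novice, I wrote a working function that compares two strings and searches for the longest substring they share. For example, when the function compares "goggle" and "google", it identifies "go" and "gle" as the two common substrings (ignoring single letters), but returns only "gle" because it is the longest.

I would like to know whether any part of my code can be improved or rewritten, since it may come across as long-winded and convoluted. I would also be glad to see other approaches to the problem. Thanks in advance!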

def longsub(string1, string2):
    sublist = []
    i=j=a=b=count=length=0

    while i < len(string1):
        while j < len(string2):
            if string1[i:a+1] == string2[j:b+1] and (a+1) <= len(string1) and (b+1) <= len(string2):
                a+=1
                b+=1
                count+=1
            else:
                if count > 0:
                    sublist.append(string1[i:a])
                count = 0
                j+=1
                b=j
                a=i
        j=b=0
        i+=1
        a=i

    while len(sublist) > 1:
        for each in sublist:
            if len(each) >= length:
                length = len(each)
            else:
                sublist.remove(each)

    return sublist[0]

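Edit: Comparing "goggle" and "google" may have been a bad example, since the two words are equal in length and their longest common segment sits at the same position in both. Actual inputs will look more like this: "xabcdkejp" and "zkdieaboabcd". The correct output should be "abcd".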

Answer 1

There actually happens to be a function for exactly this in the standard library: difflib.SequenceMatcher.find_longest_match
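As a rough illustration, the call can be wrapped in a small helper and tried on the inputs from the question's edit. The function name longest_common_substring is made up for this sketch, and passing autojunk=False is an assumption on my part (it switches off a heuristic that only affects sequences of 200 items or more):

```python
from difflib import SequenceMatcher


def longest_common_substring(s1, s2):
    """Return the longest block of characters that s1 and s2 share."""
    # None means no characters are treated as junk.
    matcher = SequenceMatcher(None, s1, s2, autojunk=False)
    # Search all of s1 (0..len(s1)) against all of s2 (0..len(s2)).
    match = matcher.find_longest_match(0, len(s1), 0, len(s2))
    # match.a is the start index in s1, match.size the length of the block.
    return s1[match.a:match.a + match.size]


print(longest_common_substring("xabcdkejp", "zkdieaboabcd"))  # "abcd"
```

If the strings share nothing, match.size is 0 and the helper returns an empty string.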

Answer 2

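Edit: this algorithm only works when the longest shared segment sits at the same indices in both words.

You can get away with just one loop. Use helper variables. Something like this (needs refactoring) http://codepad.org/qErRBPav :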

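word1 = "google"
word2 = "goggle"

longestSegment = ""
tempSegment = ""

# Only walk the indices both words have, so unequal lengths do not raise IndexError
for i in range(min(len(word1), len(word2))):
    if word1[i] == word2[i]:
        # Characters match at this index: extend the current run
        tempSegment += word1[i]
    else:
        # Mismatch: the current run ends here
        tempSegment = ""

    if len(tempSegment) > len(longestSegment):
        longestSegment = tempSegment

print(longestSegment)  # "gle"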
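To make the limitation in the edit note concrete, here is the same single-loop idea wrapped in a function (the name same_index_segment is invented for this sketch) and run on the inputs from the question's edit. Because "abcd" starts at index 1 in one string and index 8 in the other, no characters ever line up:

```python
def same_index_segment(word1, word2):
    """Longest run of characters that match at identical indices."""
    longest = ""
    current = ""
    # zip stops at the end of the shorter word.
    for c1, c2 in zip(word1, word2):
        current = current + c1 if c1 == c2 else ""
        if len(current) > len(longest):
            longest = current
    return longest


print(same_index_segment("google", "goggle"))           # "gle"
print(same_index_segment("xabcdkejp", "zkdieaboabcd"))  # "" (empty string)
```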

Edit: using mgilson's suggestion, find_longest_match (works when the segments are at different positions):

from difflib import SequenceMatcher

word1 = "google"
word2 = "goggle"

s = SequenceMatcher(None, word1, word2)
match = s.find_longest_match(0, len(word1), 0, len(word2))

# The slice must end at match.a + match.size; match.b is the start index in word2
print(word1[match.a:match.a + match.size])  # "gle"
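For comparison, and since the question asks for other approaches, here is a sketch that avoids difflib and uses the textbook dynamic-programming method for the longest common substring. It is not taken from either answer, and the function name is made up. It runs in O(len(s1) * len(s2)) time and keeps only two rows of the table in memory:

```python
def longest_common_substring_dp(s1, s2):
    """Longest common substring via a rolling dynamic-programming table."""
    best_len = 0
    best_end = 0  # index in s1 just past the end of the best match
    # prev[j] = length of the common run ending at s1[i-2] and s2[j-1]
    prev = [0] * (len(s2) + 1)
    for i in range(1, len(s1) + 1):
        curr = [0] * (len(s2) + 1)
        for j in range(1, len(s2) + 1):
            if s1[i - 1] == s2[j - 1]:
                # Extend the run that ended one character earlier in both strings.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return s1[best_end - best_len:best_end]


print(longest_common_substring_dp("xabcdkejp", "zkdieaboabcd"))  # "abcd"
```

When several substrings tie for the longest, this version returns the one that ends earliest in s1, because it only replaces the best match on a strictly longer run.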

Answer 1
4, accepted, 2013-03-19 16:43:22

Answer 2
2, 2013-03-19 16:55:35
