
How to write a function to find the longest common subsequence using dynamic programming?

To be clear, I am looking for the subsequence itself, not its length. I have written this function, which works most of the time but fails in some cases. I have to write it recursively, without any loops or imports. I used a memoise function to make it more efficient, but I have not included it here.

This function works when s1 = "abcde" and s2 = "qbxxd" (it correctly returns "bd"), but it does not work when s1 = "Look at me, I can fly!" and s2 = "Look at that, it's a fly": it should return "Look at ,  a fly", but I get "Look at a fly" instead. For whatever reason, the comma and a space are dropped. I've tried s1 = "ab, cde" and s2 = "qbxx, d", which correctly returns "b, d".

def lcs(s1, s2):
    """y5tgr"""
    i = len(s1)
    j = len(s2)
    if i == 0 or j == 0:
        return ""
    if s1[i-1] == s2[j-1]:
        return lcs(s1[:-1], s2[:-1]) + s1[i-1]
    else:
        return max(lcs(s1[:-1], s2), lcs(s1, s2[:-1]))

I have a feeling the problem is with the last line and the max function. I've seen solutions that use for and while loops, but none without them.

Only a slight change is needed to fix your code (you're right, the problem was in max).

Just change max so that it picks the longest string by using its key function:

def lcs(s1, s2):
    """y5tgr"""
    i = len(s1)
    j = len(s2)
    if i == 0 or j == 0:
        return ""
    if s1[i-1] == s2[j-1]:
        return lcs(s1[:-1], s2[:-1]) + s1[i-1]
    else:
        # Find max based upon the string length
        return max(lcs(s1[:-1], s2), lcs(s1, s2[:-1]), key=len)

However, this is very slow without memoization.
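To get a feel for how slow, here is a rough sketch that counts the calls made by the same recursion. The `count_calls` name and the one-element `counter` list are illustrative assumptions, not part of the original code. Two strings with no characters in common are close to the worst case, because every call branches twice.

```python
def count_calls(s1, s2, counter):
    """Same recursion as lcs above, but tallies every call in counter[0]."""
    counter[0] += 1
    if not s1 or not s2:
        return ""
    if s1[-1] == s2[-1]:
        return count_calls(s1[:-1], s2[:-1], counter) + s1[-1]
    # No match on the last characters: explore both shorter prefixes.
    return max(count_calls(s1[:-1], s2, counter),
               count_calls(s1, s2[:-1], counter), key=len)

counter = [0]  # hypothetical mutable counter shared by all recursive calls
result = count_calls("abcdefgh", "ijklmnop", counter)
print(repr(result), counter[0])
```

Even for two 8-character inputs this makes tens of thousands of calls, and the count grows exponentially with the input length, which is why the cache matters.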

Code with Memoization (to improve performance)

Memoization Decorator Reference

import functools

def memoize(obj):
    cache = obj.cache = {}

    @functools.wraps(obj)
    def memoizer(*args, **kwargs):
        if args not in cache:
            cache[args] = obj(*args, **kwargs)
        return cache[args]
    return memoizer

@memoize
def lcs(s1, s2):
    """y5tgr"""
    i = len(s1)
    j = len(s2)
    if i == 0 or j == 0:
        return ""
    if s1[i-1] == s2[j-1]:
        return lcs(s1[:-1], s2[:-1]) + s1[i-1]
    else:
        return max(lcs(s1[:-1], s2), lcs(s1, s2[:-1]), key=len)
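Since the question rules out imports, the same caching idea could also be sketched without `functools`, by passing a plain dictionary through the recursion. This is only one possible variant: the name `lcs_cached` and the `cache` parameter are assumptions made for illustration.

```python
def lcs_cached(s1, s2, cache=None):
    """Recursive LCS with a hand-rolled cache: no loops, no imports."""
    if cache is None:
        cache = {}  # fresh cache for each top-level call
    key = (s1, s2)
    if key in cache:
        return cache[key]
    if not s1 or not s2:
        result = ""
    elif s1[-1] == s2[-1]:
        result = lcs_cached(s1[:-1], s2[:-1], cache) + s1[-1]
    else:
        # Keep the longer candidate, not the lexicographically larger one.
        result = max(lcs_cached(s1[:-1], s2, cache),
                     lcs_cached(s1, s2[:-1], cache), key=len)
    cache[key] = result
    return result

print(lcs_cached("abcde", "qbxxd"))  # bd
```

Each distinct pair of prefixes is computed once, so the number of calls is bounded by roughly the product of the two string lengths rather than growing exponentially.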

Test

s1 = "Look at me, I can fly!"
s2 = "Look at that, it's a fly"
print(lcs(s1, s2))

Output

Look at ,  a fly
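As a sanity check on this output, a small helper can confirm that a string really is a subsequence of both inputs. The `is_subsequence` helper below is a hypothetical testing aid, not part of the answer. It uses a loop-like construct, which is fine for checking results even though the `lcs` function itself must avoid loops.

```python
def is_subsequence(candidate, text):
    """Return True if candidate can be formed by deleting characters from text."""
    remaining = iter(text)
    # `ch in remaining` advances the iterator, so matches must occur in order.
    return all(ch in remaining for ch in candidate)

s1 = "Look at me, I can fly!"
s2 = "Look at that, it's a fly"

# Compare the fixed output with the shorter string the original code produced.
for candidate in ("Look at ,  a fly", "Look at a fly"):
    print(repr(candidate), len(candidate),
          is_subsequence(candidate, s1), is_subsequence(candidate, s2))
```

Both strings are common subsequences of s1 and s2, but the second one is shorter, so it cannot be the longest.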

For strings, max returns the string that comes last lexicographically:

>>> max("a", "b")
'b'
>>> max("aaaaa", "b")
'b'

Certainly not what you need; you seem to be looking for the longer of the two. You don't need a loop, just a comparison:

# Replaces the final `return max(...)` line inside lcs
lcs1 = lcs(s1[:-1], s2)
lcs2 = lcs(s1, s2[:-1])
return lcs1 if len(lcs1) > len(lcs2) else lcs2
