Attempting to implement top down DP with Python, believe caching is not working

Question

This is homework, but with the current lockdown I'm unable to ask my tutor for help, so I thought I'd ask the internet :)

I am trying to implement top-down DP (that is, recursion with caching) for a Longest Common Subsequence algorithm. Here is what I have written:

def lcs(s1, s2, known=None):
    if(known == None):
        known = {}
    if len(s1) == 0:
        return 0
    if len(s2) == 0:
        return 0
    else:
        if((s1, s2) in known):
            return known[(s1, s2)]
        elif(s1[-1] == s2[-1]):
            now_known = (1 + lcs(s1[:-1], s2[:-1]))
            known[(s1, s2)] = now_known
            return now_known
        else:
            now_known = max((lcs(s1, s2[:-1])), (lcs(s1[:-1], s2)))
            known[(s1, s2)] = now_known
            return now_known

My understanding of what I have written is:

My code checks whether either string is empty; if so, the length of the longest common subsequence is 0.
My code then checks whether the pair of strings is in the cache; if it is, it returns the value stored for that pair.
My code then checks whether the last characters of the two strings are the same, in which case the result is 1 plus the lcs of both strings with their last characters removed.
Otherwise, my code recursively calls itself twice, each time with one of the strings minus its last character, and takes the larger of the two results.

When the above code is run on two small strings:

s1 = "abcde"
s2 = "qbxxd"
result = lcs(s1, s2)  # avoid rebinding the name lcs, which would shadow the function
print(result)

I get the correct output of 2 :) However, when running on larger inputs, such as:

s1 = "Look at me, I can fly!"
s2 = "Look at that, it's a fly"
print(lcs(s1, s2))

My code times out. This, plus some print-statement testing (printing what gets added to "known" each time I add to it), leads me to believe I'm implementing my caching incorrectly. This could be a simple silly bug, in which case I apologize, but I believe it to be an issue with my understanding of caching. Also, I am aware of "lru_cache", which does the caching for me; however, for the purposes of this homework exercise, I'm required to write my own caching.

Any insight would be greatly appreciated.

Answer 1

As the comments pointed out, you do not pass known in the recursive calls. Every recursive call therefore starts with known as None, replaces it with a fresh {}, and never sees anything another call has cached, so you never benefit from caching.
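To make that concrete, here is a minimal sketch of the function from the question with known forwarded to every recursive call. The structure is the asker's; the changes are the extra argument, `is None` in place of `== None`, and the local name result, which is my own choice.

```python
def lcs(s1, s2, known=None):
    # Create the cache only on the outermost call.
    if known is None:
        known = {}
    # An empty string has no common subsequence with anything.
    if len(s1) == 0 or len(s2) == 0:
        return 0
    # Reuse a previously computed answer for this pair of strings.
    if (s1, s2) in known:
        return known[(s1, s2)]
    if s1[-1] == s2[-1]:
        # Last characters match: count one and recurse on both prefixes.
        result = 1 + lcs(s1[:-1], s2[:-1], known)
    else:
        # Otherwise drop the last character of one string or the other.
        result = max(lcs(s1, s2[:-1], known), lcs(s1[:-1], s2, known))
    known[(s1, s2)] = result
    return result


print(lcs("abcde", "qbxxd"))  # 2
print(lcs("Look at me, I can fly!", "Look at that, it's a fly"))
```

Because every call now shares one dictionary, each distinct pair of prefixes is computed only once, so the second call returns quickly instead of timing out.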

I personally am fond of the following pattern:

def lcs(s1, s2):
    cache = {}

    def _lcs (i, j):
        if (i, j) not in cache:
            ... (logic here) ...

        return cache[(i, j)]

    return _lcs(0, 0)

The idea is that the cache and the sequences are created once, and the recursion just works with indexes. It also means I can't forget to pass the cache around. :-)
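One way the elided logic could be filled in, as a sketch rather than the definitive version: here I assume _lcs(i, j) means the LCS length of the suffixes s1[i:] and s2[j:], which is consistent with the _lcs(0, 0) call in the pattern. The base case and the two recursive cases mirror the ones in the question, just moving forward through indexes instead of slicing from the end.

```python
def lcs(s1, s2):
    cache = {}  # maps (i, j) to the LCS length of s1[i:] and s2[j:]

    def _lcs(i, j):
        if (i, j) not in cache:
            if i == len(s1) or j == len(s2):
                # One suffix is empty, so nothing is left to match.
                cache[(i, j)] = 0
            elif s1[i] == s2[j]:
                # Current characters match: count one and advance both.
                cache[(i, j)] = 1 + _lcs(i + 1, j + 1)
            else:
                # Skip one character from either string; keep the better result.
                cache[(i, j)] = max(_lcs(i + 1, j), _lcs(i, j + 1))
        return cache[(i, j)]

    return _lcs(0, 0)


print(lcs("abcde", "qbxxd"))  # 2
print(lcs("Look at me, I can fly!", "Look at that, it's a fly"))
```

Using integer pairs as keys also avoids building and hashing a new string slice on every call. The recursion depth grows with len(s1) + len(s2), so very long inputs could hit Python's default recursion limit.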

solution1
0 2020-03-23 22:34:31
