
Trying to calculate algorithm time complexity

So last night I solved this LeetCode question. My solution is not great; it is quite slow. So I'm trying to calculate the complexity of my algorithm to compare it with the standard algorithms that LeetCode lists in the Solution section. Here's my solution:

from typing import List  # LeetCode provides this; needed to run the code standalone

class Solution:
    def longestCommonPrefix(self, strs: List[str]) -> str:
        # Get lengths of all strings in the list and get the minimum
        # since common prefix can't be longer than the shortest string.
        # Catch ValueError if list is empty
        try:
            min_len = min(len(i) for i in strs)
        except ValueError:
            return ''

        # split strings into sets character-wise
        foo = [set(list(zip(*strs))[k]) for k in range(min_len)]

        # Now go through the resulting list and check whether resulting sets have length of 1
        # If true then add those characters to the prefix list. Break as soon as encounter
        # a set of length > 1.
        prefix = []
        for i in foo:
            if len(i) == 1:
                x, = i
                prefix.append(x)
            else:
                break
        common_prefix = ''.join(prefix)
        return common_prefix

I'm struggling a bit with calculating the complexity. The first step, getting the minimum length of the strings, takes O(n), where n is the number of strings in the list. The last step is also easy: it should take O(m), where m is the length of the shortest string.

But the middle bit is confusing. set(list(zip(*strs))) should hopefully take O(m) again and then we do it n times so O(mn). But then overall complexity is O(mn + m + n) which seems way too low for how slow the solution is.

The other option is that the middle step is O(m^2*n), which makes a bit more sense. What is the proper way to calculate complexity here?

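Not quite. As written, the middle portion is O(m^2*n), your second guess: list(zip(*strs)) builds all m tuples of n characters, which is O(mn), and the list comprehension rebuilds that list for every k in range(min_len), i.e. m times. If zip(*strs) is evaluated only once, the middle portion is O(mn), and so is the overall solution, because that dwarfs the O(m) and O(n) terms for large values of m and n.

With that change, your solution has an ideal order of runtime complexity.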
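
As a rough way to check the order of the middle step, the sketch below counts how many column tuples are built by the question's list(zip(*strs))[k] pattern versus a single pass over zip(*strs). The helper names tuples_built_per_index and tuples_built_single_pass are hypothetical, introduced only for this illustration; each tuple holds n characters, so the character count is n times the tuple count.

```python
from typing import List


def tuples_built_per_index(strs: List[str]) -> int:
    """Count tuples materialised when list(zip(*strs)) is rebuilt for every k."""
    if not strs:
        return 0
    min_len = min(len(s) for s in strs)
    built = 0
    for _ in range(min_len):
        columns = list(zip(*strs))  # rebuilt from scratch on each iteration
        built += len(columns)
    return built


def tuples_built_single_pass(strs: List[str]) -> int:
    """Count tuples materialised when zip(*strs) is consumed exactly once."""
    return sum(1 for _ in zip(*strs))


if __name__ == "__main__":
    # 26 strings of length 500, as in the test case discussed in this answer
    strs = [chr(ord("a") + i) * 500 for i in range(26)]
    print(tuples_built_per_index(strs))    # 250000, i.e. m * m
    print(tuples_built_single_pass(strs))  # 500, i.e. m
```

The first count grows with the square of the shortest string's length, the second only linearly.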

Optimize: Short-Circuit

However, you are probably dismayed that others have faster solutions. I suspect they short-circuit on the first non-matching index.

Let's consider a test case of 26 strings (['a'*500, 'b'*500, 'c'*500, ...]). Your solution would create a list 500 entries long, with each entry containing a set of 26 elements. If you short-circuited instead, you would process only the first index, i.e. one set of 26 characters.

Try changing your list into a generator. This might be all you need to short-circuit.

foo = (set(x) for x in zip(*strs))

You can skip the min_len check because the default behaviour of zip is to iterate only as long as the shortest input.
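
Putting the generator and the early break together, a minimal sketch might look like the following. It is written as a standalone function named longest_common_prefix (a hypothetical name, not the LeetCode method signature) and relies on zip(*strs) yielding nothing for an empty list, so no separate empty-input check is assumed to be needed.

```python
from typing import List


def longest_common_prefix(strs: List[str]) -> str:
    """Return the longest prefix shared by every string in strs."""
    prefix = []
    # The generator builds one set per column, lazily; zip stops at the
    # shortest string, so no explicit min_len is required.
    column_sets = (set(column) for column in zip(*strs))
    for chars in column_sets:
        if len(chars) != 1:
            break  # first mismatch: later columns are never built
        (ch,) = chars
        prefix.append(ch)
    return "".join(prefix)
```

For the 26-string test case above, the loop builds a single set of 26 characters and then stops.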

Optimize: Generating Intermediate Results

I see that you append each letter to a list, then call ''.join(lst). This is efficient, especially compared to the alternative of iteratively appending to a string.

However, we could just as easily keep a counter match_len. Then, when we detect the first mismatch, just:

return strs[0][:match_len]
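
A fuller sketch of the counter approach, under the same assumptions as before: the function name longest_common_prefix_by_length is hypothetical, and the explicit empty-list check is added here because strs[0] would otherwise raise IndexError.

```python
from typing import List


def longest_common_prefix_by_length(strs: List[str]) -> str:
    """Count matching columns, then slice the prefix from the first string."""
    if not strs:
        return ""
    match_len = 0
    for column in zip(*strs):  # stops at the shortest string
        if len(set(column)) != 1:
            break  # first mismatch: stop counting
        match_len += 1
    return strs[0][:match_len]
```

This avoids building the intermediate list of characters altogether; the only new string created is the final slice.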
