O(N) time complexity of a simple Python function

Question

I just took a Codility demo test. The question and my answer can be seen here, but I'll paste my answer here as well. My response:

def solution(A):
    # write your code in Python 2.7

    # the smallest integer we can return, if it is not in the array
    retresult = 1

    A.sort()
    for i in A:
        if i > 0:
            if i == retresult:
                # increment the result since the current result exists in the array
                retresult += 1
            elif i > retresult:
                # we can stop: we found a number bigger than our current result
                break

    return retresult

My question is about time complexity, which I hope to understand better from your responses. The question asks for an expected worst-case time complexity of O(N).

Does my function have O(N) time complexity? Does the fact that I sort the array increase the complexity, and if so, how?

Codility reports (for my answer):

Detected time complexity: 
O(N) or O(N * log(N))

So, what is the complexity of my function? And if it is O(N*log(N)), what can I do to decrease the complexity to O(N), as the problem states?

Thanks very much!

P.S. My background reading on time complexity comes from this great post.

EDIT

Following the reply below, and the answers described here for this problem, I would like to expand on this with my take on the solutions:

basicSolution has an expensive time complexity and so is not the right answer for this Codility test:

def basicSolution(A):
    # O(N*log(N)) time complexity

    # the smallest integer we can return, if it is not in the array
    retresult = 1

    A.sort()
    for i in A:
        if i > 0:
            if i == retresult:
                # increment the result since the current result exists in the array
                retresult += 1
            elif i > retresult:
                # we can stop: we found a number bigger than our current result
                break
        else:
            continue  # negative numbers and 0 don't need any work

    return retresult

hashSolution is my take on what is described in the above article, in the "use hashing" paragraph. As I am new to Python, please let me know if you have any improvements to this code (it does work against my test cases, though), and what time complexity it has.

def hashSolution(A):
    # O(N) time complexity, I think? But it requires O(N) extra space
    # (the requirement states to use O(N) space)

    table = {}

    for i in A:
        if i > 0:
            table[i] = True  # collision/duplicate will just overwrite

    # the problem says that the array has a maximum of 100,000 integers
    for i in range(1, 100000 + 1):
        if not table.get(i):
            return i

    return 1  # default

Finally, I am having trouble understanding the actual O(N) solution (the O(n) time and O(1) extra space solution). I understand that negative and 0 values are pushed to the back of the array, and then we have an array of just positive values. But I do not understand the findMissingPositive function. Could anyone please describe this with Python code/comments? With an example, perhaps? I've been trying to work through it in Python and just cannot figure it out :(

Answer 1

It does not, because you sort A.

The Python list.sort() function uses Timsort (named after Tim Peters), and has a worst-case time complexity of O(N log N).
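To make the cost of sorting visible, here is a minimal sketch that counts the comparisons list.sort() performs. The Counted wrapper and the count_comparisons helper are hypothetical names introduced only for this illustration; the exact counts depend on the input order and on the Python version, so none are quoted here.

```python
import random


class Counted(object):
    """Hypothetical wrapper that counts how often values are compared."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        # list.sort() only needs "<" to order its elements
        Counted.comparisons += 1
        return self.value < other.value


def count_comparisons(values):
    """Sort a wrapped copy of `values`; return the number of comparisons."""
    Counted.comparisons = 0
    wrapped = [Counted(v) for v in values]
    wrapped.sort()
    return Counted.comparisons


n = 1000
already_sorted = list(range(n))
shuffled = list(range(n))
random.Random(0).shuffle(shuffled)  # fixed seed so runs are repeatable

sorted_count = count_comparisons(already_sorted)
shuffled_count = count_comparisons(shuffled)
print(sorted_count, shuffled_count)
```

Already-sorted input is the best case for Timsort and needs roughly N comparisons, while shuffled input needs several times more. That growth is what pushes the function past O(N) in the worst case.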

Rather than sort your input, you'll have to iterate over it and determine whether any integers are missing by some other means. I'd use a set built from a range() object:

def solution(A):
    expected = set(range(1, len(A) + 1))
    for i in A:
        expected.discard(i)
    if not expected:
        # all consecutive integers 1..len(A) were present, so the next one is missing
        return len(A) + 1
    return min(expected)

This is O(N): we create a set of len(A) elements (O(N) time), then we loop over A, removing elements from expected (again O(N) time, since removing an element from a set is O(1)), then test whether expected is empty (O(1) time), and finally get the smallest element in expected (at most O(N) time).

So we make at most 3 O(N) time steps in the above function, making it an O(N) solution.

This also fits the storage requirement; all it uses is a set of size N. Sets carry some per-element overhead, but that is a constant factor, so the storage stays O(N).
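It may help to see why candidates 1 through len(A) are enough: a list of N values can cover at most N distinct positive integers, so either one of 1..N is missing or the answer is N + 1. The sketch below repeats the function and runs it on a few inputs chosen here purely for illustration.

```python
def solution(A):
    # start with every candidate answer from 1 to len(A)
    expected = set(range(1, len(A) + 1))
    for i in A:
        # discard() is O(1) on average and ignores values not in the set,
        # so negatives, zeros and values above len(A) are harmless
        expected.discard(i)
    if not expected:
        # every integer 1..len(A) was present, so the next one is missing
        return len(A) + 1
    return min(expected)


print(solution([1, 3, 6, 4, 1, 2]))  # 5
print(solution([1, 2, 3]))           # 4
print(solution([-1, -3]))            # 1
```

In the first call only 5 survives in expected; in the second the set ends up empty; in the third nothing is discarded and min() picks 1.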

The hash solution you found is based on the same principle, except that it uses a dictionary instead of a set. Note that the dictionary values are never actually used; they are either set to True or absent. I'd rewrite that as:

def hashSolution(A):
    seen = {i for i in A if i > 0}
    if not seen:
        # there were no positive values, so 1 is the first missing.
        return 1
    for i in range(1, 10**5 + 2):
        if i not in seen:
            return i
    # we can never get here: A holds at most 100,000 values, so at least
    # one integer in 1..100,001 must be missing from `seen`.

The early return above handles the case where A contains no positive integers without entering the loop at all.

The difference between mine and theirs is that mine starts with the set of expected numbers, while theirs starts with the set of positive values from A, inverting the storage and the test.
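As for the O(1) extra space variant you asked about: I have not reproduced the article's findMissingPositive function, but assuming it uses the common sign-marking technique, a minimal sketch could look like the following. The function name first_missing_positive_in_place is my own, and note that it modifies the input list.

```python
def first_missing_positive_in_place(A):
    """Return the smallest positive integer not in A (A is modified)."""
    # Step 1: move all positive values to the front of the list, which
    # leaves the negative and 0 values at the back.
    k = 0  # number of positive values found so far
    for j in range(len(A)):
        if A[j] > 0:
            A[k], A[j] = A[j], A[k]
            k += 1

    # Step 2: A[:k] holds only positive values. For each value v in 1..k,
    # record "v is present" by making the entry at index v - 1 negative.
    for j in range(k):
        v = abs(A[j])  # abs() because this entry may already be negated
        if v <= k and A[v - 1] > 0:
            A[v - 1] = -A[v - 1]

    # Step 3: the first index whose entry is still positive was never
    # marked, so index + 1 is the smallest missing positive integer.
    for j in range(k):
        if A[j] > 0:
            return j + 1
    return k + 1


print(first_missing_positive_in_place([1, 3, 6, 4, 1, 2]))  # 5
print(first_missing_positive_in_place([3, -1, 0, 1]))       # 2
```

For [1, 3, 6, 4, 1, 2] the marking pass leaves [-1, -3, -6, -4, 1, -2]; index 4 is the only positive entry, so the answer is 5. The list itself doubles as the "seen" table, which is how the extra space stays O(1).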

Solution 1
6, accepted, 2017-11-15 17:18:48
