Is there a Python function that returns the first positive int that does not occur in a list?
I'm trying to design a function that, given an array A of N integers, returns the smallest positive integer (greater than 0) that does not occur in A.
This code works fine but has a high order of complexity. Is there another solution that reduces the order of complexity?
Note: 10000000 is the range of the integers in array A. I tried the sort function, but does it reduce the complexity?
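    def solution(A):
        # Start at 1: the answer must be a positive integer.
        for i in range(1, 10000000):
            # A.count(i) scans the whole list on every iteration.
            if A.count(i) <= 0:
                return i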
The following is O(n log n):
    a = [2, 1, 10, 3, 2, 15]
    a.sort()
    if a[0] > 1:
        print(1)
    else:
        for i in range(1, len(a)):
            if a[i] > a[i - 1] + 1:
                print(a[i - 1] + 1)
                break
If you don't like the special handling of 1, you could just append zero to the array and have the same logic handle both cases:
    a = sorted(a + [0])
    for i in range(1, len(a)):
        if a[i] > a[i - 1] + 1:
            print(a[i - 1] + 1)
            break
Caveats (both trivial to fix and both left as an exercise for the reader):
O(n) time and O(n) space:
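Two gaps are visible in the snippets above: non-positive values are not skipped, and nothing is printed when the list has no gap (for example [1, 2, 3]). A minimal sketch that handles both while keeping the append-zero trick; the function name first_missing_sorted is made up for illustration and is not part of the original answer:

```python
def first_missing_sorted(a):
    """Smallest positive integer not in a, using the sort-and-scan idea."""
    # Drop non-positive values and add 0 as a sentinel so that a
    # missing 1 is detected by the same gap check as everything else.
    b = sorted([0] + [x for x in a if x > 0])
    for i in range(1, len(b)):
        if b[i] > b[i - 1] + 1:
            return b[i - 1] + 1
    # No gap found: the answer is one past the largest value.
    return b[-1] + 1


print(first_missing_sorted([2, 1, 10, 3, 2, 15]))  # 4
print(first_missing_sorted([1, 2, 3]))             # 4
print(first_missing_sorted([-5, 0]))               # 1
```

The sort still dominates, so this remains O(n log n).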
    def solution(A):
        count = [0] * len(A)
        for x in A:
            if 0 < x <= len(A):
                count[x-1] = 1  # count[0] is to count 1
        for i in range(len(count)):
            if count[i] == 0:
                return i+1
        return len(A)+1  # only if A = [1, 2, ..., len(A)]
This should be O(n). It uses a temporary set to speed up the lookups.
    a = [2, 1, 10, 3, 2, 15]
    # use a set of only the positive numbers for lookup
    temp_set = set()
    for i in a:
        if i > 0:
            temp_set.add(i)
    # iterate from 1 up to length of set + 1 (to ensure edge case is handled)
    for i in range(1, len(temp_set) + 2):
        if i not in temp_set:
            print(i)
            break
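Since the question asks for a function, the same set-based idea can be wrapped up as one. This is only a sketch, and the name first_missing_positive is an assumption rather than anything from the answer. The upper bound works because a set of k distinct positive integers cannot contain all of 1..k+1, so one of those candidates is always missing:

```python
def first_missing_positive(values):
    """Smallest positive integer not in values, via set lookups."""
    seen = {v for v in values if v > 0}
    # len(seen) distinct positives cannot cover all of 1..len(seen)+1,
    # so the loop always returns before it runs out of candidates.
    for candidate in range(1, len(seen) + 2):
        if candidate not in seen:
            return candidate


print(first_missing_positive([2, 1, 10, 3, 2, 15]))  # 4
print(first_missing_positive([]))                    # 1
```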
My proposal is a recursive function inspired by quicksort.
Each step divides the input sequence into two sublists (lt = less than pivot; ge = greater than or equal to pivot) and decides which of the sublists is to be processed in the next step. Note that there is no sorting.
The idea is that a set of integers such that lo <= n < hi contains "gaps" only if it has fewer than (hi - lo) elements.
The input sequence must not contain duplicates. A set can be passed directly.
    # all cseq items > 0 assumed, no duplicates!
    def find(cseq, cmin=1):
        # cmin = possible minimum not ruled out yet
        size = len(cseq)
        if size <= 1:
            return cmin+1 if cmin in cseq else cmin
        lt = []
        ge = []
        pivot = cmin + size // 2
        for n in cseq:
            (lt if n < pivot else ge).append(n)
        return find(lt, cmin) if cmin + len(lt) < pivot else find(ge, pivot)

    test = set(range(1,100))
    print(find(test))  # 100
    test.remove(42)
    print(find(test))  # 42
    test.remove(1)
    print(find(test))  # 1
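Because find assumes positive, duplicate-free input, a caller has to clean the data first. The following is a sketch of an iterative variant of the same partitioning idea that does the cleaning itself. The name first_missing_partition is made up for illustration, and this is not the original author's code:

```python
def first_missing_partition(values):
    """Smallest missing positive integer, by repeated partitioning."""
    # Deduplicate and drop non-positive values up front.
    items = [n for n in set(values) if n > 0]
    lo = 1  # smallest candidate not ruled out yet
    while items:
        if len(items) == 1:
            return lo + 1 if items[0] == lo else lo
        pivot = lo + len(items) // 2
        lt = [n for n in items if n < pivot]
        if lo + len(lt) < pivot:
            # Fewer than (pivot - lo) values in [lo, pivot): the gap is there.
            items = lt
        else:
            # [lo, pivot) is full, so continue with the upper part.
            items = [n for n in items if n >= pivot]
            lo = pivot
    return lo


print(first_missing_partition([2, 1, 10, 3, 2, 15]))  # 4
print(first_missing_partition(range(1, 100)))         # 100
```

Each pass keeps at most about half of the remaining items, so the total work after the initial set construction is a geometric series in n.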
Inspired by various solutions and comments above, this is about 20%-50% faster in my (simplistic) tests than the fastest of them (though I'm sure it could be made faster), and it handles all the corner cases mentioned (non-positive numbers, duplicates, and an empty list):
    import numpy

    def firstNotPresent(l):
        positive = numpy.fromiter(set(l), dtype=int)  # deduplicate
        positive = positive[positive > 0]  # only keep positive numbers
        positive.sort()
        top = positive.size + 1
        if top == 1:  # no positive numbers (e.g. empty list)
            return 1
        sequence = numpy.arange(1, top)
        try:
            # where() yields a 0-based position; sequence starts at 1, so add 1
            return numpy.where(sequence < positive)[0][0] + 1
        except IndexError:  # no numbers are missing, top is next
            return top
The idea is: if you enumerate the positive, deduplicated, sorted list starting from one, then the first time the index is less than the list value, that index is missing from the list and hence is the lowest positive number missing from the list.
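The enumeration idea can also be spelled out without numpy. This is a plain-Python sketch for illustration only (the name first_missing_enumerate is hypothetical), not a claim about how the numpy version performs:

```python
def first_missing_enumerate(values):
    """Smallest missing positive integer, by enumerating from 1."""
    # Positive, deduplicated, sorted values.
    positives = sorted({v for v in values if v > 0})
    for index, value in enumerate(positives, start=1):
        if index < value:
            # The list has already skipped past index, so it is missing.
            return index
    # Every index matched its value: the list is exactly 1..len(positives).
    return len(positives) + 1


print(first_missing_enumerate([2, 1, 10, 3, 2, 15]))  # 4
```

For [2, 1, 10, 3, 2, 15] the cleaned list is [1, 2, 3, 10, 15]; indices 1 to 3 match their values, and index 4 meets the value 10, so 4 is returned.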
This and the other solutions I tested against (those from adrtam, Paritosh Singh, and VPfB) all appear to be roughly O(n), as expected. (It is, I think, fairly obvious that this is a lower bound, since every element in the list must be examined to find the answer.) Edit: looking at this again, of course the big-O for this approach is at least O(n log(n)), because of the sort. It's just that the sort is so fast, comparatively speaking, that it looked linear overall.
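One way to see why an O(n log n) sort can look linear in a simple benchmark is to time it per element and compare that with how slowly log2(n) grows. The helpers below are hypothetical names introduced for this sketch, and the timings depend entirely on the machine:

```python
import math
import random
import timeit


def seconds_per_element(func, n, repeats=3):
    """Best-of-repeats wall time of func(data), divided by n."""
    data = random.sample(range(1, 2 * n + 1), n)
    best = min(timeit.repeat(lambda: func(data), number=1, repeat=repeats))
    return best / n


def log_factor_growth(n_small, n_large):
    """How much the log2(n) factor grows between two input sizes."""
    return math.log2(n_large) / math.log2(n_small)


# Time the built-in sort at a few sizes; the values printed will vary.
for n in (1_000, 10_000, 100_000):
    print(n, seconds_per_element(sorted, n))

# Going from 1,000 to 100,000 elements multiplies n by 100,
# but the log factor only grows by 5/3.
print(log_factor_growth(1_000, 100_000))
```

A factor of about 1.67 across a hundredfold increase in n is easy to lose in measurement noise, which is consistent with the observation above.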