Find the second largest element in a list containing duplicates

Question

I have a list in which several very large values are set on purpose to mark particular indexes. It looks like this:

a = [1.3, 2.1, 9999., 5., 3.7 ,6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.]

I need to find the second largest value in that list, i.e. the largest value that isn't equal to 9999. (7.4 in the case above), in the most efficient way possible, since my list can get quite big.

The question "Retrieve the two highest item from a list containing 100,000 integers" mentions the heapq.nlargest function, but since my list contains more than one 9999. value, it wouldn't work.

Answer 1

Here is an alternate method:

>>> a = [1.3, 2.1, 9999., 5., 3.7 ,6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.]
>>> sorted(set(a))[-2]
7.4
>>>

And, believe it or not, it is actually quite a lot faster than the accepted solution:

>>> from timeit import timeit
>>> timeit("a=range(10000000);print sorted(set(a))[-2]", number=10)
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
34.327036257401424
>>> # This is NPE's answer
>>> timeit("a=range(10000000);maxa = max(a);print max(val for val in a if val != maxa)", number=10)
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
9999998
53.22811809880869
>>>

The test above runs each statement 10 times on a list of 10,000,000 items. Unless there is a flaw in my test (which I don't think there is), the solution I gave is clearly much faster.
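One caveat worth checking: range(10000000) is already sorted and contains no duplicates, which is the best case for Python's sort (Timsort runs in linear time on presorted input), and the session above is Python 2. A minimal sketch of a comparison on shuffled data with repeated 9999. values, assuming Python 3; the function names, the make_data helper, and the sizes are hypothetical, not taken from the answers. No timings are shown because they depend on the machine and the input.

```python
import random
from timeit import timeit

def second_largest_sorted(values):
    # This answer's approach: de-duplicate, sort, take the second-to-last item.
    return sorted(set(values))[-2]

def second_largest_two_pass(values):
    # NPE's approach: find the maximum, then the maximum of everything else.
    largest = max(values)
    return max(v for v in values if v != largest)

def make_data(n, seed=0):
    # Hypothetical test data: random floats plus repeated 9999. sentinels, shuffled.
    rng = random.Random(seed)
    data = [rng.uniform(0, 100) for _ in range(n)]
    data += [9999.0] * (n // 10)
    rng.shuffle(data)
    return data

if __name__ == "__main__":
    data = make_data(200_000)
    for func in (second_largest_sorted, second_largest_two_pass):
        elapsed = timeit(lambda: func(data), number=3)
        print(func.__name__, elapsed)
```

Both functions should return the same value on any input with at least two distinct elements, so the comparison measures speed only.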

Answer 2

>>> max(val for val in a if val != 9999)
7.4

This has O(n) time complexity.

If the 9999 isn't fixed, you can generalize this by using max(a) instead of 9999:

>>> maxa = max(a)
>>> max(val for val in a if val != maxa)
7.4

(Although I suspect this isn't what you want.)
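The generalized form walks the list twice, once for max(a) and once for the filtered max. A minimal single-pass sketch that tracks the largest and second-largest distinct values together; the function name second_largest is hypothetical. It is still O(n), and a pure-Python loop is not necessarily faster in practice than two calls to the built-in max.

```python
def second_largest(values):
    """Return the second-largest distinct value in a single pass."""
    largest = second = None
    for v in values:
        if largest is None or v > largest:
            # New maximum: the old maximum becomes the runner-up.
            largest, second = v, largest
        elif v != largest and (second is None or v > second):
            # Strictly below the maximum but above the current runner-up.
            second = v
    if second is None:
        raise ValueError("need at least two distinct values")
    return second

a = [1.3, 2.1, 9999., 5., 3.7, 6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.]
print(second_largest(a))  # 7.4
```

Unlike the two-pass version, this also works on an iterator that can only be consumed once.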

Answer 3

a = set([1.3, 2.1, 9999., 5., 3.7, 6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.])
a.remove(max(a))  # drop the largest distinct value
print(max(a))     # the remaining maximum is the second largest: 7.4

This uses set to make sure that we only deal with unique items. We then remove the maximum value, so the next call to max returns the second-largest value.
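The same de-duplication step also makes heapq.nlargest, which the question ruled out, usable again. A short sketch, assuming it is acceptable to build a temporary set of the values; the variable names are illustrative.

```python
import heapq

a = [1.3, 2.1, 9999., 5., 3.7, 6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.]

# De-duplicate first so the repeated 9999. values count only once,
# then take the two largest distinct values, largest first.
top_two = heapq.nlargest(2, set(a))
second = top_two[1]
print(second)  # 7.4
```

If the list has fewer than two distinct values, top_two[1] raises IndexError, so a length check may be needed for real data.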

Answer 4

If you want to use numpy, you can use masked arrays to skip 'bad' values:

import numpy as np
a = np.array([1.3, 2.1, 9999., 5., 3.7 ,6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.])
ma = np.ma.masked_values(a, 9999., copy=False)
ma.max()
7.4

You can easily add exclusions to your mask:

ma = np.ma.masked_values(ma, 7.4, copy=False)
ma.max()
7.0
ma.mask[ma>=5]=True   
ma.max()
3.7
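When the sentinel value isn't known in advance, plain NumPy without masked arrays is another option. A sketch under that assumption, showing two variants: np.unique, which returns the distinct values sorted in ascending order, and boolean indexing against the maximum. The variable names are illustrative.

```python
import numpy as np

a = np.array([1.3, 2.1, 9999., 5., 3.7, 6.6, 9999., 7.4, 9999., 3.5, 7, 1.2, 9999.])

# Variant 1: sorted distinct values, then the second-to-last one.
distinct = np.unique(a)
second = distinct[-2]

# Variant 2: drop every occurrence of the maximum, then take the max again.
largest = a.max()
second_by_mask = a[a != largest].max()

print(second, second_by_mask)  # 7.4 7.4
```

Variant 1 sorts the data, so it is O(n log n); variant 2 makes two linear passes plus a temporary copy.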

Answer 1
5 (accepted) 2013-10-16 20:35:53

Answer 2
3 2013-10-16 20:10:36

Answer 3
2 2013-10-16 20:11:34

Answer 4
0 2013-10-16 23:07:32
