
Fastest way to count list values below a threshold

Is there a list method equivalent to numpy.count_nonzero(lst < t)? When I apply it to a plain list, (lst < t) just returns True instead of a list of booleans. I want to count the list values below some threshold. Which is better: converting to a numpy array, sorting, some kind of list or generator comprehension, or something else?

Sorting is not recommended, as it is O(N log N), whereas all the other solutions are simply O(N).

You can use a generator expression together with a generator-length function (called iterlen here; it is not a built-in, so you have to define it yourself), like this:

n = iterlen(x for x in lst if x < t)

This is better than a list comprehension because you don't need to construct a temporary list just to take its len, which costs both time and memory.
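Since iterlen is not part of the standard library, here is a minimal sketch of what such a helper could look like. The body is an assumption (the answer does not show one), and the sample values for lst and t are made up for illustration.

```python
def iterlen(iterable):
    """Count the items an iterable yields without storing them.

    Hypothetical helper: the name comes from the answer above,
    the body is one possible implementation.
    """
    return sum(1 for _ in iterable)


lst = [5, 1, 8, 3, 9, 2]  # illustrative data
t = 4                     # illustrative threshold

# The generator expression yields only the matching items;
# no intermediate list is built.
n = iterlen(x for x in lst if x < t)

# Equivalent one-liner: each comparison is a bool, and bools sum as 0/1.
n_alt = sum(x < t for x in lst)

print(n, n_alt)  # 3 3
```

Either form walks the list once and keeps only a running total in memory.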

Depending on the details of the problem (list size, element type), converting to a numpy array might prove faster. You should time both approaches and see which one works best in your case.

Of course, the best solution, if possible, would be to represent the data as a numpy array to begin with. If you do that, numpy.count_nonzero(lst < t) is almost certain to be the fastest.
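A rough way to compare the options on your own data is sketched below, assuming numpy is installed. The list contents, the threshold, and the helper name time_call are illustrative. The absolute timings depend on the machine and on the list size, so none are quoted here.

```python
import timeit

import numpy as np

lst = list(range(1000))  # illustrative data
t = 250                  # illustrative threshold

# Converting once up front; in the third timing below this cost is not repeated.
arr = np.asarray(lst)

count_py = sum(1 for x in lst if x < t)
count_np = int(np.count_nonzero(arr < t))


def time_call(fn, number=200):
    """Total seconds for `number` calls of fn (hypothetical helper)."""
    return timeit.timeit(fn, number=number)


t_gen = time_call(lambda: sum(1 for x in lst if x < t))
t_convert = time_call(lambda: np.count_nonzero(np.asarray(lst) < t))
t_array = time_call(lambda: np.count_nonzero(arr < t))

print(f"generator expression:     {t_gen:.4f} s")
print(f"convert to array + count: {t_convert:.4f} s")
print(f"already an array:         {t_array:.4f} s")
```

The middle timing includes the list-to-array conversion on every call, which is the cost you pay when the data only exists as a Python list.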

Or, if you can keep the list sorted to begin with, you can easily implement a count_less function using bisect. The lookup is then O(log N), which would be the fastest option for large lists.
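One possible shape for count_less, assuming the list is already sorted in ascending order. The function name comes from the answer, the body is a sketch, and the sample data is illustrative.

```python
import bisect


def count_less(sorted_lst, t):
    """Return how many items in an ascending-sorted list are strictly below t."""
    # bisect_left returns the index of the first item >= t,
    # which equals the number of items < t.
    return bisect.bisect_left(sorted_lst, t)


sorted_lst = sorted([5, 1, 8, 3, 9, 2, 3])  # [1, 2, 3, 3, 5, 8, 9]

print(count_less(sorted_lst, 4))  # 4
print(count_less(sorted_lst, 3))  # 2
```

Using bisect.bisect_right instead would count the items less than or equal to t.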

c will be the count of all items in the list (lst) that are below the value t:

c = len([i for i in lst if i < t])

You can use the cardinality package for this:

Usage:

>>> import cardinality
>>> cardinality.count(i for i in range(500) if i >= 499)
1

The actual count() implementation is as follows:

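import collections

def count(iterable):
    # Sized containers already know their length.
    if hasattr(iterable, '__len__'):
        return len(iterable)

    # Otherwise consume the iterable, keeping only the last (index, item) pair.
    d = collections.deque(enumerate(iterable, 1), maxlen=1)
    return d[0][0] if d else 0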
