Python: any() unexpected performance

I am comparing the performance of the any() built-in function with the actual implementation the docs suggest:

I am looking for an element greater than 0 in the following list:

lst = [0 for _ in range(1000000)] + [1]

This is the supposedly equivalent function:

def gt_0(lst):
    for elm in lst:
        if elm > 0:
            return True
    return False

And these are the results of the performance tests:

>> %timeit any(elm > 0 for elm in lst)
>> 10 loops, best of 3: 35.9 ms per loop

>> %timeit gt_0(lst)
>> 100 loops, best of 3: 16 ms per loop

I would expect both of them to have exactly the same performance, yet any() is two times slower. Why?

The reason is that you've passed a generator expression to the any() function. Python has to turn the generator expression into a generator function, and any() then has to call its __next__() method once per item to produce each value, which is why it runs slower. Your manually defined function, by contrast, receives the whole list, whose items already exist.
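To make the per-item next() calls concrete, here is a minimal sketch of how any() consumes its argument, written in pure Python. The name any_via_next is hypothetical and used only for illustration; the real any() is implemented in C, so this shows the protocol, not the actual source.

```python
def any_via_next(iterable):
    """Behave like any(), but spell out the iterator protocol step by step."""
    iterator = iter(iterable)
    while True:
        try:
            # For a generator, each next() call resumes the generator's frame,
            # runs it up to the next yield, and suspends it again.
            value = next(iterator)
        except StopIteration:
            return False
        if value:
            return True


lst = [0] * 1000 + [1]
print(any_via_next(elm > 0 for elm in lst))
print(any_via_next(elm > 0 for elm in [0, 0, 0]))
```

With a list argument the same loop runs, but next() only has to fetch an item that already exists instead of resuming a generator.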

You can see the difference better by using a list comprehension rather than a generator expression:

In [4]: %timeit any(elm > 0 for elm in lst)
10 loops, best of 3: 66.8 ms per loop

In [6]: test_list = [elm > 0 for elm in lst]

In [7]: %timeit any(test_list)
100 loops, best of 3: 4.93 ms per loop

Another bottleneck in your code, one that costs more than the extra next() calls, is the way you do the comparison. As mentioned in a comment, a better equivalent of your manual function is:

any(True for elm in lst if elm > 0)

In this case the comparison happens inside the generator expression, and it runs in almost the same time as your manually defined function (the slight remaining difference is, I guess, due to the generator). For a deeper understanding of the underlying reasons, read Ashwini's answer.

Surely a loop over a generator expression is slower than a loop over a list. But in this case the iteration within the generator is basically a loop over the list itself, so the next() calls on the generator basically delegate to the list iterator's next() method.

For example, in this case there is no 2x performance difference.

>>> lst = list(range(10**5))

>>> %%timeit
... sum(x for x in lst)
...
100 loops, best of 3: 6.39 ms per loop

>>> %%timeit
... c = 0
... for x in lst: c += x
...

100 loops, best of 3: 6.69 ms per loop

First, let's check the bytecode of both approaches:

def gt_0(lst):
    for elm in lst:
        if elm > 0:
            return True
    return False


def any_with_ge(lst):
    return any(elm > 0 for elm in lst)

Bytecodes:

>>> dis.dis(gt_0)
 10           0 SETUP_LOOP              30 (to 33)
              3 LOAD_FAST                0 (lst)
              6 GET_ITER
        >>    7 FOR_ITER                22 (to 32)
             10 STORE_FAST               1 (elm)

 11          13 LOAD_FAST                1 (elm)
             16 LOAD_CONST               1 (0)
             19 COMPARE_OP               4 (>)
             22 POP_JUMP_IF_FALSE        7

 12          25 LOAD_GLOBAL              0 (True)
             28 RETURN_VALUE
             29 JUMP_ABSOLUTE            7
        >>   32 POP_BLOCK

 13     >>   33 LOAD_GLOBAL              1 (False)
             36 RETURN_VALUE
>>> dis.dis(any_with_ge.func_code.co_consts[1])
 17           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                17 (to 23)
              6 STORE_FAST               1 (elm)
              9 LOAD_FAST                1 (elm)
             12 LOAD_CONST               0 (0)
             15 COMPARE_OP               4 (>)
             18 YIELD_VALUE
             19 POP_TOP
             20 JUMP_ABSOLUTE            3
        >>   23 LOAD_CONST               1 (None)
             26 RETURN_VALUE

As you can see, there's no jump condition in the any() version: the generator simply yields the result of the > comparison, and any() then checks its truth value again using PyObject_IsTrue. The gt_0 version, on the other hand, checks the truth value of the condition once and returns True or False based on that.
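The check performed by any() itself can be made visible with a small sketch. Flag is a hypothetical class introduced only for this illustration; it counts how often something asks for its truth value. The generator expression below never tests the values, so every counted check comes from any().

```python
class Flag:
    """Hypothetical wrapper that counts how often its truth value is requested."""

    checks = 0

    def __init__(self, value):
        self.value = value

    def __bool__(self):
        Flag.checks += 1
        return self.value

    __nonzero__ = __bool__  # Python 2 spelling of the same hook


values = [Flag(False)] * 5 + [Flag(True)]

Flag.checks = 0
result = any(v for v in values)  # the generator only yields, it tests nothing

print(result)       # True
print(Flag.checks)  # 6: one truth check by any() per value it received
```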

Now let's add another any()-based version that has an if-condition, like the one in the for-loop.

def any_with_ge_and_condition(lst):
    return any(True for elm in lst if elm > 0)

Bytecode:

>>> dis.dis(any_with_ge_and_condition.func_code.co_consts[1])
 21           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                23 (to 29)
              6 STORE_FAST               1 (elm)
              9 LOAD_FAST                1 (elm)
             12 LOAD_CONST               0 (0)
             15 COMPARE_OP               4 (>)
             18 POP_JUMP_IF_FALSE        3
             21 LOAD_GLOBAL              0 (True)
             24 YIELD_VALUE
             25 POP_TOP
             26 JUMP_ABSOLUTE            3
        >>   29 LOAD_CONST               1 (None)
             32 RETURN_VALUE
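The listings above come from Python 2 (func_code, co_consts[1]). As a hedged sketch for Python 3, the nested code object can be located by type and name instead of by a fixed index. The helper genexpr_code is a hypothetical name, and the exact opcodes printed will differ between interpreter versions.

```python
import dis
import types


def any_with_ge(lst):
    return any(elm > 0 for elm in lst)


def any_with_ge_and_condition(lst):
    return any(True for elm in lst if elm > 0)


def genexpr_code(func):
    """Return the code object of the first generator expression inside func."""
    for const in func.__code__.co_consts:
        if isinstance(const, types.CodeType) and const.co_name == "<genexpr>":
            return const
    raise ValueError("no generator expression found in %s" % func.__name__)


# Print the bytecode of each generator expression for side-by-side reading.
for func in (any_with_ge, any_with_ge_and_condition):
    print(func.__name__)
    dis.dis(genexpr_code(func))
```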

Now we have reduced the work done by any() by adding the condition (check the last section for more details): any() only receives a value, and therefore only performs its own truth check, once, when the condition is True. Otherwise the generator simply skips to the next item.
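A rough way to see this is to count how many values actually reach any() in each form. The wrapper counted is a hypothetical helper written for this sketch; it adds overhead of its own, so it is meant for counting only, not for timing.

```python
def counted(iterable, counter):
    """Yield items unchanged while counting how many were passed on."""
    for item in iterable:
        counter["seen"] += 1
        yield item


lst = [0] * 1000 + [1]

# Plain generator expression: every comparison result is handed to any().
plain = {"seen": 0}
any(counted((elm > 0 for elm in lst), plain))

# Generator expression with a condition: only matches are handed to any().
filtered = {"seen": 0}
any(counted((True for elm in lst if elm > 0), filtered))

print(plain["seen"])     # 1001
print(filtered["seen"])  # 1
```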


Now let's compare the timings of these three:

>>> %timeit gt_0(lst)
10 loops, best of 3: 26.1 ms per loop
>>> %timeit any_with_ge(lst)
10 loops, best of 3: 57.7 ms per loop
>>> %timeit any_with_ge_and_condition(lst)
10 loops, best of 3: 26.8 ms per loop

Let's modify gt_0 to include two checks, as in the simple any() version, and check its timing.

from operator import truth
# This calls `PyObject_IsTrue` internally
# https://github.com/python/cpython/blob/master/Modules/_operator.c#L30


def gt_0_truth(lst, truth=truth): # truth=truth to prevent global lookups
    for elm in lst:
        condition = elm > 0
        if truth(condition):
            return True
    return False

Timing:

>>> %timeit gt_0_truth(lst)
10 loops, best of 3: 56.6 ms per loop

Now, let's see what happens when we check the truth value of an item twice using operator.truth.

>>> %%timeit t=truth
... [t(i) for i in xrange(10**5)]
...
100 loops, best of 3: 5.45 ms per loop
>>> %%timeit t=truth
... [t(t(i)) for i in xrange(10**5)]
...
100 loops, best of 3: 9.06 ms per loop
>>> %%timeit t=truth
... [t(i) for i in xrange(10**6)]
...
10 loops, best of 3: 58.8 ms per loop
>>> %%timeit t=truth
... [t(t(i)) for i in xrange(10**6)]
...
10 loops, best of 3: 87.8 ms per loop

That's quite a difference, even though we are simply calling truth() (i.e. PyObject_IsTrue) on an object that is already a boolean. I guess that goes some way towards explaining the slowness of the basic any() version.


You may argue that the if condition in the any() version will also result in two truthiness checks, but that's not the case when the comparison operation returns Py_True or Py_False. POP_JUMP_IF_FALSE simply jumps to the next opcode, and no call to PyObject_IsTrue is made.

The main chunk of the performance difference boils down to the for loops.

In your any() call, there are two for loops: for elm in lst, and the for loop carried out by any() itself. So any() iterates over a generator that looks like False, False, False, ..., True.

In your gt_0, there is only one for loop.

If you change it to check whether the element is truthy at all, so that they both only loop once:

def _any(lst):
    for elm in lst:
        if elm:
            return True
    return False

_any(lst)
any(lst)

There is a clear winner:

$ python2 -m timeit "from test import lst, _any" "any(lst)"
100 loops, best of 3: 5.68 msec per loop

$ python2 -m timeit "from test import lst, _any" "_any(lst)"
10 loops, best of 3: 17 msec per loop

from timeit import timeit

# Same list for every variant; each statement is executed 10 times.
setup = 'lst = [0 for _ in range(1000000)] + [1]'
print(timeit('any(True for elm in lst if elm > 0)', setup=setup, number=10))
print(timeit('any([elm > 0 for elm in lst])', setup=setup, number=10))
print(timeit('any(elm > 0 for elm in lst)', setup=setup, number=10))

produces:

2.1382904349993623
3.1172365920028824
4.580027656000311

As explained by Kasramvd, the last version is slowest because it uses a generator expression; a list comprehension is a bit faster, but, surprisingly, a generator expression with a conditional clause, as proposed by Ashwini Chaudhary, is faster still.
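For anyone who wants to reproduce the comparison on their own interpreter, here is a small sketch that times all the variants discussed in this thread in one run. It assumes Python 3 and uses a shorter list than the original (100000 zeros) to keep the run quick; the labels in variants are made up for this example, and the numbers will vary by machine and version.

```python
from timeit import timeit

lst = [0] * 100000 + [1]  # shorter than the original list, to keep the run quick


def gt_0(lst):
    for elm in lst:
        if elm > 0:
            return True
    return False


# Each entry maps a descriptive label to a zero-argument callable.
variants = {
    "for loop (gt_0)": lambda: gt_0(lst),
    "genexpr with condition": lambda: any(True for elm in lst if elm > 0),
    "list comprehension": lambda: any([elm > 0 for elm in lst]),
    "plain genexpr": lambda: any(elm > 0 for elm in lst),
}

for name, func in variants.items():
    seconds = timeit(func, number=10)
    print("%-24s %.3f s for 10 runs" % (name, seconds))
```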
