Comparing list comprehensions and explicit loops (3 array generators faster than 1 for loop)

I was doing homework and accidentally found a strange inconsistency in the speed of an algorithm. Here are two versions of the same function with one difference: the first version uses three list comprehensions to filter an array, and the second version uses a single for loop with three if statements to do the same filtering.

So, here is the code of the 1st version:

def kth_order_statistic(array, k):
    pivot = (array[0] + array[len(array) - 1]) // 2
    l = [x for x in array if x < pivot]
    m = [x for x in array if x == pivot]
    r = [x for x in array if x > pivot]
    if k <= len(l):
        return kth_order_statistic(l, k)
    elif k > len(l) + len(m):
        return kth_order_statistic(r, k - len(l) - len(m))
    else:
        return m[0]

And here is the code of the 2nd version:

def kth_order_statistic2(array, k):
    pivot = (array[0] + array[len(array) - 1]) // 2
    l = []
    m = []
    r = []
    for x in array:
        if x < pivot:
            l.append(x)
        elif x > pivot:
            r.append(x)
        else:
            m.append(x)

    if k <= len(l):
        return kth_order_statistic2(l, k)
    elif k > len(l) + len(m):
        return kth_order_statistic2(r, k - len(l) - len(m))
    else:
        return m[0]

IPython output for the 1st version:

In [4]: %%timeit
   ...: A = range(100000)
   ...: shuffle(A)
   ...: k = randint(1, len(A)-1)
   ...: kth_order_statistic(A, k)
   ...:
10 loops, best of 3: 120 ms per loop

And for the 2nd version:

In [5]: %%timeit
   ...: A = range(100000)
   ...: shuffle(A)
   ...: k = randint(1, len(A)-1)
   ...: kth_order_statistic2(A, k)
   ...:
10 loops, best of 3: 169 ms per loop

So why is the first version faster than the second? I also made a third version which uses the filter() function instead of list comprehensions, and it was slower than the second version (218 ms per loop).

A bare for loop (one that does not build a list) is faster than a list comprehension, by roughly 1.5 times in the measurements below:

Using a list comprehension: 58 usec

moin@moin-pc:~$ python -m timeit "[i for i in range(1000)]"
10000 loops, best of 3: 58 usec per loop

Using a for loop: 37.1 usec

moin@moin-pc:~$ python -m timeit "for i in range(1000): i"
10000 loops, best of 3: 37.1 usec per loop

But in your case, the for loop takes more time than the list comprehensions not because the for loop itself is slow, but because of the .append() calls you make inside it.

With append() in the for loop: 114 usec

moin@moin-pc:~$ python -m timeit "my_list = []" "for i in range(1000): my_list.append(i)"
10000 loops, best of 3: 114 usec per loop

This clearly shows that the .append() calls add roughly twice the time taken by the bare for loop.

However, when list.append is stored in a separate variable: 69.3 usec

moin@moin-pc:~$ python -m timeit "my_list = []; append = my_list.append" "for i in range(1000): append(i)"
10000 loops, best of 3: 69.3 usec per loop

This is a big improvement over the previous case, and the result is quite comparable to that of the list comprehension. That means that, instead of calling my_list.append() each time, performance can be improved by storing a reference to the method in another variable, i.e. append_func = my_list.append, and making the call through that variable: append_func(i).

This also shows that calling a bound method stored in a variable is faster than looking the method up on the object for every call.
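As a minimal sketch of the comparison above, the three variants can be timed from a script with the standard timeit module instead of the command line. The function names here are made up for illustration, and the numbers will depend on the interpreter; on recent CPython releases the gap between the plain and the cached append call may be much smaller than in the measurements quoted above.

```python
import timeit


def with_attribute_lookup(n=1000):
    my_list = []
    for i in range(n):
        my_list.append(i)  # .append is looked up on every iteration
    return my_list


def with_cached_method(n=1000):
    my_list = []
    append = my_list.append  # bound method is looked up once
    for i in range(n):
        append(i)
    return my_list


def with_comprehension(n=1000):
    return [i for i in range(n)]


if __name__ == "__main__":
    calls = 500
    for func in (with_attribute_lookup, with_cached_method, with_comprehension):
        # Take the best of several repeats to reduce noise.
        best = min(timeit.repeat(func, number=calls, repeat=5))
        print("%-24s %.1f usec per call" % (func.__name__, best / calls * 1e6))
```

All three functions build the same list, so only the way elements are appended differs between them.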

Thank you, Stefan, for pointing out the last case.

Let's define the functions we need to answer the question and time them:

In [18]: def iter():
    l = [x for x in range(100) if x > 10]
   ....:

In [19]: %timeit iter()
100000 loops, best of 3: 7.92 µs per loop

In [20]: def loop():
    l = []
    for x in range(100):
        if x > 10:
            l.append(x)
   ....:

In [21]: %timeit loop()
10000 loops, best of 3: 20 µs per loop

In [22]: def loop_fast():
    l = []
    for x in range(100):
        if x > 10:
            pass
   ....:

In [23]: %timeit loop_fast()
100000 loops, best of 3: 4.69 µs per loop

We can see that the for loop without the append call is at least as fast as the list comprehension (4.69 µs versus 7.92 µs). In fact, if we look at the bytecode, we can see that for the list comprehension Python is able to use a dedicated bytecode instruction called LIST_APPEND instead of the following sequence:

  • Load the list: 40 LOAD_FAST
  • Look up the append attribute: 43 LOAD_ATTR
  • Call the loaded method: 49 CALL_FUNCTION
  • Discard the call's return value: 52 POP_TOP

As you can see from the output below, these bytecode instructions are absent from both the list comprehension and the "loop_fast" function. Comparing the timings of the three functions makes it clear that these instructions are responsible for the difference between the three methods.

In [27]: dis.dis(iter)
  2          0 BUILD_LIST             0
             3 LOAD_GLOBAL            0 (range)
             6 LOAD_CONST             1 (1)
             9 LOAD_CONST             2 (100)
            12 CALL_FUNCTION          2
            15 GET_ITER
       >>   16 FOR_ITER              24 (to 43)
            19 STORE_FAST             0 (x)
            22 LOAD_FAST              0 (x)
            25 LOAD_CONST             2 (100)
            28 COMPARE_OP             4 (>)
            31 POP_JUMP_IF_FALSE     16
            34 LOAD_FAST              0 (x)
            37 LIST_APPEND            2
            40 JUMP_ABSOLUTE         16
       >>   43 STORE_FAST             1 (l)
            46 LOAD_CONST             0 (None)
            49 RETURN_VALUE

In [28]: dis.dis(loop)
  2          0 BUILD_LIST             0
             3 STORE_FAST             0 (l)

  3          6 SETUP_LOOP            51 (to 60)
             9 LOAD_GLOBAL            0 (range)
            12 LOAD_CONST             1 (1)
            15 LOAD_CONST             2 (100)
            18 CALL_FUNCTION          2
            21 GET_ITER
       >>   22 FOR_ITER              34 (to 59)
            25 STORE_FAST             1 (x)

  4         28 LOAD_FAST              1 (x)
            31 LOAD_CONST             3 (10)
            34 COMPARE_OP             4 (>)
            37 POP_JUMP_IF_FALSE     22

  5         40 LOAD_FAST              0 (l)
            43 LOAD_ATTR              1 (append)
            46 LOAD_FAST              1 (x)
            49 CALL_FUNCTION          1
            52 POP_TOP
            53 JUMP_ABSOLUTE         22
            56 JUMP_ABSOLUTE         22
       >>   59 POP_BLOCK
       >>   60 LOAD_CONST             0 (None)
            63 RETURN_VALUE

In [29]: dis.dis(loop_fast)
  2          0 BUILD_LIST             0
             3 STORE_FAST             0 (l)

  3          6 SETUP_LOOP            38 (to 47)
             9 LOAD_GLOBAL            0 (range)
            12 LOAD_CONST             1 (1)
            15 LOAD_CONST             2 (100)
            18 CALL_FUNCTION          2
            21 GET_ITER
       >>   22 FOR_ITER              21 (to 46)
            25 STORE_FAST             1 (x)

  4         28 LOAD_FAST              1 (x)
            31 LOAD_CONST             3 (10)
            34 COMPARE_OP             4 (>)
            37 POP_JUMP_IF_FALSE     22

  5         40 JUMP_ABSOLUTE         22
            43 JUMP_ABSOLUTE         22
       >>   46 POP_BLOCK
       >>   47 LOAD_CONST             0 (None)
            50 RETURN_VALUE
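The listings above come from a Python 2 interpreter, and opcode names and offsets differ between versions. As a rough sketch for checking the same point on a current interpreter, the snippet below collects opcode names with dis.get_instructions and tests whether LIST_APPEND appears. The helper opnames and the function names comp and loop are made up for this example; the helper walks nested code objects because, on some Python 3 versions, the comprehension is compiled into a separate code object.

```python
import dis
import types


def comp():
    return [x for x in range(100) if x > 10]


def loop():
    result = []
    for x in range(100):
        if x > 10:
            result.append(x)
    return result


def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names


comp_ops = opnames(comp.__code__)
loop_ops = opnames(loop.__code__)

# The comprehension should use LIST_APPEND; the explicit loop should instead
# go through an attribute/method lookup followed by a call.
print("LIST_APPEND in comprehension:", "LIST_APPEND" in comp_ops)
print("LIST_APPEND in explicit loop:", "LIST_APPEND" in loop_ops)
print("opcodes only in the loop:", sorted(loop_ops - comp_ops))
```

The last line prints the opcodes that appear only in the explicit loop, which is where the lookup and call instructions for append show up under whatever names the running interpreter uses.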

Let's dispel that doubt: the second version is slightly faster. List comprehensions are faster than loops with append, yet the single loop saves two passes over the array and as many conditional tests per element.

def kth_order_statistic1(array,k):
    pivot = (array[0] + array[len(array) - 1]) // 2
    l = [x for x in array if x < pivot]
    m = [x for x in array if x == pivot]
    r = [x for x in array if x > pivot]

    if k <= len(l):
        return kth_order_statistic1(l, k)
    elif k > len(l) + len(m):
        return kth_order_statistic1(r, k - len(l) - len(m))
    else:
        return m[0]


def kth_order_statistic2(array,k):
    pivot = (array[0] + array[len(array) - 1]) // 2
    l = []
    m = []
    r = []
    for x in array:
        if x < pivot:
            l.append(x)
        elif x > pivot:
            r.append(x)
        else:
            m.append(x)

    if k <= len(l):
        return kth_order_statistic2(l, k)
    elif k > len(l) + len(m):
        return kth_order_statistic2(r, k - len(l) - len(m))
    else:
        return m[0]

def kth_order_statistic3(array,k):
    pivot = (array[0] + array[len(array) - 1]) // 2
    l = []
    m = []
    r = []

    for x in array:
        if x < pivot:
            l.append(x)
    for x in array:
        if x == pivot:
            m.append(x)
    for x in array:
        if x > pivot:
            r.append(x)

    if k <= len(l):
        return kth_order_statistic3(l, k)
    elif k > len(l) + len(m):
        return kth_order_statistic3(r, k - len(l) - len(m))
    else:
        return m[0]

import time
import random
if __name__ == '__main__':

    A = list(range(100000))  # list() so that shuffle also works on Python 3
    random.shuffle(A)
    k = random.randint(1, len(A)-1)

    start_time = time.time()
    for x in range(1000) :
        kth_order_statistic1(A,k)
    print("--- %s seconds ---" % (time.time() - start_time))

    start_time = time.time()
    for x in range(1000) :
        kth_order_statistic2(A,k)
    print("--- %s seconds ---" % (time.time() - start_time))

    start_time = time.time()
    for x in range(1000) :
        kth_order_statistic3(A,k)
    print("--- %s seconds ---" % (time.time() - start_time))


python:
--- 25.8894710541 seconds ---
--- 24.073086977 seconds ---
--- 32.9823839664 seconds ---

ipython:
--- 25.7450709343 seconds ---
--- 22.7140650749 seconds ---
--- 35.2958850861 seconds ---

The timing may vary with the random draw, but the differences between the three are pretty much the same.

The algorithmic structure differs, and the conditional structure is what accounts for the gap: in the single loop, the tests for appending to r and m are skipped whenever an earlier test succeeds. A stricter comparison between a for loop with append and list comprehensions would use the following non-optimal version:

for x in array:
    if x < pivot:
        l.append(x)
for x in array:
    if x == pivot:
        m.append(x)
for x in array:
    if x > pivot:
        r.append(x)
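To isolate the partitioning step from the recursion, the three strategies can also be timed on their own. The following is only a sketch under some assumptions: the function names are invented for this example, the input is a fixed shuffled list of 10000 integers, and the relative timings may differ from the figures above depending on the Python version.

```python
import random
import timeit


def partition_comprehensions(array, pivot):
    # Three passes, each using the LIST_APPEND fast path.
    l = [x for x in array if x < pivot]
    m = [x for x in array if x == pivot]
    r = [x for x in array if x > pivot]
    return l, m, r


def partition_single_loop(array, pivot):
    # One pass; later tests are skipped once an earlier one succeeds.
    l, m, r = [], [], []
    for x in array:
        if x < pivot:
            l.append(x)
        elif x > pivot:
            r.append(x)
        else:
            m.append(x)
    return l, m, r


def partition_three_loops(array, pivot):
    # Three passes with explicit append calls (the non-optimal variant).
    l, m, r = [], [], []
    for x in array:
        if x < pivot:
            l.append(x)
    for x in array:
        if x == pivot:
            m.append(x)
    for x in array:
        if x > pivot:
            r.append(x)
    return l, m, r


data = list(range(10000))
random.Random(0).shuffle(data)  # fixed seed so every run sees the same input
pivot = (data[0] + data[-1]) // 2

if __name__ == "__main__":
    calls = 20
    for func in (partition_comprehensions, partition_single_loop,
                 partition_three_loops):
        best = min(timeit.repeat(lambda: func(data, pivot),
                                 number=calls, repeat=3))
        print("%-26s %.3f ms per call" % (func.__name__, best / calls * 1e3))
```

Because every function receives the same input and pivot, any difference in the output comes from how the three lists are built rather than from the random draw.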
