Comparing list comprehensions and explicit loops (3 array generators faster than 1 for loop)

Question

While doing homework I accidentally found a strange inconsistency in the speed of an algorithm. Here are two versions of the same function with one difference: the first version uses three list comprehensions to filter the array, and the second version uses one for loop with three branches to do the same filtering.

Here is the code of the first version:

def kth_order_statistic(array, k):
    pivot = (array[0] + array[len(array) - 1]) // 2
    l = [x for x in array if x < pivot]
    m = [x for x in array if x == pivot]
    r = [x for x in array if x > pivot]
    if k <= len(l):
        return kth_order_statistic(l, k)
    elif k > len(l) + len(m):
        return kth_order_statistic(r, k - len(l) - len(m))
    else:
        return m[0]

And here is the code of the second version:

def kth_order_statistic2(array, k):
    pivot = (array[0] + array[len(array) - 1]) // 2
    l = []
    m = []
    r = []
    for x in array:
        if x < pivot:
            l.append(x)
        elif x > pivot:
            r.append(x)
        else:
            m.append(x)

    if k <= len(l):
        return kth_order_statistic2(l, k)
    elif k > len(l) + len(m):
        return kth_order_statistic2(r, k - len(l) - len(m))
    else:
        return m[0]

IPython output for the first version:

In [4]: %%timeit
   ...: A = range(100000)
   ...: shuffle(A)
   ...: k = randint(1, len(A)-1)
   ...: kth_order_statistic(A, k)
   ...:
10 loops, best of 3: 120 ms per loop

And for the second version:

In [5]: %%timeit
   ...: A = range(100000)
   ...: shuffle(A)
   ...: k = randint(1, len(A)-1)
   ...: kth_order_statistic2(A, k)
   ...:
10 loops, best of 3: 169 ms per loop

So why is the first version faster than the second? I also made a third version which uses the filter() function instead of list comprehensions, and it was slower than the second version (218 ms per loop).
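The filter() variant is not shown in the question. As a minimal sketch of what it might look like, assuming lambda predicates and the hypothetical name kth_order_statistic_filter (on Python 3, filter() returns an iterator, so each result is wrapped in list()):

```python
import random


def kth_order_statistic_filter(array, k):
    # Hypothetical reconstruction; the asker's actual third version is not shown.
    pivot = (array[0] + array[len(array) - 1]) // 2
    # Each filter() call invokes a Python-level lambda once per element.
    l = list(filter(lambda x: x < pivot, array))
    m = list(filter(lambda x: x == pivot, array))
    r = list(filter(lambda x: x > pivot, array))
    if k <= len(l):
        return kth_order_statistic_filter(l, k)
    elif k > len(l) + len(m):
        return kth_order_statistic_filter(r, k - len(l) - len(m))
    else:
        return m[0]


A = list(range(1000))
random.shuffle(A)
k = random.randint(1, len(A))
# k is 1-based, so the result should match the sorted list at index k - 1.
print(kth_order_statistic_filter(A, k) == sorted(A)[k - 1])
```

The extra function call per element is a plausible reason for this variant being the slowest, but the sketch does not measure that.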

Answer 1

A plain for loop is faster than a list comprehension when it does not have to build a list; it is almost 2 times faster. Check the results below:

Using a list comprehension: 58 usec

moin@moin-pc:~$ python -m timeit "[i for i in range(1000)]"
10000 loops, best of 3: 58 usec per loop

Using a for loop: 37.1 usec

moin@moin-pc:~$ python -m timeit "for i in range(1000): i"
10000 loops, best of 3: 37.1 usec per loop

But in your case, the for loop takes more time than the list comprehension, not because the for loop itself is slow, but because of the .append() call you are using inside it.

With append() in the for loop: 114 usec

moin@moin-pc:~$ python -m timeit "my_list = []" "for i in range(1000): my_list.append(i)"
10000 loops, best of 3: 114 usec per loop

This clearly shows that .append() accounts for about twice the time taken by the bare for loop.

However, on storing "list.append" in a separate variable: 69.3 usec

moin@moin-pc:~$ python -m timeit "my_list = []; append = my_list.append" "for i in range(1000): append(i)"
10000 loops, best of 3: 69.3 usec per loop

This is a huge improvement over the previous case, and the result is quite comparable to that of the list comprehension. That means that, instead of looking up my_list.append each time, performance can be improved by storing a reference to the method in another variable, i.e. append_func = my_list.append, and calling it through that variable as append_func(i).

This also shows that calling a bound method stored in a variable is faster than looking the method up on the object at every call.
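The same three cases can be reproduced from a script instead of the command line. This is only a sketch using the standard timeit module; the function names are made up for illustration, and the absolute numbers (and even the size of the gap) will differ between interpreters and machines:

```python
import timeit


def with_comprehension(n=1000):
    return [i for i in range(n)]


def with_append(n=1000):
    my_list = []
    for i in range(n):
        my_list.append(i)  # attribute lookup on every iteration
    return my_list


def with_cached_append(n=1000):
    my_list = []
    append = my_list.append  # look up the bound method once
    for i in range(n):
        append(i)
    return my_list


if __name__ == "__main__":
    for func in (with_comprehension, with_append, with_cached_append):
        seconds = timeit.timeit(func, number=2000)
        print("%-20s %.3f s for 2000 calls" % (func.__name__, seconds))
```

All three functions build the same list, so any difference in the printed times comes from how the list is built, not from what it contains.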

Thank you, Stefan, for bringing the last case to my attention.

Answer 2

Let's define the functions we will need to answer the question and time them:

In [18]: def iter():
    l = [x for x in range(100) if x > 10]
   ....:

In [19]: %timeit iter()
100000 loops, best of 3: 7.92 µs per loop

In [20]: def loop():
    l = []
    for x in range(100):
        if x > 10:
            l.append(x)
   ....:

In [21]: %timeit loop()
10000 loops, best of 3: 20 µs per loop

In [22]: def loop_fast():
    l = []
    for x in range(100):
        if x > 10:
            pass
   ....:

In [23]: %timeit loop_fast()
100000 loops, best of 3: 4.69 µs per loop

We can see that the for loop without the append call is at least as fast as the list comprehension. In fact, if we look at the bytecode, we can see that for the list comprehension Python is able to use a built-in bytecode instruction called LIST_APPEND instead of:

Load the list: 40 LOAD_FAST
Load the attribute: 43 LOAD_ATTR
Load the argument: 46 LOAD_FAST
Call the loaded function: 49 CALL_FUNCTION
Discard the return value: 52 POP_TOP

As you can see from the output below, these instructions are absent from both the list comprehension and the "loop_fast" function. Comparing the timings of the three functions makes it clear that they are responsible for the difference between the three methods.

In [27]: dis.dis(iter)
  2          0 BUILD_LIST             0
             3 LOAD_GLOBAL            0 (range)
             6 LOAD_CONST             1 (1)
             9 LOAD_CONST             2 (100)
            12 CALL_FUNCTION          2
            15 GET_ITER
       >>   16 FOR_ITER              24 (to 43)
            19 STORE_FAST             0 (x)
            22 LOAD_FAST              0 (x)
            25 LOAD_CONST             2 (100)
            28 COMPARE_OP             4 (>)
            31 POP_JUMP_IF_FALSE     16
            34 LOAD_FAST              0 (x)
            37 LIST_APPEND            2
            40 JUMP_ABSOLUTE         16
       >>   43 STORE_FAST             1 (l)
            46 LOAD_CONST             0 (None)
            49 RETURN_VALUE

In [28]: dis.dis(loop)
  2          0 BUILD_LIST             0
             3 STORE_FAST             0 (l)

  3          6 SETUP_LOOP            51 (to 60)
             9 LOAD_GLOBAL            0 (range)
            12 LOAD_CONST             1 (1)
            15 LOAD_CONST             2 (100)
            18 CALL_FUNCTION          2
            21 GET_ITER
       >>   22 FOR_ITER              34 (to 59)
            25 STORE_FAST             1 (x)

  4         28 LOAD_FAST              1 (x)
            31 LOAD_CONST             3 (10)
            34 COMPARE_OP             4 (>)
            37 POP_JUMP_IF_FALSE     22

  5         40 LOAD_FAST              0 (l)
            43 LOAD_ATTR              1 (append)
            46 LOAD_FAST              1 (x)
            49 CALL_FUNCTION          1
            52 POP_TOP
            53 JUMP_ABSOLUTE         22
            56 JUMP_ABSOLUTE         22
       >>   59 POP_BLOCK
       >>   60 LOAD_CONST             0 (None)
            63 RETURN_VALUE

In [29]: dis.dis(loop_fast)
  2          0 BUILD_LIST             0
             3 STORE_FAST             0 (l)

  3          6 SETUP_LOOP            38 (to 47)
             9 LOAD_GLOBAL            0 (range)
            12 LOAD_CONST             1 (1)
            15 LOAD_CONST             2 (100)
            18 CALL_FUNCTION          2
            21 GET_ITER
       >>   22 FOR_ITER              21 (to 46)
            25 STORE_FAST             1 (x)

  4         28 LOAD_FAST              1 (x)
            31 LOAD_CONST             3 (10)
            34 COMPARE_OP             4 (>)
            37 POP_JUMP_IF_FALSE     22

  5         40 JUMP_ABSOLUTE         22
            43 JUMP_ABSOLUTE         22
       >>   46 POP_BLOCK
       >>   47 LOAD_CONST             0 (None)
            50 RETURN_VALUE
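Opcode names and offsets differ between CPython versions, so the listings above will not match a newer interpreter exactly. As a rough sketch, assuming CPython and using made-up function names, the presence of LIST_APPEND can be checked programmatically. The helper walks nested code objects because some Python 3 versions compile a comprehension into a separate code object:

```python
import dis
import types


def comprehension():
    return [x for x in range(100) if x > 10]


def loop():
    result = []
    for x in range(100):
        if x > 10:
            result.append(x)
    return result


def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names


comp_ops = opnames(comprehension.__code__)
loop_ops = opnames(loop.__code__)
print("comprehension uses LIST_APPEND:", "LIST_APPEND" in comp_ops)
print("explicit loop uses LIST_APPEND:", "LIST_APPEND" in loop_ops)
```

This only inspects which instructions are emitted; it says nothing about timing by itself.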

Answer 3

Let's dispel that doubt: the second version is slightly faster. List comprehensions are faster per element, but the single loop saves two extra passes over the array and as many redundant comparisons.

def kth_order_statistic1(array,k):
    pivot = (array[0] + array[len(array) - 1]) // 2
    l = [x for x in array if x < pivot]
    m = [x for x in array if x == pivot]
    r = [x for x in array if x > pivot]

    if k <= len(l):
        return kth_order_statistic1(l, k)
    elif k > len(l) + len(m):
        return kth_order_statistic1(r, k - len(l) - len(m))
    else:
        return m[0]


def kth_order_statistic2(array,k):
    pivot = (array[0] + array[len(array) - 1]) // 2
    l = []
    m = []
    r = []
    for x in array:
        if x < pivot:
            l.append(x)
        elif x > pivot:
            r.append(x)
        else:
            m.append(x)

    if k <= len(l):
        return kth_order_statistic2(l, k)
    elif k > len(l) + len(m):
        return kth_order_statistic2(r, k - len(l) - len(m))
    else:
        return m[0]

def kth_order_statistic3(array,k):
    pivot = (array[0] + array[len(array) - 1]) // 2
    l = []
    m = []
    r = []

    for x in array:
        if x < pivot:
            l.append(x)
    for x in array:
        if x == pivot:
            m.append(x)
    for x in array:
        if x > pivot:
            r.append(x)

    if k <= len(l):
        return kth_order_statistic3(l, k)
    elif k > len(l) + len(m):
        return kth_order_statistic3(r, k - len(l) - len(m))
    else:
        return m[0]

import time
import random
if __name__ == '__main__':

    A = list(range(100000))  # list() so that shuffle() also works on Python 3
    random.shuffle(A)
    k = random.randint(1, len(A)-1)

    start_time = time.time()
    for x in range(1000) :
        kth_order_statistic1(A,k)
    print("--- %s seconds ---" % (time.time() - start_time))

    start_time = time.time()
    for x in range(1000) :
        kth_order_statistic2(A,k)
    print("--- %s seconds ---" % (time.time() - start_time))

    start_time = time.time()
    for x in range(1000) :
        kth_order_statistic3(A,k)
    print("--- %s seconds ---" % (time.time() - start_time))

python:
--- 25.8894710541 seconds ---
--- 24.073086977 seconds ---
--- 32.9823839664 seconds ---

ipython:
--- 25.7450709343 seconds ---
--- 22.7140650749 seconds ---
--- 35.2958850861 seconds ---

The timing may vary according to the random draw, but the differences between the three are pretty much the same.

Answer 4

The two versions differ in algorithmic structure, and the conditional structure is what accounts for the difference: in the single loop, the tests for appending to r and m are skipped whenever an earlier test has already matched. A stricter comparison between a for loop with append and a list comprehension would use the following non-optimal version:

for x in array:
    if x < pivot:
        l.append(x)
for x in array:
    if x == pivot:
        m.append(x)
for x in array:
    if x > pivot:
        r.append(x)
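To isolate the partitioning step from the recursion, the three strategies can be timed on their own. This is a sketch under assumptions: the function names, the array size and the repeat count are arbitrary choices for illustration, and the results will vary with the interpreter version:

```python
import random
import timeit


def partition_comprehensions(array, pivot):
    l = [x for x in array if x < pivot]
    m = [x for x in array if x == pivot]
    r = [x for x in array if x > pivot]
    return l, m, r


def partition_one_loop(array, pivot):
    l, m, r = [], [], []
    for x in array:
        if x < pivot:
            l.append(x)
        elif x > pivot:
            r.append(x)
        else:
            m.append(x)
    return l, m, r


def partition_three_loops(array, pivot):
    l, m, r = [], [], []
    for x in array:
        if x < pivot:
            l.append(x)
    for x in array:
        if x == pivot:
            m.append(x)
    for x in array:
        if x > pivot:
            r.append(x)
    return l, m, r


data = list(range(10000))
random.shuffle(data)
pivot = (data[0] + data[-1]) // 2

if __name__ == "__main__":
    for func in (partition_comprehensions, partition_one_loop, partition_three_loops):
        seconds = timeit.timeit(lambda: func(data, pivot), number=100)
        print("%-26s %.3f s for 100 calls" % (func.__name__, seconds))
```

Comparing the first and third functions isolates the cost of append, while comparing the second and third isolates the cost of the extra passes and comparisons.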

4 answers

solution1
8 2016-09-15 19:32:59

solution2
6 ACCEPTED 2016-09-15 20:17:32

solution3
3 2016-09-15 20:59:53

solution4
2 2016-09-15 19:46:24
