Mutable iterators in Python?

While benchmarking my application, I noticed that accessing array items by index is relatively expensive in Python, making for v in lst: v significantly faster than for i in range(len(lst)): lst[i]:

from array import array

a_ = array('f', range(1000))

def f1():
    a = a_
    acc = 0
    for v in a:
        acc += v

    return acc

def f2():
    a = a_
    acc = 0
    for i in range(len(a)):
        acc += a[i]

    return acc

from dis import dis
from timeit import timeit
for f in f1,f2:
    dis(f)
    print(timeit(f, number=20000))
    print()

Producing:

  9           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)

 10           6 LOAD_CONST               1 (0)
              9 STORE_FAST               1 (acc)

 11          12 SETUP_LOOP              24 (to 39)
             15 LOAD_FAST                0 (a)
             18 GET_ITER
        >>   19 FOR_ITER                16 (to 38)
             22 STORE_FAST               2 (v)

 12          25 LOAD_FAST                1 (acc)
             28 LOAD_FAST                2 (v)
             31 INPLACE_ADD
             32 STORE_FAST               1 (acc)
             35 JUMP_ABSOLUTE           19
        >>   38 POP_BLOCK

 14     >>   39 LOAD_FAST                1 (acc)
             42 RETURN_VALUE
0.6036834940023255

 17           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)

 18           6 LOAD_CONST               1 (0)
              9 STORE_FAST               1 (acc)

 19          12 SETUP_LOOP              40 (to 55)
             15 LOAD_GLOBAL              1 (range)
             18 LOAD_GLOBAL              2 (len)
             21 LOAD_FAST                0 (a)
             24 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             27 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             30 GET_ITER
        >>   31 FOR_ITER                20 (to 54)
             34 STORE_FAST               2 (i)

 20          37 LOAD_FAST                1 (acc)
             40 LOAD_FAST                0 (a)
             43 LOAD_FAST                2 (i)
             46 BINARY_SUBSCR
             47 INPLACE_ADD
             48 STORE_FAST               1 (acc)
             51 JUMP_ABSOLUTE           31
        >>   54 POP_BLOCK

 22     >>   55 LOAD_FAST                1 (acc)
             58 RETURN_VALUE
1.0093544629999087

The core of the loop differs only in the extra LOAD_FAST and BINARY_SUBSCR opcodes needed for indexed access. However, this suffices for the iterator-based solution to take about 40% less time than indexed access (0.60 s versus 1.01 s for 20000 calls).

Unfortunately, in this form, iterators can only be used to read the input array. Is there a way to use a "fast" iterator to change the items of an array in place, or do I have to stick with the "slow" indexed access?

For full loops, you can split the difference with enumerate, using indexed access for setting a value, and the name for reading it:

for i, value in enumerate(mysequence):
    mysequence[i] = do_stuff_with(value)

You can't avoid the indexed reassignment in a general loop structure though; Python has no equivalent to C++ reference semantics where assignment changes the value referred to, rather than rebinding the name.
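
To make the rebinding point concrete, here is a minimal sketch. The function names double_by_rebinding and double_by_index are illustrative and not part of the original answer. It shows that assigning to the loop variable leaves the list untouched, while indexed assignment through enumerate changes it:

```python
def double_by_rebinding(values):
    # Assigning to `v` only rebinds the local name; the list is untouched.
    for v in values:
        v = v * 2
    return values


def double_by_index(values):
    # Indexed assignment stores the new object back into the list.
    for i, v in enumerate(values):
        values[i] = v * 2
    return values


unchanged = double_by_rebinding([1, 2, 3])  # still [1, 2, 3]
changed = double_by_index([1, 2, 3])        # now [2, 4, 6]
```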

That said, if the work is simple enough, a list comprehension avoids the need for an index, by building the new list and wholesale replacing the old one:

mysequence[:] = [do_stuff_with(value) for value in mysequence]

Assigning to the complete slice of mysequence ensures it's modified in place, so other references to it see the change. You can leave off the [:] if you don't want that behavior (you'll rebind to a new list with no other references to it).
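
The difference between slice assignment and plain rebinding can be checked with a second reference to the same list. This is only a sketch: do_stuff_with is a hypothetical stand-in that doubles its argument, and alias is an illustrative name.

```python
def do_stuff_with(value):
    # Hypothetical stand-in for the per-item work.
    return value * 2


mysequence = [1, 2, 3]
alias = mysequence  # second reference to the same list object

# Slice assignment replaces the contents of the existing list object.
mysequence[:] = [do_stuff_with(v) for v in mysequence]
alias_sees_slice_update = alias == [2, 4, 6]

# Plain assignment rebinds the name to a brand-new list instead.
mysequence = [do_stuff_with(v) for v in mysequence]
alias_after_rebind = list(alias)  # alias still holds the earlier contents
```

After the second assignment, alias and mysequence refer to different lists, which is the behavior you get when you leave off the [:].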

Here are some timeit results for the different methods:

     #for v in a       #for i in range(len(a))  #for i,v in enumerate(a)
[[0.47590930000296794,   0.8639191000038409,    0.7616558000008808],
 [0.43640120000054594,   0.832395199999155,     0.7896779000002425],
 [0.44416509999427944,   0.8366088000038872,    0.7590674000020954]]

Note that using numpy arrays is very fast, but only if you build the array inside numpy and only use the native numpy functions:

import numpy as np
def f4():
    N = 1000
    vect = np.arange(float(N))
    return np.sum(vect)

timeit gives:

[0.09995190000336152,
 0.10408379999716999,
 0.09926139999879524]

Attempting to modify a numpy array by explicit indexing seems to give no advantage, however. Also, copying any native Python structure into a numpy array is expensive.

As a complement to @ShadowRunner's answer, I've done some extra benchmarking to compare the different ways of modifying an array in place.

Even for relatively large arrays, it's cheaper to build a new array from a list comprehension and then overwrite the original array than to modify the array's items one by one.

import itertools
from functools import reduce
import operator
from array import array

a_ = array('f', range(100000))
b_ = array('f', range(100000))

def f1():
    a = a_
    b = b_
    for i,v in enumerate(a):
        b[i] = v*2

def f2():
    a = a_
    b = b_
    for i in range(len(a)):
        b[i] = a[i]*2

def f3():
    a = a_
    b = b_
    getter = a.__getitem__
    setter = b.__setitem__
    for i in range(len(a)):
        setter(i, getter(i)*2)

def f4():
    a = a_
    b = b_
    b[:] = array('f', [v*2 for v in a])

from dis import dis
from timeit import timeit

for f in f1,f2,f3,f4:
    dis(f)
    print(timeit(f, number=2000))
    print()

Result:

 10           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)

 11           6 LOAD_GLOBAL              1 (b_)
              9 STORE_FAST               1 (b)

 12          12 SETUP_LOOP              40 (to 55)
             15 LOAD_GLOBAL              2 (enumerate)
             18 LOAD_FAST                0 (a)
             21 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             24 GET_ITER
        >>   25 FOR_ITER                26 (to 54)
             28 UNPACK_SEQUENCE          2
             31 STORE_FAST               2 (i)
             34 STORE_FAST               3 (v)

 13          37 LOAD_FAST                3 (v)
             40 LOAD_CONST               1 (2)
             43 BINARY_MULTIPLY
             44 LOAD_FAST                1 (b)
             47 LOAD_FAST                2 (i)
             50 STORE_SUBSCR
             51 JUMP_ABSOLUTE           25
        >>   54 POP_BLOCK
        >>   55 LOAD_CONST               0 (None)
             58 RETURN_VALUE
24.717656177999743

 16           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)

 17           6 LOAD_GLOBAL              1 (b_)
              9 STORE_FAST               1 (b)

 18          12 SETUP_LOOP              44 (to 59)
             15 LOAD_GLOBAL              2 (range)
             18 LOAD_GLOBAL              3 (len)
             21 LOAD_FAST                0 (a)
             24 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             27 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             30 GET_ITER
        >>   31 FOR_ITER                24 (to 58)
             34 STORE_FAST               2 (i)

 19          37 LOAD_FAST                0 (a)
             40 LOAD_FAST                2 (i)
             43 BINARY_SUBSCR
             44 LOAD_CONST               1 (2)
             47 BINARY_MULTIPLY
             48 LOAD_FAST                1 (b)
             51 LOAD_FAST                2 (i)
             54 STORE_SUBSCR
             55 JUMP_ABSOLUTE           31
        >>   58 POP_BLOCK
        >>   59 LOAD_CONST               0 (None)
             62 RETURN_VALUE
25.86994492100348

 22           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)

 23           6 LOAD_GLOBAL              1 (b_)
              9 STORE_FAST               1 (b)

 24          12 LOAD_FAST                0 (a)
             15 LOAD_ATTR                2 (__getitem__)
             18 STORE_FAST               2 (getter)

 25          21 LOAD_FAST                1 (b)
             24 LOAD_ATTR                3 (__setitem__)
             27 STORE_FAST               3 (setter)

 26          30 SETUP_LOOP              49 (to 82)
             33 LOAD_GLOBAL              4 (range)
             36 LOAD_GLOBAL              5 (len)
             39 LOAD_FAST                0 (a)
             42 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             45 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             48 GET_ITER
        >>   49 FOR_ITER                29 (to 81)
             52 STORE_FAST               4 (i)

 27          55 LOAD_FAST                3 (setter)
             58 LOAD_FAST                4 (i)
             61 LOAD_FAST                2 (getter)
             64 LOAD_FAST                4 (i)
             67 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             70 LOAD_CONST               1 (2)
             73 BINARY_MULTIPLY
             74 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             77 POP_TOP
             78 JUMP_ABSOLUTE           49
        >>   81 POP_BLOCK
        >>   82 LOAD_CONST               0 (None)
             85 RETURN_VALUE
42.435717200998624

 30           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)

 31           6 LOAD_GLOBAL              1 (b_)
              9 STORE_FAST               1 (b)

 32          12 LOAD_GLOBAL              2 (array)
             15 LOAD_CONST               1 ('f')
             18 LOAD_CONST               2 (<code object <listcomp> at 0x7fab3b1e9ae0, file "t.py", line 32>)
             21 LOAD_CONST               3 ('f4.<locals>.<listcomp>')
             24 MAKE_FUNCTION            0
             27 LOAD_FAST                0 (a)
             30 GET_ITER
             31 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             34 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             37 LOAD_FAST                1 (b)
             40 LOAD_CONST               0 (None)
             43 LOAD_CONST               0 (None)
             46 BUILD_SLICE              2
             49 STORE_SUBSCR
             50 LOAD_CONST               0 (None)
             53 RETURN_VALUE
19.328940125000372
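
One detail of the f4 approach is worth spelling out: slice assignment on an array.array only accepts another array of the same typecode, which is why the list comprehension is wrapped in array('f', ...). The sketch below is illustrative rather than a benchmark. The helper name double_in_place and the small array size are assumptions chosen for readability.

```python
from array import array


def double_in_place(src, dst):
    # Build the doubled values in one pass, then overwrite dst's contents.
    # The list is wrapped in an array with dst's typecode because array
    # slice assignment rejects other sequence types.
    dst[:] = array(dst.typecode, [v * 2 for v in src])


a = array('f', range(5))
b = array('f', range(5))
b_alias = b  # second reference, to check the update happens in place
double_in_place(a, b)

try:
    b[:] = [0.0] * 5  # a plain list is not accepted here
    list_accepted = True
except TypeError:
    list_accepted = False
```

Because the assignment targets b[:], b_alias still refers to the same array object and sees the doubled values.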
