While benchmarking my application, I noticed that accessing array items by index is relatively expensive in Python, making for v in lst: v significantly faster than for i in range(len(lst)): lst[i]:
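from array import array

a_ = array('f', range(1000))

# Sum the items by iterating over the array directly.
def f1():
    a = a_
    acc = 0
    for v in a:
        acc += v
    return acc

# Sum the items by indexing into the array.
def f2():
    a = a_
    acc = 0
    for i in range(len(a)):
        acc += a[i]
    return acc

from dis import dis
from timeit import timeit

# Print the bytecode and the time for 20000 calls of each function.
for f in f1,f2:
    dis(f)
    print(timeit(f, number=20000))
    print()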
Producing:
9 0 LOAD_GLOBAL 0 (a_)
3 STORE_FAST 0 (a)
10 6 LOAD_CONST 1 (0)
9 STORE_FAST 1 (acc)
11 12 SETUP_LOOP 24 (to 39)
15 LOAD_FAST 0 (a)
18 GET_ITER
>> 19 FOR_ITER 16 (to 38)
22 STORE_FAST 2 (v)
12 25 LOAD_FAST 1 (acc)
28 LOAD_FAST 2 (v)
31 INPLACE_ADD
32 STORE_FAST 1 (acc)
35 JUMP_ABSOLUTE 19
>> 38 POP_BLOCK
14 >> 39 LOAD_FAST 1 (acc)
42 RETURN_VALUE
0.6036834940023255
17 0 LOAD_GLOBAL 0 (a_)
3 STORE_FAST 0 (a)
18 6 LOAD_CONST 1 (0)
9 STORE_FAST 1 (acc)
19 12 SETUP_LOOP 40 (to 55)
15 LOAD_GLOBAL 1 (range)
18 LOAD_GLOBAL 2 (len)
21 LOAD_FAST 0 (a)
24 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
27 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
30 GET_ITER
>> 31 FOR_ITER 20 (to 54)
34 STORE_FAST 2 (i)
20 37 LOAD_FAST 1 (acc)
40 LOAD_FAST 0 (a)
43 LOAD_FAST 2 (i)
46 BINARY_SUBSCR
47 INPLACE_ADD
48 STORE_FAST 1 (acc)
51 JUMP_ABSOLUTE 31
>> 54 POP_BLOCK
22 >> 55 LOAD_FAST 1 (acc)
58 RETURN_VALUE
1.0093544629999087
The core of the loop differs only in the extra LOAD_FAST and BINARY_SUBSCR opcodes needed for indexed access. However, this suffices for the iterator-based solution to take about 40% less time than indexed access (0.60 s versus 1.01 s for 20000 calls).
Unfortunately, in this form, iterators are only usable for reading the input array. Is there a way to use a "fast" iterator to change the items of an array in place, or do I have to stick to the "slow" indexed access?
For full loops, you can split the difference with enumerate, using indexed access for setting a value, and the name for reading it:
for i, value in enumerate(mysequence):
    mysequence[i] = do_stuff_with(value)
You can't avoid the indexed reassignment in a general loop structure though; Python has no equivalent to C++ reference semantics where assignment changes the value referred to, rather than rebinding the name.
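To make the rebinding point concrete, here is a minimal sketch (the list `values` and the doubling are purely illustrative, not part of the question's code): assigning to the loop name leaves the sequence untouched, while assigning through the index changes it.

```python
values = [1, 2, 3]

# Assigning to the loop name only rebinds `v`; the list keeps its old items.
for v in values:
    v = v * 2
unchanged = list(values)

# Assigning through the index stores the new object in the list itself.
for i, v in enumerate(values):
    values[i] = v * 2

print(unchanged)  # [1, 2, 3]
print(values)     # [2, 4, 6]
```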
That said, if the work is simple enough, a list comprehension avoids the need for an index, by building the new list and wholesale replacing the old one:
mysequence[:] = [do_stuff_with(value) for value in mysequence]
Assigning to the complete slice of mysequence ensures it's modified in place, so other references to it see the change. You can leave off the [:] if you don't want that behavior (you'll rebind to a new list with no other references to it).
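The difference between the two forms can be sketched with a second reference to the same list (the names `data` and `alias` are made up for illustration):

```python
data = [1, 2, 3]
alias = data            # second reference to the same list object

# Slice assignment replaces the contents of the existing list object.
data[:] = [x * 2 for x in data]
print(alias)            # [2, 4, 6]
print(alias is data)    # True

# Plain assignment rebinds `data` to a brand-new list; `alias` is left behind.
data = [x + 1 for x in data]
print(alias)            # [2, 4, 6]
print(data)             # [3, 5, 7]
```

Note that for an array.array the right-hand side of a slice assignment has to be another array with the same type code, so the comprehension result needs wrapping first, e.g. array('f', [...]).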
Here are some timeit results for the different methods:
#for v in a #for i in range(len(a)) #for i,v in enumerate(a)
[[0.47590930000296794, 0.8639191000038409, 0.7616558000008808],
[0.43640120000054594, 0.832395199999155, 0.7896779000002425],
[0.44416509999427944, 0.8366088000038872, 0.7590674000020954]]
Note that using numpy arrays is very fast, but only if you build the array inside numpy and only use the native numpy functions:
import numpy as np
def f4():
    N = 1000
    vect = np.arange(float(N))
    return np.sum(vect)
timeit gives:
[0.09995190000336152,
 0.10408379999716999,
 0.09926139999879524]
Attempting to modify a numpy array by explicit indexing seems to give no advantage, however. Also, copying any native Python structure into a numpy array is expensive.
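As a sketch of what staying inside native numpy functions could look like for the in-place case: np.frombuffer can wrap an existing array.array without copying it, and the update then runs as one vectorized operation. This assumes the data lives in an array('f') whose type code is passed as the numpy dtype; the names `a_` and `view` are illustrative, and this variant is not among the timings above.

```python
from array import array
import numpy as np

a_ = array('f', range(1000))

# Wrap the array's memory in a numpy view; no data is copied.
view = np.frombuffer(a_, dtype=a_.typecode)

# In-place vectorized update: the writes go straight into a_'s buffer.
view *= 2

print(a_[0], a_[1], a_[999])  # 0.0 2.0 1998.0
```

One trade-off to keep in mind: as long as the numpy view is alive, the underlying array.array cannot be resized (for example with append).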
As a complement to @ShadowRunner's answer, I've done some extra benchmarking to compare the different solutions to modify an array in place.
Even for relatively large arrays, it's cheaper to build a new array from a list comprehension and then overwrite the original array than to modify the individual items of the array one by one.
import itertools
from functools import reduce
import operator
from array import array

a_ = array('f', range(100000))
b_ = array('f', range(100000))

def f1():  # enumerate: read by value, write by index
    a = a_
    b = b_
    for i,v in enumerate(a):
        b[i] = v*2

def f2():  # read and write by index
    a = a_
    b = b_
    for i in range(len(a)):
        b[i] = a[i]*2

def f3():  # index access through pre-bound __getitem__/__setitem__
    a = a_
    b = b_
    getter = a.__getitem__
    setter = b.__setitem__
    for i in range(len(a)):
        setter(i, getter(i)*2)

def f4():  # build a new array from a list comprehension, then slice-assign
    a = a_
    b = b_
    b[:] = array('f', [v*2 for v in a])

from dis import dis
from timeit import timeit

# Print the bytecode and the time for 2000 calls of each function.
for f in f1,f2,f3,f4:
    dis(f)
    print(timeit(f, number=2000))
    print()
Result:
10 0 LOAD_GLOBAL 0 (a_)
3 STORE_FAST 0 (a)
11 6 LOAD_GLOBAL 1 (b_)
9 STORE_FAST 1 (b)
12 12 SETUP_LOOP 40 (to 55)
15 LOAD_GLOBAL 2 (enumerate)
18 LOAD_FAST 0 (a)
21 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
24 GET_ITER
>> 25 FOR_ITER 26 (to 54)
28 UNPACK_SEQUENCE 2
31 STORE_FAST 2 (i)
34 STORE_FAST 3 (v)
13 37 LOAD_FAST 3 (v)
40 LOAD_CONST 1 (2)
43 BINARY_MULTIPLY
44 LOAD_FAST 1 (b)
47 LOAD_FAST 2 (i)
50 STORE_SUBSCR
51 JUMP_ABSOLUTE 25
>> 54 POP_BLOCK
>> 55 LOAD_CONST 0 (None)
58 RETURN_VALUE
24.717656177999743
16 0 LOAD_GLOBAL 0 (a_)
3 STORE_FAST 0 (a)
17 6 LOAD_GLOBAL 1 (b_)
9 STORE_FAST 1 (b)
18 12 SETUP_LOOP 44 (to 59)
15 LOAD_GLOBAL 2 (range)
18 LOAD_GLOBAL 3 (len)
21 LOAD_FAST 0 (a)
24 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
27 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
30 GET_ITER
>> 31 FOR_ITER 24 (to 58)
34 STORE_FAST 2 (i)
19 37 LOAD_FAST 0 (a)
40 LOAD_FAST 2 (i)
43 BINARY_SUBSCR
44 LOAD_CONST 1 (2)
47 BINARY_MULTIPLY
48 LOAD_FAST 1 (b)
51 LOAD_FAST 2 (i)
54 STORE_SUBSCR
55 JUMP_ABSOLUTE 31
>> 58 POP_BLOCK
>> 59 LOAD_CONST 0 (None)
62 RETURN_VALUE
25.86994492100348
22 0 LOAD_GLOBAL 0 (a_)
3 STORE_FAST 0 (a)
23 6 LOAD_GLOBAL 1 (b_)
9 STORE_FAST 1 (b)
24 12 LOAD_FAST 0 (a)
15 LOAD_ATTR 2 (__getitem__)
18 STORE_FAST 2 (getter)
25 21 LOAD_FAST 1 (b)
24 LOAD_ATTR 3 (__setitem__)
27 STORE_FAST 3 (setter)
26 30 SETUP_LOOP 49 (to 82)
33 LOAD_GLOBAL 4 (range)
36 LOAD_GLOBAL 5 (len)
39 LOAD_FAST 0 (a)
42 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
45 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
48 GET_ITER
>> 49 FOR_ITER 29 (to 81)
52 STORE_FAST 4 (i)
27 55 LOAD_FAST 3 (setter)
58 LOAD_FAST 4 (i)
61 LOAD_FAST 2 (getter)
64 LOAD_FAST 4 (i)
67 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
70 LOAD_CONST 1 (2)
73 BINARY_MULTIPLY
74 CALL_FUNCTION 2 (2 positional, 0 keyword pair)
77 POP_TOP
78 JUMP_ABSOLUTE 49
>> 81 POP_BLOCK
>> 82 LOAD_CONST 0 (None)
85 RETURN_VALUE
42.435717200998624
30 0 LOAD_GLOBAL 0 (a_)
3 STORE_FAST 0 (a)
31 6 LOAD_GLOBAL 1 (b_)
9 STORE_FAST 1 (b)
32 12 LOAD_GLOBAL 2 (array)
15 LOAD_CONST 1 ('f')
18 LOAD_CONST 2 (<code object <listcomp> at 0x7fab3b1e9ae0, file "t.py", line 32>)
21 LOAD_CONST 3 ('f4.<locals>.<listcomp>')
24 MAKE_FUNCTION 0
27 LOAD_FAST 0 (a)
30 GET_ITER
31 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
34 CALL_FUNCTION 2 (2 positional, 0 keyword pair)
37 LOAD_FAST 1 (b)
40 LOAD_CONST 0 (None)
43 LOAD_CONST 0 (None)
46 BUILD_SLICE 2
49 STORE_SUBSCR
50 LOAD_CONST 0 (None)
53 RETURN_VALUE
19.328940125000372