Mutable iterators in Python?
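While benchmarking my application, I noticed that accessing array items by index is relatively expensive in Python, which makes for v in lst: v noticeably faster than for i in range(len(lst)): lst[i]: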
from array import array

a_ = array('f', range(1000))

def f1():
    a = a_
    acc = 0
    for v in a:              # iterate over the items directly
        acc += v

    return acc

def f2():
    a = a_
    acc = 0
    for i in range(len(a)):  # iterate over indices, then subscript
        acc += a[i]

    return acc

from dis import dis
from timeit import timeit

for f in f1, f2:
    dis(f)                          # bytecode listing
    print(timeit(f, number=20000))  # total seconds for 20000 calls
    print()
Output:
  9           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)

 10           6 LOAD_CONST               1 (0)
              9 STORE_FAST               1 (acc)

 11          12 SETUP_LOOP              24 (to 39)
             15 LOAD_FAST                0 (a)
             18 GET_ITER
        >>   19 FOR_ITER                16 (to 38)
             22 STORE_FAST               2 (v)

 12          25 LOAD_FAST                1 (acc)
             28 LOAD_FAST                2 (v)
             31 INPLACE_ADD
             32 STORE_FAST               1 (acc)
             35 JUMP_ABSOLUTE           19
        >>   38 POP_BLOCK

 14     >>   39 LOAD_FAST                1 (acc)
             42 RETURN_VALUE

0.6036834940023255

 17           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)

 18           6 LOAD_CONST               1 (0)
              9 STORE_FAST               1 (acc)

 19          12 SETUP_LOOP              40 (to 55)
             15 LOAD_GLOBAL              1 (range)
             18 LOAD_GLOBAL              2 (len)
             21 LOAD_FAST                0 (a)
             24 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             27 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             30 GET_ITER
        >>   31 FOR_ITER                20 (to 54)
             34 STORE_FAST               2 (i)

 20          37 LOAD_FAST                1 (acc)
             40 LOAD_FAST                0 (a)
             43 LOAD_FAST                2 (i)
             46 BINARY_SUBSCR
             47 INPLACE_ADD
             48 STORE_FAST               1 (acc)
             51 JUMP_ABSOLUTE           31
        >>   54 POP_BLOCK

 22     >>   55 LOAD_FAST                1 (acc)
             58 RETURN_VALUE

1.0093544629999087
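The core of the two loops differs only in the extra LOAD_FAST and BINARY_SUBSCR opcodes that index access requires. Yet that is enough for the iterator-based version to take about 40% less time than the index-based one (0.60 s versus 1.01 s for 20000 calls).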
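To repeat this comparison on another interpreter, a minimal sketch along the following lines may help. The names by_value, by_index and opcode_counts are illustrative stand-ins (not from the original post), and opcode names change between CPython versions, so the printed difference will not necessarily match the listing above.

```python
import dis
from array import array
from collections import Counter

data = array('f', range(1000))

def by_value():
    acc = 0
    for v in data:
        acc += v
    return acc

def by_index():
    acc = 0
    for i in range(len(data)):
        acc += data[i]
    return acc

def opcode_counts(func):
    """Count how often each opcode name appears in func's bytecode."""
    return Counter(ins.opname for ins in dis.get_instructions(func))

# Opcodes that by_index contains beyond those found in by_value.
# Counter subtraction keeps only the positive differences.
extra = opcode_counts(by_index) - opcode_counts(by_value)
print(sorted(extra.items()))
```

This only counts instructions statically; it says nothing about how expensive each one is, so it complements timeit rather than replacing it.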
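Unfortunately, in this form the iterator can only be used to read the input array. Is there a way to use a "fast" iterator to change the items of an array in place, or do I have to stick with "slow" indexed access?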
For a full loop, you can split the difference with enumerate, using indexed access to set each value while reading it through its name:
for i, value in enumerate(mysequence):
    mysequence[i] = do_stuff_with(value)
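However, you can't avoid assigning through the index in a general loop construct; Python has no equivalent of C++ reference semantics, where assignment changes the referred-to value rather than rebinding the name.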
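The rebinding behaviour can be seen in a small sketch; the list values and the helper name after_rebinding are made up for illustration:

```python
values = [1, 2, 3]

# Assigning to the loop variable only rebinds the local name `v`;
# the list itself is left untouched.
for v in values:
    v = v * 2
after_rebinding = list(values)
print(after_rebinding)  # [1, 2, 3]

# Writing through the index is what actually mutates the list.
for i, v in enumerate(values):
    values[i] = v * 2
print(values)  # [2, 4, 6]
```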
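That said, if the work is simple enough, a list comprehension avoids the need for indexing by building a new list and replacing the old list's contents wholesale: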
mysequence[:] = [do_stuff_with(value) for value in mysequence]
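Assigning to the full slice of mysequence ensures it is modified in place, so other references to it see the change. If you don't want that behavior, you can leave off the [:] (you will then rebind the name to a new list that has no other references).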
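A short sketch of the difference between the two forms, using a second name (alias, chosen here only for illustration) that refers to the same list:

```python
original = [1, 2, 3]
alias = original            # second reference to the same list object

# Slice assignment replaces the contents of the existing list in place,
# so the alias observes the change.
original[:] = [x * 10 for x in original]
print(alias)                # [10, 20, 30]
print(alias is original)    # True

# Plain assignment rebinds the name to a brand-new list;
# the alias still points at the old object.
original = [x + 1 for x in original]
print(original)             # [11, 21, 31]
print(alias)                # [10, 20, 30]
print(alias is original)    # False
```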
Here are some timing results for the different approaches:
# for v in a          # for i in range(len(a))   # for i, v in enumerate(a)
[[0.47590930000296794, 0.8639191000038409, 0.7616558000008808],
[0.43640120000054594, 0.832395199999155, 0.7896779000002425],
[0.44416509999427944, 0.8366088000038872, 0.7590674000020954]]
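The code behind these timings is not shown. As a rough sketch, a table of this shape could be produced as follows; the function names, the array size, and the number and rounds values are all assumptions, and the absolute figures will differ from machine to machine.

```python
from array import array
from timeit import repeat

data = array('f', range(1000))

def loop_value():
    acc = 0
    for v in data:
        acc += v
    return acc

def loop_index():
    acc = 0
    for i in range(len(data)):
        acc += data[i]
    return acc

def loop_enumerate():
    acc = 0
    for i, v in enumerate(data):
        acc += v
    return acc

def run_benchmark(number=200, rounds=3):
    """Return one row of timings (seconds) per round, one column per loop style."""
    funcs = (loop_value, loop_index, loop_enumerate)
    columns = [repeat(f, number=number, repeat=rounds) for f in funcs]
    # Transpose so that each row holds the three loop styles for one round.
    return [list(row) for row in zip(*columns)]

if __name__ == "__main__":
    for row in run_benchmark():
        print(row)
```

Each row corresponds to one round and each column to one loop style, in the same order as the header above.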
Note that numpy arrays are very fast, but only if you build the array inside numpy and use only native numpy functions:
import numpy as np

def f4():
    N = 1000
    vect = np.arange(float(N))
    return np.sum(vect)
The timings are:
[0.09995190000336152,
 0.10408379999716999,
 0.09926139999879524]
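However, modifying a numpy array through explicit indexing does not seem to bring any benefit. Copying any native Python structure into a numpy array is also expensive.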
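As a complement to @ShadowRunner's answer, I ran some additional benchmarks to compare different ways of modifying an array.
Even for relatively large arrays, it is cheaper to build a new array from a list comprehension and then overwrite the original array than to modify the individual items of the array one by one.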
import itertools
from functools import reduce
import operator
from array import array

a_ = array('f', range(100000))
b_ = array('f', range(100000))

def f1():
    a = a_
    b = b_
    for i,v in enumerate(a):      # read by name, write by index
        b[i] = v*2

def f2():
    a = a_
    b = b_
    for i in range(len(a)):       # read and write by index
        b[i] = a[i]*2

def f3():
    a = a_
    b = b_
    getter = a.__getitem__
    setter = b.__setitem__
    for i in range(len(a)):       # bound-method calls instead of subscript syntax
        setter(i, getter(i)*2)

def f4():
    a = a_
    b = b_
    b[:] = array('f', [v*2 for v in a])  # build a new array, then overwrite b in place

from dis import dis
from timeit import timeit

for f in f1,f2,f3,f4:
    dis(f)
    print(timeit(f, number=2000))
    print()
Results:
 10           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)

 11           6 LOAD_GLOBAL              1 (b_)
              9 STORE_FAST               1 (b)

 12          12 SETUP_LOOP              40 (to 55)
             15 LOAD_GLOBAL              2 (enumerate)
             18 LOAD_FAST                0 (a)
             21 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             24 GET_ITER
        >>   25 FOR_ITER                26 (to 54)
             28 UNPACK_SEQUENCE          2
             31 STORE_FAST               2 (i)
             34 STORE_FAST               3 (v)

 13          37 LOAD_FAST                3 (v)
             40 LOAD_CONST               1 (2)
             43 BINARY_MULTIPLY
             44 LOAD_FAST                1 (b)
             47 LOAD_FAST                2 (i)
             50 STORE_SUBSCR
             51 JUMP_ABSOLUTE           25
        >>   54 POP_BLOCK
        >>   55 LOAD_CONST               0 (None)
             58 RETURN_VALUE

24.717656177999743

 16           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)

 17           6 LOAD_GLOBAL              1 (b_)
              9 STORE_FAST               1 (b)

 18          12 SETUP_LOOP              44 (to 59)
             15 LOAD_GLOBAL              2 (range)
             18 LOAD_GLOBAL              3 (len)
             21 LOAD_FAST                0 (a)
             24 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             27 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             30 GET_ITER
        >>   31 FOR_ITER                24 (to 58)
             34 STORE_FAST               2 (i)

 19          37 LOAD_FAST                0 (a)
             40 LOAD_FAST                2 (i)
             43 BINARY_SUBSCR
             44 LOAD_CONST               1 (2)
             47 BINARY_MULTIPLY
             48 LOAD_FAST                1 (b)
             51 LOAD_FAST                2 (i)
             54 STORE_SUBSCR
             55 JUMP_ABSOLUTE           31
        >>   58 POP_BLOCK
        >>   59 LOAD_CONST               0 (None)
             62 RETURN_VALUE

25.86994492100348

 22           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)

 23           6 LOAD_GLOBAL              1 (b_)
              9 STORE_FAST               1 (b)

 24          12 LOAD_FAST                0 (a)
             15 LOAD_ATTR                2 (__getitem__)
             18 STORE_FAST               2 (getter)

 25          21 LOAD_FAST                1 (b)
             24 LOAD_ATTR                3 (__setitem__)
             27 STORE_FAST               3 (setter)

 26          30 SETUP_LOOP              49 (to 82)
             33 LOAD_GLOBAL              4 (range)
             36 LOAD_GLOBAL              5 (len)
             39 LOAD_FAST                0 (a)
             42 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             45 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             48 GET_ITER
        >>   49 FOR_ITER                29 (to 81)
             52 STORE_FAST               4 (i)

 27          55 LOAD_FAST                3 (setter)
             58 LOAD_FAST                4 (i)
             61 LOAD_FAST                2 (getter)
             64 LOAD_FAST                4 (i)
             67 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             70 LOAD_CONST               1 (2)
             73 BINARY_MULTIPLY
             74 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             77 POP_TOP
             78 JUMP_ABSOLUTE           49
        >>   81 POP_BLOCK
        >>   82 LOAD_CONST               0 (None)
             85 RETURN_VALUE

42.435717200998624

 30           0 LOAD_GLOBAL              0 (a_)
              3 STORE_FAST               0 (a)

 31           6 LOAD_GLOBAL              1 (b_)
              9 STORE_FAST               1 (b)

 32          12 LOAD_GLOBAL              2 (array)
             15 LOAD_CONST               1 ('f')
             18 LOAD_CONST               2 (<code object <listcomp> at 0x7fab3b1e9ae0, file "t.py", line 32>)
             21 LOAD_CONST               3 ('f4.<locals>.<listcomp>')
             24 MAKE_FUNCTION            0
             27 LOAD_FAST                0 (a)
             30 GET_ITER
             31 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             34 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             37 LOAD_FAST                1 (b)
             40 LOAD_CONST               0 (None)
             43 LOAD_CONST               0 (None)
             46 BUILD_SLICE              2
             49 STORE_SUBSCR
             50 LOAD_CONST               0 (None)
             53 RETURN_VALUE

19.328940125000372
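The fastest variant above (build a new array from a list comprehension, then overwrite through a full slice) could be wrapped in a small helper. This is only a sketch: transform_in_place is a hypothetical name, and unlike f4 it reads from and writes to the same array. Note that slice assignment on an array only accepts another array of the same typecode, which is why the list is converted first.

```python
from array import array

def transform_in_place(arr, func):
    """Overwrite every item of arr with func(item), keeping the same array object."""
    # Build the new contents first, then replace the old contents in one
    # slice assignment; the replacement must be an array of the same typecode.
    arr[:] = array(arr.typecode, [func(v) for v in arr])

b = array('f', range(5))
alias = b
transform_in_place(b, lambda v: v * 2)
print(list(b))       # [0.0, 2.0, 4.0, 6.0, 8.0]
print(alias is b)    # True
```

Because the array object itself is preserved, any other reference to it sees the new values.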