Python: Efficient way to call inbuilt function multiple times?
I have some code that looks something like this:
def somefunction(somelist):
    for item in somelist:
        if len(item) > 10:
            pass  # do something
        elif len(item) > 6:
            pass  # do something
        elif len(item) > 3:
            pass  # do something
        else:
            pass  # do something
Since I am calling len(item) multiple times, is it inefficient to do it this way? Would it be preferable to write the code as follows, or are the two versions EXACTLY the same in performance?
def somefunction(somelist):
    for item in somelist:
        x = len(item)
        if x > 10:
            pass  # do something
        elif x > 6:
            pass  # do something
        elif x > 3:
            pass  # do something
        else:
            pass  # do something
len() is an O(1) operation for the built-in containers, which means calling it is very cheap. So stop worrying about it and focus on improving other parts of your code.
However, personally, I think the second way is better, because renaming the variable from x to length makes the code more readable.
def somefunction(somelist):
    for item in somelist:
        length = len(item)
        if length > 10:
            pass  # do something
        elif length > 6:
            pass  # do something
        elif length > 3:
            pass  # do something
        else:
            pass  # do something
NOTE: len() is O(1) for strings, sets, and dictionaries (and likewise for lists and tuples).
The second approach is surely better, as the number of calls to len() is reduced:
In [16]: import dis

In [18]: lis=["a"*10000,"b"*10000,"c"*10000]*1000

In [19]: def first():
   ....:     for item in lis:
   ....:         if len(item)<100:
   ....:             pass
   ....:         elif 100<len(item)<200:
   ....:             pass
   ....:         elif 300<len(item)<400:
   ....:             pass
   ....:

In [20]: def second():
   ....:     for item in lis:
   ....:         x=len(item)
   ....:         if x<100:
   ....:             pass
   ....:         elif 100<x<200:
   ....:             pass
   ....:         elif 300<x<400:
   ....:             pass
   ....:
You can always time your code using the timeit module:
In [21]: %timeit first()
100 loops, best of 3: 2.03 ms per loop
In [22]: %timeit second()
1000 loops, best of 3: 1.66 ms per loop
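The %timeit magic above only works inside IPython. A rough equivalent as a plain script, assuming Python 3 and the standard timeit module, might look like the sketch below. The names first, second, and lis mirror the session above; first_time and second_time are made up for this example, and the absolute numbers will vary with the machine and interpreter version.

```python
import timeit

# Same test data as in the IPython session above.
lis = ["a" * 10000, "b" * 10000, "c" * 10000] * 1000

def first():
    # Calls len(item) once per comparison.
    for item in lis:
        if len(item) < 100:
            pass
        elif 100 < len(item) < 200:
            pass
        elif 300 < len(item) < 400:
            pass

def second():
    # Calls len(item) once per item and reuses the result.
    for item in lis:
        x = len(item)
        if x < 100:
            pass
        elif 100 < x < 200:
            pass
        elif 300 < x < 400:
            pass

RUNS = 100  # calls per measurement (an arbitrary choice for this sketch)

# repeat() returns one total time per measurement; the minimum is the least noisy.
first_time = min(timeit.repeat(first, number=RUNS, repeat=3))
second_time = min(timeit.repeat(second, number=RUNS, repeat=3))

print(f"first():  {first_time / RUNS * 1e3:.3f} ms per call")
print(f"second(): {second_time / RUNS * 1e3:.3f} ms per call")
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes; it does not make the comparison exact.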
Use dis.dis() to see the Python bytecode disassembled into mnemonics:
In [24]: dis.dis(first)
  2       0 SETUP_LOOP           109 (to 112)
          3 LOAD_GLOBAL            0 (lis)
          6 GET_ITER
     >>   7 FOR_ITER             101 (to 111)
         10 STORE_FAST             0 (item)

  3      13 LOAD_GLOBAL            1 (len)
         16 LOAD_FAST              0 (item)
         19 CALL_FUNCTION          1
         22 LOAD_CONST             1 (100)
         25 COMPARE_OP             0 (<)
         28 POP_JUMP_IF_FALSE     34

  4      31 JUMP_ABSOLUTE          7

  5  >>  34 LOAD_CONST             1 (100)
         37 LOAD_GLOBAL            1 (len)
         40 LOAD_FAST              0 (item)
         43 CALL_FUNCTION          1
         46 DUP_TOP
         47 ROT_THREE
         48 COMPARE_OP             0 (<)
         51 JUMP_IF_FALSE_OR_POP  63
         54 LOAD_CONST             2 (200)
         57 COMPARE_OP             0 (<)
         60 JUMP_FORWARD           2 (to 65)
     >>  63 ROT_TWO
         64 POP_TOP
     >>  65 POP_JUMP_IF_FALSE     71

  6      68 JUMP_ABSOLUTE          7

  7  >>  71 LOAD_CONST             3 (300)
         74 LOAD_GLOBAL            1 (len)
         77 LOAD_FAST              0 (item)
         80 CALL_FUNCTION          1
         83 DUP_TOP
         84 ROT_THREE
         85 COMPARE_OP             0 (<)
         88 JUMP_IF_FALSE_OR_POP 100
         91 LOAD_CONST             4 (400)
         94 COMPARE_OP             0 (<)
         97 JUMP_FORWARD           2 (to 102)
     >> 100 ROT_TWO
        101 POP_TOP
     >> 102 POP_JUMP_IF_FALSE      7

  8     105 JUMP_ABSOLUTE          7
        108 JUMP_ABSOLUTE          7
     >> 111 POP_BLOCK
     >> 112 LOAD_CONST             0 (None)
        115 RETURN_VALUE
In [25]: dis.dis(second)
  2       0 SETUP_LOOP           103 (to 106)
          3 LOAD_GLOBAL            0 (lis)
          6 GET_ITER
     >>   7 FOR_ITER              95 (to 105)
         10 STORE_FAST             0 (item)

  3      13 LOAD_GLOBAL            1 (len)
         16 LOAD_FAST              0 (item)
         19 CALL_FUNCTION          1
         22 STORE_FAST             1 (x)

  4      25 LOAD_FAST              1 (x)
         28 LOAD_CONST             1 (100)
         31 COMPARE_OP             0 (<)
         34 POP_JUMP_IF_FALSE     40

  5      37 JUMP_ABSOLUTE          7

  6  >>  40 LOAD_CONST             1 (100)
         43 LOAD_FAST              1 (x)
         46 DUP_TOP
         47 ROT_THREE
         48 COMPARE_OP             0 (<)
         51 JUMP_IF_FALSE_OR_POP  63
         54 LOAD_CONST             2 (200)
         57 COMPARE_OP             0 (<)
         60 JUMP_FORWARD           2 (to 65)
     >>  63 ROT_TWO
         64 POP_TOP
     >>  65 POP_JUMP_IF_FALSE     71

  7      68 JUMP_ABSOLUTE          7

  8  >>  71 LOAD_CONST             3 (300)
         74 LOAD_FAST              1 (x)
         77 DUP_TOP
         78 ROT_THREE
         79 COMPARE_OP             0 (<)
         82 JUMP_IF_FALSE_OR_POP  94
         85 LOAD_CONST             4 (400)
         88 COMPARE_OP             0 (<)
         91 JUMP_FORWARD           2 (to 96)
     >>  94 ROT_TWO
         95 POP_TOP
     >>  96 POP_JUMP_IF_FALSE      7

  9      99 JUMP_ABSOLUTE          7
        102 JUMP_ABSOLUTE          7
     >> 105 POP_BLOCK
     >> 106 LOAD_CONST             0 (None)
        109 RETURN_VALUE
Unlike compilers for many other languages, Python does not optimize repeated calls away automatically (unless you're using PyPy), so the second version is probably faster. But unless item has a custom __len__ implementation that takes a while, it probably won't speed things up that much either. This is the sort of micro-optimization that should be reserved for tight inner loops after profiling has indicated a problem.
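To check whether a loop like this matters at all, it can be profiled first. The sketch below assumes Python 3 and uses the standard cProfile and pstats modules; classify and the sample data are hypothetical stand-ins for the asker's somefunction and its input.

```python
import cProfile
import io
import pstats

def classify(somelist):
    """Hypothetical stand-in for somefunction: buckets items by length."""
    counts = {"long": 0, "medium": 0, "short": 0, "tiny": 0}
    for item in somelist:
        if len(item) > 10:
            counts["long"] += 1
        elif len(item) > 6:
            counts["medium"] += 1
        elif len(item) > 3:
            counts["short"] += 1
        else:
            counts["tiny"] += 1
    return counts

# Made-up sample data with a mix of lengths (0 to 19 characters).
data = ["a" * n for n in range(20)] * 5000

profiler = cProfile.Profile()
profiler.enable()
result = classify(data)
profiler.disable()

# Collect the report in a string so it can be printed or inspected.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

If the time attributed to len is a small share of the total, caching its result is unlikely to be worth doing for performance reasons alone.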
You can check such things with dis.dis:
import dis

def somefunction1(item):
    if len(item) > 10:
        print(1)
    elif len(item) > 10:
        print(2)

def somefunction2(item):
    x = len(item)
    if x > 10:
        print(1)
    elif x > 10:
        print(2)

print("#1")
dis.dis(somefunction1)
print("#2")
dis.dis(somefunction2)
Interpreting the output:
#1
  4       0 LOAD_GLOBAL            0 (len)
          3 LOAD_FAST              0 (item)
          6 CALL_FUNCTION          1
          9 LOAD_CONST             1 (10)
         12 COMPARE_OP             4 (>)
         15 POP_JUMP_IF_FALSE     26
[...]
  6  >>  26 LOAD_GLOBAL            0 (len)
         29 LOAD_FAST              0 (item)
         32 CALL_FUNCTION          1
         35 LOAD_CONST             1 (10)
         38 COMPARE_OP             4 (>)
         41 POP_JUMP_IF_FALSE     52
[...]
#2
 10       0 LOAD_GLOBAL            0 (len)
          3 LOAD_FAST              0 (item)
          6 CALL_FUNCTION          1
          9 STORE_FAST             1 (x)

 11      12 LOAD_FAST              1 (x)
         15 LOAD_CONST             1 (10)
         18 COMPARE_OP             4 (>)
         21 POP_JUMP_IF_FALSE     32
[...]
 13  >>  32 LOAD_FAST              1 (x)
         35 LOAD_CONST             1 (10)
         38 COMPARE_OP             4 (>)
         41 POP_JUMP_IF_FALSE     52
You can see that in the first example, len(item) is called twice (see the two CALL_FUNCTION instructions?), whereas it is only called once in the second implementation.
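The same comparison can be made programmatically rather than by reading the listing. The sketch below assumes Python 3.4 or later, where dis.get_instructions is available, and counts how often each function loads the global name len (each such load is followed by a call in these functions). The helper count_len_lookups is a made-up name for this example, and opcode names for the call itself differ between interpreter versions, which is why the sketch counts the name lookup instead.

```python
import dis

def somefunction1(item):
    if len(item) > 10:
        print(1)
    elif len(item) > 10:
        print(2)

def somefunction2(item):
    x = len(item)
    if x > 10:
        print(1)
    elif x > 10:
        print(2)

def count_len_lookups(func):
    """Count bytecode instructions that load the global name `len`."""
    return sum(
        1
        for instr in dis.get_instructions(func)
        if instr.opname == "LOAD_GLOBAL" and instr.argval == "len"
    )

lookups_1 = count_len_lookups(somefunction1)
lookups_2 = count_len_lookups(somefunction2)
print("somefunction1:", lookups_1)
print("somefunction2:", lookups_2)
```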
This means that the rest of your question boils down to how len() is implemented: it is O(1) (i.e. cheap) for built-in types such as lists, but for other types, especially ones you might have built yourself, it need not be.
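As an illustration of a len() that is not cheap, here is a minimal sketch of a user-defined container whose __len__ walks every element and records how often it is called. CountingChain, classify_repeated, and classify_cached are hypothetical names invented for this example.

```python
class CountingChain:
    """Hypothetical container with a deliberately O(n) __len__ that counts its calls."""

    def __init__(self, values):
        self._values = list(values)
        self.len_calls = 0

    def __len__(self):
        self.len_calls += 1
        count = 0
        for _ in self._values:  # walk every element instead of storing the size
            count += 1
        return count

def classify_repeated(item):
    # Calls len(item) again in every comparison that is reached.
    if len(item) > 10:
        return "long"
    elif len(item) > 6:
        return "medium"
    elif len(item) > 3:
        return "short"
    else:
        return "tiny"

def classify_cached(item):
    # Calls len(item) once and reuses the result.
    length = len(item)
    if length > 10:
        return "long"
    elif length > 6:
        return "medium"
    elif length > 3:
        return "short"
    else:
        return "tiny"

a = CountingChain(range(2))
b = CountingChain(range(2))
label_a = classify_repeated(a)
label_b = classify_cached(b)
print(label_a, a.len_calls)  # tiny 3
print(label_b, b.len_calls)  # tiny 1
```

With a two-element container every comparison fails, so the repeated version walks the data three times while the cached version walks it once.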
Python does not make the two equivalent, because they are not equivalent for an arbitrary function. Let's consider this function, x():
y = 1
def x():
    return 1
And these two tests:
>>> print(x() + y)
2
>>> print(x() + y)
2
And:
>>> hw = x()
>>> print(hw + y)
2
>>> print(hw + y)
2
These are exactly the same. But what if our function has side effects?
y = 1
def x():
    global y
    y += 1
    return 1
The first case:
>>> print(x() + y)
3
>>> print(x() + y)
4
The second case:
>>> hw = x()
>>> print(hw + y)
3
>>> print(hw + y)
3
You can see that this optimization only works if the function has no side effects; otherwise it can alter the program's behavior. As Python can't tell whether a function has side effects, it can't do this optimization.
As such, it makes sense to store the value locally and use it repeatedly, rather than calling the function again and again, although in reality it is highly unlikely to matter, as the difference will be tiny. That said, it's also much more readable and means you don't have to repeat yourself, so it's generally a good idea to code that way.