
Python: Efficient way to call inbuilt function multiple times?

I have code that looks something like this:

def somefunction(somelist):
    for item in somelist:
        if len(item) > 10:
            pass  # do something
        elif len(item) > 6:
            pass  # do something
        elif len(item) > 3:
            pass  # do something
        else:
            pass  # do something

Since I am calling len(item) multiple times, is it inefficient to do it this way? Would it be preferable to write the code as follows, or are the two exactly the same in performance?

def somefunction(somelist):
    for item in somelist:
        x = len(item)
        if x > 10:
            pass  # do something
        elif x > 6:
            pass  # do something
        elif x > 3:
            pass  # do something
        else:
            pass  # do something

len() is an O(1) operation. This means the cost of calling len() is very low. So stop worrying about it and spend the effort on improving other parts of your code.

However, personally, I think the second way is better, because renaming the variable from x to length makes the code more readable.

def somefunction(somelist):
    for item in somelist:
        length = len(item)
        if length > 10:
            pass  # do something
        elif length > 6:
            pass  # do something
        elif length > 3:
            pass  # do something
        else:
            pass  # do something

NOTE: len() is O(1) for strings, sets, and dictionaries (and for lists and tuples as well).
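
As a concrete, runnable version of the pattern above, here is a minimal sketch. The function name classify_lengths and the category labels are hypothetical and chosen only for illustration; the thresholds are the ones from the question.

```python
def classify_lengths(items):
    """Label each item by its length, calling len() once per item."""
    labels = []
    for item in items:
        length = len(item)  # computed once, reused in every branch
        if length > 10:
            labels.append("long")
        elif length > 6:
            labels.append("medium")
        elif length > 3:
            labels.append("short")
        else:
            labels.append("tiny")
    return labels


print(classify_lengths(["abc", "abcdef", "abcdefgh", "abcdefghijkl"]))
# prints ['tiny', 'short', 'medium', 'long']
```

Naming the local length rather than x also documents what each comparison is about.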

The second approach is surely better, as the number of calls to len() is reduced:

In [16]: import dis

In [18]: lis=["a"*10000,"b"*10000,"c"*10000]*1000

In [19]: def first():
    for item in lis:
        if len(item)<100:
            pass
        elif 100<len(item)<200:
            pass
        elif 300<len(item)<400:
            pass
   ....:         

In [20]: def second():
    for item in lis:
        x=len(item)
        if x<100:
            pass
        elif 100<x<200:
            pass
        elif 300<x<400:
            pass
   ....:         

You can always time your code using the timeit module:

In [21]: %timeit first()
100 loops, best of 3: 2.03 ms per loop

In [22]: %timeit second()
1000 loops, best of 3: 1.66 ms per loop
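
Outside IPython, roughly the same measurement can be made with the standard timeit module directly. The following is a sketch rather than a rigorous benchmark: the helper best_time and its repeat and number values are assumptions picked to keep the run short, and the absolute figures will vary by machine and Python version.

```python
import timeit

# Same data shape as the session above: 3000 strings of 10000 characters.
lis = ["a" * 10000, "b" * 10000, "c" * 10000] * 1000


def first():
    for item in lis:
        if len(item) < 100:
            pass
        elif 100 < len(item) < 200:
            pass
        elif 300 < len(item) < 400:
            pass


def second():
    for item in lis:
        x = len(item)
        if x < 100:
            pass
        elif 100 < x < 200:
            pass
        elif 300 < x < 400:
            pass


def best_time(func, repeat=3, number=20):
    """Return the best per-call time in seconds over several repeats."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


t_first = best_time(first)
t_second = best_time(second)
print("first:  %.3f ms per call" % (t_first * 1e3))
print("second: %.3f ms per call" % (t_second * 1e3))
```

Taking the minimum over several repeats reduces the influence of background noise on the result.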

Use dis.dis() to see the Python bytecode disassembled into mnemonics:

In [24]: dis.dis(first)
  2           0 SETUP_LOOP             109 (to 112)
              3 LOAD_GLOBAL              0 (lis)
              6 GET_ITER            
        >>    7 FOR_ITER               101 (to 111)
             10 STORE_FAST               0 (item)

  3          13 LOAD_GLOBAL              1 (len)
             16 LOAD_FAST                0 (item)
             19 CALL_FUNCTION            1
             22 LOAD_CONST               1 (100)
             25 COMPARE_OP               0 (<)
             28 POP_JUMP_IF_FALSE       34

  4          31 JUMP_ABSOLUTE            7

  5     >>   34 LOAD_CONST               1 (100)
             37 LOAD_GLOBAL              1 (len)
             40 LOAD_FAST                0 (item)
             43 CALL_FUNCTION            1
             46 DUP_TOP             
             47 ROT_THREE           
             48 COMPARE_OP               0 (<)
             51 JUMP_IF_FALSE_OR_POP    63
             54 LOAD_CONST               2 (200)
             57 COMPARE_OP               0 (<)
             60 JUMP_FORWARD             2 (to 65)
        >>   63 ROT_TWO             
             64 POP_TOP             
        >>   65 POP_JUMP_IF_FALSE       71

  6          68 JUMP_ABSOLUTE            7

  7     >>   71 LOAD_CONST               3 (300)
             74 LOAD_GLOBAL              1 (len)
             77 LOAD_FAST                0 (item)
             80 CALL_FUNCTION            1
             83 DUP_TOP             
             84 ROT_THREE           
             85 COMPARE_OP               0 (<)
             88 JUMP_IF_FALSE_OR_POP   100
             91 LOAD_CONST               4 (400)
             94 COMPARE_OP               0 (<)
             97 JUMP_FORWARD             2 (to 102)
        >>  100 ROT_TWO             
            101 POP_TOP             
        >>  102 POP_JUMP_IF_FALSE        7

  8         105 JUMP_ABSOLUTE            7
            108 JUMP_ABSOLUTE            7
        >>  111 POP_BLOCK           
        >>  112 LOAD_CONST               0 (None)
            115 RETURN_VALUE        

In [25]: dis.dis(second)
  2           0 SETUP_LOOP             103 (to 106)
              3 LOAD_GLOBAL              0 (lis)
              6 GET_ITER            
        >>    7 FOR_ITER                95 (to 105)
             10 STORE_FAST               0 (item)

  3          13 LOAD_GLOBAL              1 (len)
             16 LOAD_FAST                0 (item)
             19 CALL_FUNCTION            1
             22 STORE_FAST               1 (x)

  4          25 LOAD_FAST                1 (x)
             28 LOAD_CONST               1 (100)
             31 COMPARE_OP               0 (<)
             34 POP_JUMP_IF_FALSE       40

  5          37 JUMP_ABSOLUTE            7

  6     >>   40 LOAD_CONST               1 (100)
             43 LOAD_FAST                1 (x)
             46 DUP_TOP             
             47 ROT_THREE           
             48 COMPARE_OP               0 (<)
             51 JUMP_IF_FALSE_OR_POP    63
             54 LOAD_CONST               2 (200)
             57 COMPARE_OP               0 (<)
             60 JUMP_FORWARD             2 (to 65)
        >>   63 ROT_TWO             
             64 POP_TOP             
        >>   65 POP_JUMP_IF_FALSE       71

  7          68 JUMP_ABSOLUTE            7

  8     >>   71 LOAD_CONST               3 (300)
             74 LOAD_FAST                1 (x)
             77 DUP_TOP             
             78 ROT_THREE           
             79 COMPARE_OP               0 (<)
             82 JUMP_IF_FALSE_OR_POP    94
             85 LOAD_CONST               4 (400)
             88 COMPARE_OP               0 (<)
             91 JUMP_FORWARD             2 (to 96)
        >>   94 ROT_TWO             
             95 POP_TOP             
        >>   96 POP_JUMP_IF_FALSE        7

  9          99 JUMP_ABSOLUTE            7
            102 JUMP_ABSOLUTE            7
        >>  105 POP_BLOCK           
        >>  106 LOAD_CONST               0 (None)
            109 RETURN_VALUE   

Python doesn't optimize things automatically the way most other languages do (unless you're using PyPy), so the second version is probably faster. But unless item has a custom __len__ implementation that takes a while, it probably won't speed things up that much either. This is the sort of micro-optimization that should be reserved for tight inner loops after profiling has indicated a problem.

You can check such things with dis.dis:

import dis

def somefunction1(item):
    if len(item) > 10:
        print(1)
    elif len(item) > 10:
        print(2)

def somefunction2(item):
    x = len(item)
    if x > 10:
        print(1)
    elif x > 10:
        print(2)

print("#1")
dis.dis(somefunction1)

print("#2")
dis.dis(somefunction2)

Interpreting the output:

#1
  4           0 LOAD_GLOBAL              0 (len)
              3 LOAD_FAST                0 (item)
              6 CALL_FUNCTION            1
              9 LOAD_CONST               1 (10)
             12 COMPARE_OP               4 (>)
             15 POP_JUMP_IF_FALSE       26
[...]
  6     >>   26 LOAD_GLOBAL              0 (len)
             29 LOAD_FAST                0 (item)
             32 CALL_FUNCTION            1
             35 LOAD_CONST               1 (10)
             38 COMPARE_OP               4 (>)
             41 POP_JUMP_IF_FALSE       52
[...]
#2
 10           0 LOAD_GLOBAL              0 (len)
              3 LOAD_FAST                0 (item)
              6 CALL_FUNCTION            1
              9 STORE_FAST               1 (x)

 11          12 LOAD_FAST                1 (x)
             15 LOAD_CONST               1 (10)
             18 COMPARE_OP               4 (>)
             21 POP_JUMP_IF_FALSE       32
[...]
 13     >>   32 LOAD_FAST                1 (x)
             35 LOAD_CONST               1 (10)
             38 COMPARE_OP               4 (>)
             41 POP_JUMP_IF_FALSE       52

You can see that in the first example, len(item) is called twice (see the two CALL_FUNCTION instructions?), whereas it is only called once in the second implementation.
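
The same check can be done programmatically instead of by reading the listing. The sketch below uses dis.get_instructions from the standard library; the functions repeated and cached and the helper count_len_loads are hypothetical names for illustration. Because call opcode names differ between Python versions, it counts the LOAD_GLOBAL instructions that fetch the name len, which precede each call.

```python
import dis


def repeated(item):
    if len(item) > 10:
        return "long"
    elif len(item) > 6:
        return "medium"
    return "short"


def cached(item):
    x = len(item)
    if x > 10:
        return "long"
    elif x > 6:
        return "medium"
    return "short"


def count_len_loads(func):
    """Count bytecode instructions that load the global name `len`."""
    return sum(
        1
        for ins in dis.get_instructions(func)
        if ins.opname == "LOAD_GLOBAL" and ins.argval == "len"
    )


print(count_len_loads(repeated))  # one load per len(item) in the source
print(count_len_loads(cached))
```

Each textual occurrence of len(item) compiles to its own lookup and call, so the count mirrors how many times the name appears in the source.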

This means that the rest of your question boils down to how len() is implemented: it is O(1) (i.e. cheap) for built-in types such as lists, but for other types, especially ones you might have built yourself, it need not be.
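
To illustrate a type where len() is not cheap, here is a minimal sketch of a toy linked list whose __len__ walks every node. The class LinkedList, its len_calls counter, and the two describe functions are all hypothetical, written only to make the number of __len__ invocations visible.

```python
class LinkedList:
    """Toy singly linked list whose length is recomputed on every len() call."""

    def __init__(self, values):
        self.head = None
        for value in reversed(list(values)):
            self.head = (value, self.head)  # node = (value, next_node)
        self.len_calls = 0

    def __len__(self):
        self.len_calls += 1
        count = 0
        node = self.head
        while node is not None:  # O(n) walk over all nodes
            count += 1
            node = node[1]
        return count


def describe_repeated(seq):
    if len(seq) > 5000:
        return "huge"
    elif len(seq) > 2000:
        return "large"
    elif len(seq) > 500:
        return "medium"
    return "small"


def describe_cached(seq):
    length = len(seq)  # single O(n) walk
    if length > 5000:
        return "huge"
    elif length > 2000:
        return "large"
    elif length > 500:
        return "medium"
    return "small"


seq = LinkedList(range(1000))
print(describe_repeated(seq), seq.len_calls)  # prints: medium 3

seq.len_calls = 0
print(describe_cached(seq), seq.len_calls)    # prints: medium 1
```

With a container like this, the repeated version does three full traversals per item where the cached version does one.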

Python does not make the two equivalent. The reason is that the two are not equivalent for an arbitrary function. Let's consider this function, x():

y = 1

def x():
    return 1

And these two tests:

>>> print(x() + y)
2
>>> print(x() + y)
2

And:

>>> hw = x()
>>> print(hw + y)
2
>>> print(hw + y)
2

These are exactly the same. However, what if our function has side effects?

y = 1

def x():
    global y
    y += 1
    return 1

The first case:

>>> print(x() + y)
3
>>> print(x() + y)
4

The second case:

>>> hw = x()
>>> print(hw + y)
3
>>> print(hw + y)
3 

You can see that this optimization only works if the function has no side effects; otherwise it can change the program's behavior. As Python can't tell whether a function has side effects, it can't do this optimization.

As such, it makes sense to store the value locally and use it repeatedly rather than calling the function again and again, although in practice it is highly unlikely to matter, as the difference will be tiny. That said, it is also much more readable and means you don't have to repeat yourself, so it's generally a good idea to code that way.

