Why is int(len(str(k))) faster than a while loop?

I am wondering why this function:

def digit(k):
    return len(str(k))

is faster than this one:

def digit(k):
    i = 0
    while k != 0:
        k = k // 10
        i += 1
    return i

And why is it the opposite in C, for example?

Let's look at what happens if we take your Python code and translate it as literally as possible to C. We can do that very easily with Cython:

# save this in a file named "testmod.pyx" and compile it with Cython and a
# C compiler - details vary depending on OS and Python installation

from libc.stdio cimport snprintf
from libc.string cimport strlen

def c_digit_loop(k_):
    cdef unsigned int k = k_
    cdef int i = 0
    while k != 0:
        k = k // 10
        i += 1
    return i

def c_digit_str(k_):
    cdef unsigned int k = k_
    cdef char strbuf[32] # more than enough for any 'unsigned int'
    snprintf(strbuf, sizeof(strbuf), "%u", k)
    return strlen(strbuf)

The machine code you get from this is not as optimal as it could be, but it's close enough for a quick test. This allows us to compare performance directly using timeit, like this:

# save this in a file named 'test.py' and run it using the
# same CPython you compiled testmod.pyx against

import timeit
from testmod import c_digit_loop, c_digit_str

def py_digit_loop(k):
    i = 0
    while k != 0:
        k = k // 10
        i += 1
    return i

def py_digit_str(k):
    return len(str(k))

def test1(name):
    print(name, timeit.timeit(name+"(1234567)", "from __main__ import "+name,
                              number=10000))

test1("py_digit_loop")
test1("py_digit_str")
test1("c_digit_str")
test1("c_digit_loop")

When I run this program, this is the output I get on the computer where I'm typing this. I've manually lined up the numbers to make them easier to compare by eye. Each figure is the total time in seconds for 10000 calls.

py_digit_loop 0.004024484000183293
py_digit_str  0.0020454510013223626
c_digit_str   0.0009924650003085844
c_digit_loop  0.00025072999960684683

So that confirms your original assertion: in Python the loop is slower than converting to a string, but in C it's the other way around. But notice that converting to a string in C is still faster than converting to a string in Python.
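To see whether the Python ranking holds for other input sizes, the same timeit comparison can be repeated over several magnitudes. This is only an illustrative sketch: the helper name compare and the chosen inputs are assumptions, not part of the original test, and the absolute numbers will differ by machine and Python version.

```python
import timeit

def py_digit_loop(k):
    i = 0
    while k != 0:
        k = k // 10
        i += 1
    return i

def py_digit_str(k):
    return len(str(k))

def compare(k, number=10000):
    """Hypothetical helper: time both functions on the same input k."""
    loop_t = timeit.timeit(lambda: py_digit_loop(k), number=number)
    str_t = timeit.timeit(lambda: py_digit_str(k), number=number)
    return loop_t, str_t

# Inputs with 1, 7 and 19 digits (arbitrary choices for illustration).
for k in (7, 1234567, 10**18):
    loop_t, str_t = compare(k)
    print(f"{k:>20} loop={loop_t:.6f}s str={str_t:.6f}s")
```

Note that the two functions only agree for positive integers. For k = 0 the loop returns 0 while len(str(0)) is 1, and for negative k the loop never terminates, because -1 // 10 is -1 under Python's floor division.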

To know precisely why this is happening we would need to dig deeper into the guts of the Python interpreter than I feel like doing this morning, but I already know enough about them to give you an outline. The CPython interpreter is not very efficient. Even operations on small integers involve reference counting and construction of scratch objects on the heap. Your loop that does basic arithmetic in Python requires one or two scratch objects per iteration (depending on whether 0, 1, 2, ... are "interned"). Doing the calculation by converting to a string and taking its length creates only one temporary object, the string, for the whole calculation. For both of the Python implementations, the bookkeeping involved with these scratch objects dwarfs the cost of the actual calculation.
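For a rough feel of the interpreter work described above, the standard dis module can list the bytecode CPython runs for each version. This is a sketch rather than a measurement: the counts vary between CPython versions, the helper name instruction_count is made up for illustration, and a bytecode count says nothing about how expensive each instruction is.

```python
import dis

def py_digit_loop(k):
    i = 0
    while k != 0:
        k = k // 10
        i += 1
    return i

def py_digit_str(k):
    return len(str(k))

def instruction_count(func):
    """Hypothetical helper: number of bytecode instructions in func."""
    return sum(1 for _ in dis.get_instructions(func))

print("loop:", instruction_count(py_digit_loop), "instructions (static)")
print("str: ", instruction_count(py_digit_str), "instructions (static)")

# Uncomment to read the full listing for the loop version.
# dis.dis(py_digit_loop)
```

The static count understates the difference: the instructions in the loop body run once per digit, and both k // 10 and i += 1 yield an integer object each time, whereas the string version runs its few instructions once and creates a single temporary string.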

The C string-based implementation performs almost exactly the same steps as the Python string-based implementation, but its scratch object is a char array on the stack, not a full-fledged Python string object. That difference all by itself apparently cuts the running time roughly in half (about 0.0010 s versus 0.0020 s in the output above).

The C loop-based implementation compiles down to eight machine instructions for the actual loop. No memory accesses. Not even a hardware divide instruction (that's the magic of strength reduction). And then there are hundreds more instructions dealing with the Python object model. Most of those 0.00025 seconds are still overhead.
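Strength reduction here means that the compiler replaces division by the constant 10 with a multiplication and a shift. A commonly used form for 32-bit unsigned values multiplies by 0xCCCCCCCD and shifts right by 35 bits; the sketch below checks that identity in Python. Whether the compiler in the test above emitted exactly this sequence is an assumption, and the function names are hypothetical.

```python
MAGIC = 0xCCCCCCCD  # ceil(2**35 / 10)
SHIFT = 35

def div10_strength_reduced(k):
    """Divide a 32-bit unsigned value by 10 using multiply and shift only."""
    if not 0 <= k < 2**32:
        raise ValueError("k must fit in an unsigned 32-bit integer")
    return (k * MAGIC) >> SHIFT

def digit_loop_no_divide(k):
    """Same digit-counting loop, with k // 10 replaced by the reduced form."""
    i = 0
    while k != 0:
        k = div10_strength_reduced(k)
        i += 1
    return i

# Spot-check the identity against ordinary floor division.
for k in (0, 9, 10, 99, 1234567, 2**32 - 1):
    assert div10_strength_reduced(k) == k // 10

print(digit_loop_no_divide(1234567))  # prints 7
```

In C the intermediate product needs a 64-bit register; Python integers are arbitrary precision, so the sketch does not have to handle that. In pure Python this rewrite is not expected to be faster, since each multiply and shift still goes through the object model described earlier.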
