Why is floating-point division in Python faster with smaller numbers?

In the process of answering this question, I came across something I couldn't explain.

Given the following Python 3.5 code:

import time

def di(n):
    for i in range(10000000): n / 101

i = 10
while i < 1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000:
    start = time.perf_counter()
    di(i)
    end = time.perf_counter()
    print("On " + str(i) + " " + str(end-start))
    i *= 10000

The output is:

On 10 0.546889
On 100000 0.545004
On 1000000000 0.5454929999999998
On 10000000000000 0.5519709999999998
On 100000000000000000 1.330797
On 1000000000000000000000 1.31053
On 10000000000000000000000000 1.3393129999999998
On 100000000000000000000000000000 1.3524339999999997
On 1000000000000000000000000000000000 1.3817269999999997
On 10000000000000000000000000000000000000 1.3412670000000002
On 100000000000000000000000000000000000000000 1.3358929999999987
On 1000000000000000000000000000000000000000000000 1.3773859999999996
On 10000000000000000000000000000000000000000000000000 1.3326890000000002
On 100000000000000000000000000000000000000000000000000000 1.3704769999999993
On 1000000000000000000000000000000000000000000000000000000000 1.3235019999999995
On 10000000000000000000000000000000000000000000000000000000000000 1.357647
On 100000000000000000000000000000000000000000000000000000000000000000 1.3341190000000012
On 1000000000000000000000000000000000000000000000000000000000000000000000 1.326544000000002
On 10000000000000000000000000000000000000000000000000000000000000000000000000 1.3671139999999973
On 100000000000000000000000000000000000000000000000000000000000000000000000000000 1.3630120000000012
On 1000000000000000000000000000000000000000000000000000000000000000000000000000000000 1.3600200000000022
On 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000 1.3189189999999975
On 100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 1.3503469999999993

As you can see, the timings fall into roughly two groups: one for smaller numbers, and one for larger numbers.

The same thing happens with Python 2.7, using the following function to preserve semantics:

def di(n):
    for i in xrange(10000000): n / 101.0

On the same machine, I get:

On 10 0.617427
On 100000 0.61805
On 1000000000 0.6366
On 10000000000000 0.620919
On 100000000000000000 0.616695
On 1000000000000000000000 0.927353
On 10000000000000000000000000 1.007156
On 100000000000000000000000000000 0.98597
On 1000000000000000000000000000000000 0.99258
On 10000000000000000000000000000000000000 0.966753
On 100000000000000000000000000000000000000000 0.992684
On 1000000000000000000000000000000000000000000000 0.991711
On 10000000000000000000000000000000000000000000000000 0.994703
On 100000000000000000000000000000000000000000000000000000 0.978877
On 1000000000000000000000000000000000000000000000000000000000 0.982035
On 10000000000000000000000000000000000000000000000000000000000000 0.973266
On 100000000000000000000000000000000000000000000000000000000000000000 0.977911
On 1000000000000000000000000000000000000000000000000000000000000000000000 0.996857
On 10000000000000000000000000000000000000000000000000000000000000000000000000 0.972555
On 100000000000000000000000000000000000000000000000000000000000000000000000000000 0.985676
On 1000000000000000000000000000000000000000000000000000000000000000000000000000000000 0.987412
On 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0.997207
On 100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 0.970129

Why is there this consistent difference between floating-point division of smaller vs. larger numbers? Does it have to do with Python internally using floats for smaller numbers and doubles for larger ones?

It has more to do with Python storing exact integers as Bignums.

In Python 2.7, computing an integer a divided by a float fb (a / fb) starts by converting the integer to a float. If the integer is stored as a Bignum [Note 1], this conversion takes longer. So it is not the division that has differential cost; it is the conversion of the integer (possibly a Bignum) to a double.
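The conversion step can also be observed from Python 3, where int / float still converts the integer operand to a double first. The following is only a minimal sketch: the names a, fb and INT64_MAX are illustrative, and the 64-bit width of Python 2's int is the assumption stated in Note 1 (a C long can be narrower on other platforms).

```python
# int / float first converts the int operand to a double,
# so the result matches an explicit float() conversion.
a = 10**21
fb = 101.0
same = (a / fb == float(a) / fb)
print(same)  # True

# Assumption: Python 2's int is a 64-bit C long, as on the machine in the question.
# Then the largest plain int is 2**63 - 1; anything larger is a long (Bignum).
INT64_MAX = 2**63 - 1
print(10**17 <= INT64_MAX)  # True: still a machine integer in Python 2.7
print(10**21 <= INT64_MAX)  # False: already a Bignum
```

Under that assumption, the boundary lines up with the jump between 10^17 and 10^21 in the Python 2.7 timings.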

Python 3 does the same computation for integer a / float fb, but for integer a / integer b it tries to compute the closest representable result, which might differ slightly from the naive float(a) / float(b). (This is similar to the classic double-rounding problem.)
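To see that the two computations really can disagree, one option is to compare them against the exact quotient using fractions.Fraction. This is a sketch for Python 3 only; count_differences and its parameters are hypothetical names made up for the illustration, and the random operands are arbitrary.

```python
import random
from fractions import Fraction

def count_differences(trials=1000, bits=64, seed=0):
    """Count cases where a / b differs from float(a) / float(b)."""
    rng = random.Random(seed)
    differing = 0
    for _ in range(trials):
        # Force the top bit so both operands really are `bits` bits long.
        a = rng.getrandbits(bits) | (1 << (bits - 1))
        b = rng.getrandbits(bits) | (1 << (bits - 1))
        exact = Fraction(a, b)
        true_div = a / b             # Python 3 int / int
        naive = float(a) / float(b)  # round both operands first, then divide
        # The int / int result is never further from the exact quotient.
        assert abs(Fraction(true_div) - exact) <= abs(Fraction(naive) - exact)
        if true_div != naive:
            differing += 1
    return differing

print(count_differences())         # 64-bit operands: some results differ
print(count_differences(bits=40))  # 40-bit operands: both are exact, no difference
```

With 40-bit operands both conversions are exact, so the naive division is already the closest representable result and the count is zero.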

If both float(a) and float(b) are exact (that is, neither a nor b needs more than 53 bits), then the naive solution works, and the result only requires dividing two double-precision floats.
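The 53-bit condition can be checked directly with int.bit_length(). In this sketch, both_fit_in_double is a hypothetical helper written for illustration, not a function from CPython.

```python
def both_fit_in_double(a, b):
    """Hypothetical helper: True when a and b are each at most 53 bits long."""
    return a.bit_length() <= 53 and b.bit_length() <= 53

print(float(2**53) == 2**53)            # True: exactly representable
print(float(2**53 + 1) == 2**53 + 1)    # False: rounded to a neighbouring double
print(both_fit_in_double(10**13, 101))  # True  (10**13 needs 44 bits)
print(both_fit_in_double(10**17, 101))  # False (10**17 needs 57 bits)
```

Since 2**53 is roughly 9.0e15, this matches the jump between 10^13 and 10^17 in the Python 3.5 timings.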

Otherwise, a multiprecision division is performed to generate the correct 53-bit mantissa (the exponent is computed separately), and the result is converted precisely to a floating-point number. There are two possibilities for this division: a fast track if b is small enough to fit in a single Bignum unit (which applies to the benchmark in the OP), and a slower, general Bignum division when b is larger.
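The size of a single Bignum unit is exposed by the interpreter as sys.int_info.bits_per_digit, so whether a divisor is eligible for the fast track can be estimated as below. This is a rough sketch; fits_in_one_digit is a hypothetical helper, not part of CPython.

```python
import sys

# CPython stores an int as an array of "digits" of this many bits each
# (typically 30 on 64-bit builds, 15 on some others).
BITS_PER_DIGIT = sys.int_info.bits_per_digit

def fits_in_one_digit(n):
    """Hypothetical helper: True when abs(n) fits in a single Bignum digit."""
    return abs(n).bit_length() <= BITS_PER_DIGIT

print(BITS_PER_DIGIT)
print(fits_in_one_digit(101))     # True: the benchmark's divisor qualifies
print(fits_in_one_digit(10**21))  # False: needs several digits
```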

In none of the above cases is the observed speed difference related to the speed with which the hardware performs floating-point division. For the original Python 3.5 test, the difference relates to whether floating-point or Bignum division is performed; for the Python 2.7 case, the difference relates to the need to convert a Bignum to a double.
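For anyone who wants to repeat the comparison on a current Python 3 interpreter, a timeit-based version of the benchmark might look like the sketch below. The helper name time_division and the loop count are arbitrary choices for illustration, and absolute timings will differ by machine and Python version.

```python
import timeit

def time_division(n, loops=100000):
    """Time `loops` evaluations of n / 101 and return the elapsed seconds."""
    return timeit.timeit("n / 101", globals={"n": n}, number=loops)

# Two operands below 2**53 and two above it.
for exponent in (1, 13, 17, 21):
    n = 10**exponent
    print("On 10**%d: %.4f s" % (exponent, time_division(n)))
```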

Thanks to @MarkDickinson for the clarification, and for the pointer to the source code (with a long and useful comment) which implements the algorithm.


Notes

  1. In Python 3, integers are always stored as Bignums. Python 2 has separate types for int (64-bit integers) and long (Bignums). In practice, since Python 3 often uses optimized algorithms when the Bignum has only one "leg", the difference between "small" and "big" integers is still noticeable.

It's the larger integer format, as @rici said. I changed the initial 10 to 10.0; here is the result, with no significant change in timing as the numbers grow.

On 10.0 1.12
On 100000.0 0.79
On 1000000000.0 0.79
On 1e+13 0.77
On 1e+17 0.78
On 1e+21 0.79
On 1e+25 0.77
On 1e+29 0.8
On 1e+33 0.77
On 1e+37 0.8
On 1e+41 0.78
On 1e+45 0.78
On 1e+49 0.78
On 1e+53 0.79
On 1e+57 0.77
On 1e+61 0.8
On 1e+65 0.77
On 1e+69 0.79
On 1e+73 0.77
On 1e+77 0.78
On 1e+81 0.78
On 1e+85 0.78
On 1e+89 0.77
