Is float slower than double? Does a 64-bit program run faster than a 32-bit program?
Is using the float type slower than using the double type?
I heard that modern Intel and AMD CPUs can do calculations with doubles faster than with floats.
What about standard math functions (sqrt, pow, log, sin, cos, etc.)? Computing them in single precision should be considerably faster because it should require fewer floating-point operations. For example, single-precision sqrt can use a simpler math formula than double-precision sqrt. Also, I heard that standard math functions are faster in 64-bit mode (when compiled and run on a 64-bit OS). What is the definitive answer on this?
The classic x86 architecture uses a floating-point unit (FPU) to perform floating-point calculations. The FPU performs all calculations in its internal registers, each of which has 80-bit precision. Every time you work with a float or a double, the variable is first loaded from memory into an internal register of the FPU. This means that there is absolutely no difference in the speed of the actual calculations, since in any case the calculations are carried out with full 80-bit precision. The only thing that might differ is the speed of loading the value from memory and storing the result back to memory. Naturally, on a 32-bit platform it might take longer to load/store a double than a float. On a 64-bit platform there shouldn't be any difference.
Modern x86 architectures support extended instruction sets (SSE/SSE2) with new instructions that can perform the very same floating-point calculations without involving the "old" FPU instructions. However, again, I wouldn't expect to see any difference in calculation speed for float and double. And since these modern platforms are 64-bit ones, the load/store speed is supposed to be the same as well.
On a different hardware platform the situation could be different. But normally a smaller floating-point type should not provide any performance benefits. The main purpose of smaller floating-point types is to save memory, not to improve performance.
Edit: (To address @MSalters' comment) What I said above applies to fundamental arithmetic operations. When it comes to library functions, the answer will depend on several implementation details. If the platform's floating-point instruction set contains an instruction that implements the functionality of the given library function, then what I said above will normally apply to that function as well (that would normally include functions like sin, cos, sqrt). For other functions, whose functionality is not directly supported in the FP instruction set, the situation might prove to be significantly different. It is quite possible that float versions of such functions can be implemented more efficiently than their double versions.
Your first question has already been answered here on SO.
Your second question is entirely dependent on the "size" of the data you are working with. It all boils down to the low-level architecture of the system and how it handles large values. 64 bits of data on a 32-bit system would require 2 cycles to access 2 registers. The same data on a 64-bit system should only take 1 cycle to access 1 register.
Everything always depends on what you're doing. I find there are no hard and fast rules, so you need to analyze the current task and choose what works best for that specific task.
From some research and empirical measurements I have made in Java:
It is also true that there may be special circumstances in which, e.g., memory bandwidth issues outweigh "raw" calculation times.
The "native" internal floating-point representation in the x86 FPU is 80 bits wide. This is different from both float (32 bits) and double (64 bits). Every time a value moves in or out of the FPU, a conversion is performed. There is only one FPU instruction that performs a sin operation, and it works on the internal 80-bit representation.
Whether this conversion is faster for float or for double depends on many factors, and must be measured for a given application.
While on most systems double will be the same speed as float for individual values, you're right that computing functions like sqrt, sin, etc. in single precision should be a lot faster than computing them in double precision. In C99, you can use the sqrtf, sinf, etc. functions even if your variables are double, and get the benefit.
Another issue I've seen mentioned is memory (and likewise storage device) bandwidth. If you have millions or billions of values to deal with, float will almost certainly be twice as fast as double, since everything will be memory-bound or I/O-bound. This is a good reason to use float as the type in an array or in on-disk storage in some cases, but I would not consider it a good reason to use float for the variables you do your computations with.
It depends on the processor. If the processor has native double-precision instructions, it'll usually be faster to just do double-precision arithmetic than to be given a float, convert it to a double, do the double-precision arithmetic, then convert it back to a float.