
IEEE 754 and float precision

Original 2013-07-24 16:22:33 6 4 c/ floating-point/ cpu/ 32bit-64bit/ ieee-754

Does having a 32-bit or a 64-bit CPU make any difference to the amount of precision that IEEE 754 provides?

I mean, when programming in C, do the sizes of float, double and long double differ between a 32-bit and a 64-bit CPU?

4 answers

In most architectures that use IEEE-754, float and double are exactly 32-bit and 64-bit types, corresponding to single and double precision respectively. Therefore the precision is the same whether you're on a 32-bit or a 64-bit computer. The exceptions are some microcontrollers with non-standard-compliant C compilers, where double and float are the same type and both contain 32 bits.

On the other hand, long double support varies by system. On x86, most implementations use the hardware 80-bit extended precision type (often padded to 12 or 16 bytes to maintain alignment), except MSVC, where long double has the same representation as double. On other architectures, long double is often implemented as one of:

The same type as double, or
128-bit IEEE 754-2008 quadruple precision, or
128-bit double-double

While quadruple precision (the second option) increases both range and precision significantly compared to double, it is also often significantly slower because it usually lacks hardware support.

The double-double method yields a type with the same range as double but about twice the precision. Its advantage is that it builds on hardware double support, i.e. it doesn't have to be implemented entirely in software the way quadruple precision usually is. However, it is not IEEE-754 compliant.

If you're doing a lot of math on x86 or ARM, moving to 64-bit helps because of the larger number of registers and because SSE2/NEON is available by default, which improves performance compared to the 32-bit version. This is unlike most other architectures, where 64-bit programs often run slower due to bigger pointers.

On most 32-bit and 64-bit machines, float is IEEE-754 32-bit floating point and double is IEEE-754 64-bit floating point. Some implementations might use the IEEE-754 80-bit type as double (or long double).

No, there is no difference. You can confirm this by checking sizeof(float) on both architectures. If you need greater precision, use double.

Assuming float and double map to IEEE-754 single-precision and double-precision numbers respectively, then no, there is no difference.

long double may be a different story, however, since compilers may choose to pad it to an even size.