
How are double-precision floating-point numbers converted to single-precision floating-point format?

Original 2012-08-02 07:19:46 8 1 floating-point/ type-conversion/ ieee-754/ double-precision/ single-precision

Converting numbers from double-precision floating-point format to single-precision floating-point format results in loss of precision. What algorithm is used to perform this conversion?

Are numbers greater than 3.4028234e+38 or less than -3.4028234e+38 simply clamped to the respective limits? I suspect the conversion process is more involved than that, but I couldn't find documentation for it.

1 answer

The most common floating-point formats are the binary floating-point formats specified in the IEEE 754 standard. I will answer your question for these formats. There are also decimal floating-point formats in the new (2008) version of the standard, and there are formats other than those in IEEE 754, but the 754 binary formats are by far the most common. Some information about rounding, and links to the standard, are in this Wikipedia page.

Converting double precision to single precision is treated the same as rounding the result of any operation. (E.g., an addition, multiplication, or square root has an exact mathematical value, and that value is rounded according to the rules to produce the result returned from the operation. For purposes of conversion, the input value is the exact mathematical value, and it is rounded.)
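To see the rounding in practice, here is a minimal sketch in Python. It assumes the platform uses IEEE 754 binary64 for Python floats and binary32 for the struct module's "f" format, with the default round-to-nearest mode; the helper name to_single is made up for illustration. It only covers values whose magnitude fits in single precision.

```python
import struct

def to_single(x: float) -> float:
    """Round a Python float (a double) to single precision, then widen it back.

    Packing with the "f" format stores the value as a 4-byte single, which
    forces the rounding step; unpacking converts it back to a double exactly.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]

# 0.5 is exactly representable in single precision; the other two are not.
for value in (0.5, 0.1, 1.0 + 2.0 ** -30):
    rounded = to_single(value)
    print(f"{value!r:>22} -> {rounded!r:<22} exact: {rounded == value}")
```

Values that need more than 24 significant bits come back changed, while values such as 0.5 survive the round trip unchanged.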

Generally, the computing environment has some default rounding mode. (Various programming languages may provide ways to change the default rounding mode or to specify it individually for each operation.) The default rounding mode is commonly round-to-nearest. The others are round-toward-zero, round-toward-positive-infinity (upward), and round-toward-negative-infinity (downward).

In round-to-nearest mode, the representable number nearest the exact value is returned. If there is a tie, then the number with the even low bit (in its fraction or significand) is returned. For this purpose, infinity effectively acts as if it were the next value in the pattern of finite numbers. In single precision, the greatest finite numbers are 0x1.fffff8p127, 0x1.fffffap127, 0x1.fffffcp127, and 0x1.fffffep127. (There are 24 bits in the single-precision significand, so a step in its lowest bit is a step of 2 in the last hexadecimal digit.) For rounding purposes, infinity acts as if it were at 0x2p127, the next value in that pattern. So, if the exact result is closer to 0x1.fffffep127 (thus, less than the midpoint 0x1.ffffffp127), it is rounded to 0x1.fffffep127. If it is greater than or equal to 0x1.ffffffp127, it is rounded to infinity. The situation for negative infinity is symmetric.
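The round-to-nearest rule, including the behavior at the overflow threshold, can be sketched with exact rational arithmetic. This is only an illustration of the specified result, not how any particular hardware or library does it; the function name round_to_single_nearest and the constant names are invented for this example.

```python
import math
from fractions import Fraction

FLT_MAX = float.fromhex("0x1.fffffep127")             # largest finite single
OVERFLOW_THRESHOLD = float.fromhex("0x1.ffffffp127")  # midpoint between FLT_MAX and 2**128

def round_to_single_nearest(x: float) -> float:
    """Round a double to the nearest single-precision value, ties to even."""
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        return x
    _, e = math.frexp(abs(x))            # abs(x) == m * 2**e with 0.5 <= m < 1
    # Exponent of the spacing between neighboring singles near x:
    # 24 significant bits for normal numbers, fixed at 2**-149 for subnormals.
    ulp_exp = max(e - 24, -149)
    scaled = Fraction(abs(x)) / Fraction(2) ** ulp_exp   # exact, no rounding yet
    n = round(scaled)                    # round() on a Fraction is ties-to-even
    result = math.ldexp(n, ulp_exp)      # exact: n has at most 25 bits
    if result > FLT_MAX:                 # rounded up to 2**128, i.e. "infinity"
        result = math.inf
    return math.copysign(result, x)

just_below = float.fromhex("0x1.fffffefffffffp127")   # largest double below the threshold
print(round_to_single_nearest(just_below) == FLT_MAX)           # stays finite
print(round_to_single_nearest(OVERFLOW_THRESHOLD) == math.inf)  # tie goes to "even" 2**128
```

The tie at 0x1.ffffffp127 goes to infinity because 0x1.fffffep127 has an odd low significand bit, so the "even" neighbor is the imaginary next value 2**128.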

In round-toward-positive-infinity mode, the nearest representable value that is greater than or equal to the exact value is returned. So any value above 0x1.fffffep127 rounds to infinity. Round-toward-negative-infinity returns the nearest representable value that is less than or equal to the exact value. Round-toward-zero returns the nearest representable value in the direction toward zero. So in round-toward-zero mode, a finite value of any magnitude above 0x1.fffffep127 is rounded to 0x1.fffffep127 (with the appropriate sign) rather than to infinity.
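The directed modes can be sketched the same way. The function name round_to_single_directed and the mode strings "up", "down", and "zero" are hypothetical names chosen for this example; they are not part of any standard API. Note how out-of-range inputs are either sent to infinity or clamped to the largest finite value, depending on the direction.

```python
import math
from fractions import Fraction

FLT_MAX = float.fromhex("0x1.fffffep127")   # largest finite single

def round_to_single_directed(x: float, mode: str) -> float:
    """Round a double to single precision; mode is 'up', 'down', or 'zero'."""
    if mode not in ("up", "down", "zero"):
        raise ValueError("mode must be 'up', 'down', or 'zero'")
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        return x
    # Out-of-range inputs: infinity only when rounding away from zero.
    if x > FLT_MAX:
        return math.inf if mode == "up" else FLT_MAX
    if x < -FLT_MAX:
        return -math.inf if mode == "down" else -FLT_MAX
    _, e = math.frexp(x)
    ulp_exp = max(e - 24, -149)              # spacing of singles near x
    scaled = Fraction(x) / Fraction(2) ** ulp_exp   # exact, keeps the sign
    if mode == "up":
        n = math.ceil(scaled)
    elif mode == "down":
        n = math.floor(scaled)
    else:
        n = math.trunc(scaled)
    result = math.ldexp(n, ulp_exp)
    if result == 0.0:                        # keep the sign of a result that rounds to zero
        result = math.copysign(0.0, x)
    return result

for mode in ("up", "down", "zero"):
    print(mode, round_to_single_directed(1e40, mode), round_to_single_directed(-1e40, mode))
```

Under these assumptions, "simply reduced to the respective limits" describes what happens only when the rounding direction points back toward the finite range.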

The IEEE 754 standard only specifies the result; it does not specify the algorithm. The method used to achieve the rounding is up to each implementation.
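Since the algorithm is left open, the following is just one possible approach, sketched in Python on the raw bit patterns: re-bias the exponent, drop the extra fraction bits, and round using the discarded bits. It assumes round-to-nearest with ties to even; the function name double_to_single_bits is invented here, and the NaN handling (forcing a quiet NaN and keeping the top payload bits) is an assumption of this sketch rather than something the text above prescribes.

```python
import struct

def double_to_single_bits(x: float) -> int:
    """Return the 32-bit pattern of x rounded to single (nearest, ties to even)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = (bits >> 63) << 31
    exp = (bits >> 52) & 0x7FF            # biased double exponent
    frac = bits & ((1 << 52) - 1)         # 52 stored fraction bits

    if exp == 0x7FF:                      # infinity or NaN
        if frac == 0:
            return sign | 0x7F800000
        return sign | 0x7FC00000 | (frac >> 29)   # assumption: quiet the NaN
    if exp == 0:
        # Zero or a double subnormal: far below the smallest single
        # subnormal (2**-149), so it rounds to a signed zero.
        return sign

    sig = (1 << 52) | frac                # 53-bit significand, implicit 1 restored
    e = exp - 1023 + 127                  # exponent re-biased for single
    shift = 29                            # 52 - 23 fraction bits to discard
    if e <= 0:                            # result is subnormal (or zero) in single
        shift += 1 - e
        base = 0
    else:
        base = (e - 1) << 23              # kept still contains the implicit 1

    kept = sig >> shift
    rem = sig & ((1 << shift) - 1)        # the discarded bits
    half = 1 << (shift - 1)
    if rem > half or (rem == half and (kept & 1)):
        kept += 1                         # round up; a carry bumps the exponent

    magnitude = base + kept
    if magnitude >= 0x7F800000:           # overflowed past the largest finite single
        magnitude = 0x7F800000            # infinity
    return sign | magnitude

# Compare against the platform's own conversion for a few in-range values.
for v in (0.1, -2.5, 1e-40, 3.4028234e38):
    expected = struct.unpack("<I", struct.pack("<f", v))[0]
    print(f"{v!r:>14}: {double_to_single_bits(v):#010x} (platform: {expected:#010x})")
```

Adding the rounded significand to the shifted exponent, rather than OR-ing the fields together, lets a carry out of the significand propagate into the exponent, which also covers rounding a subnormal up to the smallest normal number and rounding the largest finite value up to infinity.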