Is a round trip through floating point always defined behavior if the floating point range is bigger?

Question

Let's say I have two arithmetic types, an integer one, I, and a floating point one, F. I also assume that std::numeric_limits<I>::max() is smaller than std::numeric_limits<F>::max().

Now, let's say I have a positive integer value i. Because the representable range of F is larger than that of I, F(i) should always be defined behavior.

However, if I have a floating point value f such that f == F(i), is I(f) well defined? In other words, is I(F(i)) always defined behavior?

Relevant section from the C++14 standard:

4.9 Floating-integral conversions [conv.fpint]

A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type. [ Note: If the destination type is bool, see 4.12. — end note ]

A prvalue of an integer type or of an unscoped enumeration type can be converted to a prvalue of a floating point type. The result is exact if possible. If the value being converted is in the range of values that can be represented but the value cannot be represented exactly, it is an implementation-defined choice of either the next lower or higher representable value. [ Note: Loss of precision occurs if the integral value cannot be represented exactly as a value of the floating type. — end note ] If the value being converted is outside the range of values that can be represented, the behavior is undefined. If the source type is bool, the value false is converted to zero and the value true is converted to one.

Answer 1

However, if I have a floating point value f such that f == F(i), is I(f) well defined? In other words, is I(F(i)) always defined behavior?

No.

Suppose that I is a signed two's complement 32 bit integer type, F is a 32 bit single precision floating point type, and i is the maximum positive integer. This value is within the range of the floating point type, but it cannot be represented exactly as a floating point number, because some of those 32 bits are used for the exponent.

Instead, the result of the conversion from integer to floating point is implementation-defined, but it is typically obtained by rounding to the closest representable value. Here that rounded value is one beyond the range of the integer type. The conversion back to integer fails (or, better said, it is undefined behavior).
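To make the rounding concrete, here is a minimal sketch that converts the largest 32 bit integer to float and reports which neighbor it landed on, without ever converting back. The struct and function names are hypothetical, and the two candidate values assume float is an IEEE 754 single precision type with a 24 bit significand.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper for illustration only.
// Assumes float is IEEE 754 binary32 (24 bit significand), so the two
// representable neighbors of 2^31 - 1 are 2^31 - 128 and 2^31.
struct MaxIntAsFloat {
    float value;        // F(i) for i = max of int32_t
    bool rounded_up;    // F(i) == 2^31, one past the integer maximum
    bool rounded_down;  // F(i) == 2^31 - 128, the next lower float
};

inline MaxIntAsFloat convert_max_int() {
    const std::int32_t i = std::numeric_limits<std::int32_t>::max();
    // Defined behavior: i lies within the range of float.
    const float f = static_cast<float>(i);
    // Deliberately no conversion back to int32_t here: if f == 2^31,
    // that conversion would be undefined behavior.
    return {f, f == 2147483648.0f, f == 2147483520.0f};
}
```

If `convert_max_int().rounded_up` is true, the implementation chose the next higher value, and `I(F(i))` would be undefined for this `i`.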

Answer 2

No.

It's possible that i == std::numeric_limits<I>::max(), but that i is not exactly representable in F.

If the value being converted is in the range of values that can be represented but the value cannot be represented exactly, it is an implementation-defined choice of either the next lower or higher representable value.

Since the next higher representable value may be chosen, it's possible that the result F(i) no longer fits into I, so the conversion back would be undefined behavior.
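One way to avoid the undefined behavior is to check the range before converting back. The following is only a sketch: `checked_trunc` is a hypothetical helper, not a standard facility, and it assumes a two's complement integer type and that 2^digits lies within the exponent range of F. Its bounds are conservative, so a few values whose truncation would fit (for example -0.5 for an unsigned target) are rejected.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: stores the truncated value of f in out and returns
// true, or returns false without converting when the result might not fit.
template <typename I, typename F>
bool checked_trunc(F f, I& out) {
    static_assert(std::is_integral<I>::value, "I must be an integer type");
    static_assert(std::is_floating_point<F>::value,
                  "F must be a floating point type");
    // 2^digits equals max() + 1. As a power of two it is exactly
    // representable in F, assuming it is within F's exponent range.
    const F upper = std::ldexp(F(1), std::numeric_limits<I>::digits);
    // Conservative lower bound: min() for two's complement signed types,
    // zero for unsigned types.
    const F lower = std::is_signed<I>::value ? -upper : F(0);
    // Written so that NaN fails the check.
    if (!(f >= lower && f < upper)) {
        return false;
    }
    out = static_cast<I>(f);  // the truncated value fits, so this is defined
    return true;
}
```

With `std::int32_t out;`, a call such as `checked_trunc(F(i), out)` returns false instead of invoking undefined behavior when `F(i)` was rounded up past the integer maximum.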

Answer 3

No. Regardless of the standard, you cannot expect that in general this conversion will return your original integer. It doesn't make sense mathematically. But if you read what you quoted, the standard clearly indicates the possibility of a loss of precision upon converting from int to float.

Suppose your types I and F use the same number of bits. All of the bits of I (save possibly one that stores the sign) are used to specify the absolute value of the number. On the other hand, in F, some bits are used to specify the exponent and some are used for the significand. The range will be greater because of the exponent. But the significand will have less precision because fewer bits are devoted to it.

Just as a test, I printed
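The bit-count argument can be checked at compile time with std::numeric_limits. This is a sketch under stated assumptions: `all_values_exact` is a hypothetical name, F is assumed to use radix 2, and the range of I is assumed to lie within the exponent range of F.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true when every value of integer type I converts to
// floating point type F exactly, so that the round trip I(F(i)) is defined
// and returns i. For integer types, digits is the number of value bits
// (sign bit excluded); for floating point types it is the significand width.
template <typename I, typename F>
constexpr bool all_values_exact() {
    static_assert(std::numeric_limits<F>::radix == 2,
                  "sketch assumes a binary floating point type");
    return std::numeric_limits<I>::digits <= std::numeric_limits<F>::digits;
}
```

On a typical platform where float has a 24 bit significand, `all_values_exact<std::int32_t, float>()` is false because a 32 bit signed integer has 31 value bits, while the same query with double (53 bit significand) is true.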

std::numeric_limits<int>::max();
std::numeric_limits<float>::max();

I then converted the first number to float and back again. The max float had an exponent of 38, and the max int had 10 digits, so clearly float has a larger range. But upon converting the max int to float and back, I went from 2147483647 to -2147483648. So it seems the number was incremented by one unit and wrapped around to the negative side.

I didn't check how many bits are actually used for float on my system, but it at least demonstrates the loss of precision, and it shows that gcc "rounded up".

3 answers

Answer 1
4 2015-04-29 00:45:32

Answer 2
-1 Accepted 2015-04-29 00:52:05

Answer 3
-1 2015-04-29 00:58:19
