Strange behavior when casting an int to float in C

Question

I have a question about the output of the following C program. I compiled it with both Visual C++ 6.0 and MinGW32 (gcc 3.4.2).

#include <stdio.h>

int main() {
    int x = 2147483647;
    printf("%f\n", (float)2147483647);
    printf("%f\n", (float)x);
    return 0;
}

The output is:

2147483648.000000
2147483647.000000

My question is: why do the two lines differ? When the integer value 2147483647 is converted to the IEEE 754 floating-point format, it gets approximated to 2147483648.0. So I expected both lines to be 2147483648.000000.

EDIT: The value "2147483647.000000" can't be a single-precision floating-point value, since the number 2147483647 can't be represented exactly in the IEEE 754 single-precision floating-point format.

Answer 1

In both cases, the code seeks to convert from some integer type to float and then to double. The conversion to double occurs because a float value is being passed to a variadic function.

Check your setting of FLT_EVAL_METHOD; I suspect it has a value of 1 or 2 (the OP reports 2 with at least one compiler). This allows the compiler to evaluate float "... operations and constants to the range and precision" greater than float.

Your compiler optimized (float)x by converting the int directly to double, skipping the intermediate float. This improves run-time performance.

(float)2147483647 is a compile-time cast, and there the compiler followed the full int to float to double sequence, since performance is not an issue at compile time.

[Edit2] It is interesting that the C11 spec is more specific than the C99 spec, with the addition of "Except for assignment and cast ...". This implies that C99 compilers sometimes allowed the direct int to double conversion without first going through float, and that C11 was amended to clearly disallow skipping a cast.

With C11 formally excluding this behavior, modern compilers should not do this, but older ones, like the OP's, might; by C11 standards that is a bug. Unless some other C99 or C89 specification is found to say otherwise, this appears to be allowable compiler behavior.

[Edit] Taking the comments by @Keith Thompson, @tmyklebu and @Matt McNabb together, the compiler, even with a non-zero FLT_EVAL_METHOD, should be expected to produce 2147483648.0.... Thus either a compiler optimization flag is explicitly overriding correct behavior or the compiler has a corner-case bug.

C99dr §5.2.4.2.2 8 The values of operations with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type. The use of evaluation formats is characterized by the implementation-defined value of FLT_EVAL_METHOD:

-1 indeterminable;

0 evaluate all operations and constants just to the range and precision of the type;

1 evaluate operations and constants of type float and double to the range and precision of the double type, evaluate long double operations and constants to the range and precision of the long double type;

2 evaluate all operations and constants to the range and precision of the long double type.

C11dr §5.2.4.2.2 9 Except for assignment and cast (which remove all extra range and precision), the values yielded by operators with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type. The use of evaluation formats is characterized by the implementation-defined value of FLT_EVAL_METHOD:

-1 (Same as C99)

0 (Same as C99)

1 (Same as C99)

2 (Same as C99)

Answer 2

This is certainly a compiler bug. From the C11 standard we have the following guarantees (C99 was similar):

Types have a set of representable values (implied)
All values representable by float are also representable by double (6.2.5/10)
Converting float to double does not change the value (6.3.1.5/1)
Casting int to float, when the int value is in the set of representable values for float, gives that value.
Casting int to float, when the magnitude of the int value is less than FLT_MAX and the int is not a representable value for float, causes either the next-highest or next-lowest float value to be selected, and which one is selected is implementation-defined. (6.3.1.4/2)

The third of these points guarantees that the float value supplied to printf is not modified by the default argument promotions.

If 2147483647 is representable in float, then (float)x and (float)2147483647 must give 2147483647.000000.

If 2147483647 is not representable in float, then (float)x and (float)2147483647 must each give either the next-highest or the next-lowest float. They don't both have to make the same selection. But this means that a printout of 2147483647.000000 is not permitted ¹; each must be either the higher value or the lower value.

¹ Well, it's theoretically possible that the next-lowest float was 2147483646.9999999..., so that when printf displays the value with 6-digit precision it is rounded to give what was seen. But this isn't true in IEEE754, and you could easily experiment to rule out this possibility.

Answer 3

On the first printf, the conversion from integer to float is done by the compiler. On the second one, it is done by the C runtime library. There is no particular reason why the two should produce identical answers at the limits of their precision.

Answer 4

Visual C++ 6.0 was released last century, and I believe it predates standard C++. It is wholly unsurprising that VC++ 6.0 exhibits broken behaviour.

You'll also note that gcc-3.4.2 is from 2004. Indeed, you're using a 32-bit compiler. gcc on x86 plays rather fast and loose with floating-point math. This may technically be justified by the C standard if gcc sets FLT_EVAL_METHOD to something nonzero.

Answer 5

Some of you said that it's an optimization bug, but I kind of disagree. I think it's a reasonable floating-point precision error and a good example of how floating point works.

http://ideone.com/Ssw8GR

Maybe the OP could paste my program onto their computer, compile it with their compiler, and see what happens. Or try:

http://ideone.com/OGypBC

(with explicit float conversion).

Anyway, if we calculate the relative error, it is only 4.656612875245797e-10, which should be considered pretty precise.

It could also be related to how printf chooses to format the value.

5 answers

Answer 1: score 11, accepted, 2014-11-24 22:16:56
Answer 2: score 7, 2014-11-24 22:58:37
Answer 3: score 2, 2014-11-24 20:27:40
Answer 4: score 0, 2014-11-24 22:24:50
Answer 5: score -1, 2014-11-24 22:28:58
