Confused about "double to long long" in C++

Question

My code:

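#include <cmath>   // std::pow
#include <cstdio>  // std::printf

int main()
{
    // Each initializer is computed in double, then converted to the target type.
    long long a = std::pow(2,63) - 1;
    long long b = std::pow(2,63);
    double c  = std::pow(2,63) - 1;
    double d = std::pow(2,63);
    std::printf("%lld %lld \n%f %f \n%lld %lld\n", a, b, c, d, (long long)c, (long long)d);

    return 0;
}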

The output (Code::Blocks with gcc on Windows 7 x64) is:

9223372036854775807 9223372036854775807
9223372036854775800.000000 9223372036854775800.000000
-9223372036854775808 -9223372036854775808

Question:

Why is a == b?

I know that c == d because of the precision of double.

But why are (long long)c and (long long)d not 9223372036854775800?

And why is (long long)c != a and (long long)d != b?

Answer 1

pow(2,63) - 1 is computed entirely in double-precision floating-point arithmetic. In particular, the -1 is converted to -1.0, which is too small to change a value of that magnitude.

Answer 2

Why is a == b? I know that c == d because of the precision of double.

For exactly the same reason. There are no overloads of pow for integer types, so the arithmetic is done using double. Since double typically has a 53-bit significand (52 bits stored explicitly), adding 1 to or subtracting 1 from a value as large as 2⁶³ has no effect.

Why are (long long)c and (long long)d not 9223372036854775800?

Because long long is a 64-bit signed type, and the maximum representable value is 2⁶³ - 1. c and d might both have the value 2⁶³ (or even a slightly larger value), which is out of range. On a typical two's-complement platform, this is likely to overflow to give a value around -2⁶³, as you observe. But note that this is undefined behaviour; you cannot rely on anything if a floating-point conversion overflows.

Why is (long long)c != a and (long long)d != b?

I don't know; for me, a and b have the same large negative values. It looks like some quirk of your implementation caused a and b to end up with the value 2⁶³ - 1 rather than the expected 2⁶³. As always when dealing with floating-point numbers, you should expect small rounding errors like that.

You could get the exact result by using integer arithmetic:

long long a = (1ULL << 63) - 1;
unsigned long long b = 1ULL << 63;

Note the use of unsigned arithmetic since, as mentioned above, the signed (1LL << 63) would overflow.

Answer 3

Why is a == b?

Because your compiler (gcc) calculated the values to initialize a and b with, and found (proved?) that both matched or exceeded the maximum possible value for a long long, so it initialized both with that maximum value, LLONG_MAX (0x7FFFFFFFFFFFFFFF, or 9223372036854775807 on your platform).

Note that (as pointed out by Pascal Cuoq) this is undefined behaviour, caused by an overflow when converting a double to a long long while initializing a and b. While gcc deals with this as described above, other compilers can deal with it differently.

I know that c == d because of the precision of double.

The reason c and d hold the same value is indeed the precision of a double:

pow(2, 63) can be accurately represented with fraction 1 and exponent 63
pow(2, 63) - 1 cannot be accurately represented

The reason it's not showing 9223372036854775808 (the precise value stored in c and d) is the printf precision, which on your platform apparently only shows 17 digits. You might be able to force it to show more using e.g. %20.0f, but on Windows that will likely not make a difference due to this bug.

Why are (long long)c and (long long)d not 9223372036854775800?

Because c and d hold the value 9223372036854775808, or 0x8000000000000000, which when printed as a signed value becomes -9223372036854775808.

Note that this is again undefined behaviour (due to signed overflow).

Why is (long long)c != a and (long long)d != b?

Because they were calculated in different ways. a and b were calculated by the compiler, while (long long)c and (long long)d were calculated at runtime.

While these different ways of calculating should normally yield the same results, we're dealing with undefined behaviour here (as explained earlier), so anything goes. In your case, the compiler's results differ from the runtime results.

Answer 4

Because pow returns a double, and double loses precision at this magnitude. That's why a == b.

Answer 5

pow(2, 63) is equivalent to pow((double) 2, (double) 63).

Indeed, C++11 26.8 [c.math] paragraph 3 says that <cmath> provides the declaration of double pow(double, double), and paragraph 11 says that (emphasis mine)

If any argument corresponding to a double parameter has type long double, then all arguments corresponding to double parameters are effectively cast to long double.

Otherwise, if any argument corresponding to a double parameter has type double or an integer type, then all arguments corresponding to double parameters are effectively cast to double.

Otherwise, all arguments corresponding to double parameters are effectively cast to float.

Now, the literals 2 and 63 are ints; therefore, pow(2, 63) is equivalent to pow((double) 2, (double) 63). The return type is then double, which doesn't have the 63 bits of precision required to "see" the difference between 2^63 and 2^63 - 1.

I recommend reading this post and the excellent answer by Howard Hinnant.

Answer 6

long long -> %lld

long double -> %Lf

double -> %f

float -> %f

int -> %d

Read Chapter 15 of "Pointers on C" for more details.

6 answers

Answer 1
4 2013-10-08 14:13:29

Answer 2
2 accepted 2013-10-08 14:57:08

Answer 3
2 2013-10-08 15:07:55

Answer 4
1 2013-10-08 14:13:01

Answer 5
0 2013-10-08 14:14:52

Answer 6
0 2013-10-09 10:02:26
