In this example, the behaviour of floor differs and I do not understand why:
printf("floor(34000000.535 * 100 + 0.5) : %lf \n", floor(34000000.535 * 100 + 0.5));
printf("floor(33000000.535 * 100 + 0.5) : %lf \n", floor(33000000.535 * 100 + 0.5));
The output for this code is:
floor(34000000.535 * 100 + 0.5) : 3400000053.000000
floor(33000000.535 * 100 + 0.5) : 3300000054.000000
Why is the first result not equal to 3400000054.0, as one would expect?
double in C does not represent every possible number that can be expressed in text. A double can typically represent about 2^64 different numbers. Neither 34000000.535 nor 33000000.535 is in that set when double is encoded as a binary floating-point number. Instead, the closest representable number is used.
Text            Closest double
34000000.535    34000000.534999996423...
33000000.535    33000000.535000000149...
With double as a binary floating-point number, multiplying by a non-power-of-2, like 100.0, can introduce additional rounding differences. Yet in these cases the products still end up on either side of the rounding boundary: the first is just below xxx.5, while the second is not. Adding 0.5, a simple power of 2, does not incur rounding issues, as the value is not extreme compared to 3x00000053.5.
Printing the intermediate results at higher precision shows the step-by-step process.
#include <stdio.h>
#include <float.h>
#include <math.h>

void fma_test(double a, double b, double c) {
    int n = DBL_DIG + 3;
    printf("a b c %.*e %.*e %.*e\n", n, a, n, b, n, c);
    printf("a*b %.*e\n", n, a*b);
    printf("a*b+c %.*e\n", n, a*b+c);
    printf("a*b+c %.*e\n", n, floor(a*b+c));
    puts("");
}

int main(void) {
    fma_test(34000000.535, 100, 0.5);
    fma_test(33000000.535, 100, 0.5);
}
Output
a b c 3.400000053499999642e+07 1.000000000000000000e+02 5.000000000000000000e-01
a*b 3.400000053499999523e+09
a*b+c 3.400000053999999523e+09
a*b+c 3.400000053000000000e+09
a b c 3.300000053500000015e+07 1.000000000000000000e+02 5.000000000000000000e-01
a*b 3.300000053500000000e+09
a*b+c 3.300000054000000000e+09
a*b+c 3.300000054000000000e+09
The issue is more complex than this simple answer suggests, as various platforms can 1) use higher-precision math like long double or 2) rarely, use a decimal floating-point double. So the code's results may vary.
This question has already been answered here.
Basically, floating-point numbers are just approximations. If we have a program like this:
float a = 0.2 + 0.3;
float b = 0.25 + 0.25;
if (a == b) {
    // might happen
}
if (a != b) {
    // also might happen
}
The only thing guaranteed is that the difference a - b is relatively small.
Using the code that shows the representation of floats in memory as a sum of terms, we get:
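// xx() and yy() are presumably the helpers from the code referred to above (not reproduced here)
int main(void)
{
    float x = floor(34000000.535 * 100 + 0.5);
    float y = floor(33000000.535 * 100 + 0.5);
    xx(&x);  // print the sign, exponent and fraction bits
    xx(&y);
    yy(x);   // print the value as a sum of terms
    yy(y);
    return 0;
}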
This code outputs the in-memory representation of the values returned by floor in both cases.
Using the bc calculator, we can see that the approximation is indeed good, but there are some perturbations due to the math behind the floating-point representation.
Note: I set scale=20 in bc, which means each intermediate computation keeps 20 digits after the decimal point.
./a.out
1ST NUMBER=> sign:0 exponent:1 0 0 1 1 1 1 fraction:0 1 0 0 1 0 1 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 1 0
2ND NUMBER=> sign:0 exponent:1 0 0 1 1 1 1 fraction:0 1 0 0 0 1 0 0 1 0 1 1 0 0 1 0 0 0 0 0 0 0 0 1
1ST NUMBER=> positive ( 1+1/(2) +1/(16) +1/(64) +1/(256) +1/(1024) +1/(8192) +1/(16384) +1/(32768) +1/(65536) +1/(131072) +1/(262144) +1/(4194304) )*2^31
2ND NUMBER=> positive ( 1+1/(2) +1/(32) +1/(256) +1/(1024) +1/(2048) +1/(16384) +1/(8388608) )*2^31
@ bc
scale=20
( 1+1/(2) +1/(16) +1/(64) +1/(256) +1/(1024) +1/(8192) +1/(16384) +1/(32768) +1/(65536) +1/(131072) +1/(262144) +1/(4194304) )*2^31
3399999999.99999999999463129088
( 1+1/(2) +1/(32) +1/(256) +1/(1024) +1/(2048) +1/(16384) +1/(8388608) )*2^31
3299999999.99999999999731564544