printing the integral part of a floating point number

Question

I am trying to figure out how to print floating point numbers without using library functions. Printing the fractional part of a floating point number turned out to be quite easy. Printing the integral part is harder:

#include <assert.h>
#include <math.h>
#include <stdio.h>

static const int base = 2;
static const char hex[] = "0123456789abcdef";

void print_integral_part(float value)
{
    assert(value >= 0);
    char a[129]; // worst case is 128 digits for base 2 plus NUL
    char * p = a + 128;
    *p = 0;
    do
    {
        int digit = fmod(value, base); // lowest digit in the current base
        value /= base;                 // rounds when base is not a power of 2
        assert(p > a);
        *--p = hex[digit];             // digits are produced right to left
    } while (value >= 1);
    printf("%s", p);
}

Printing the integral part of FLT_MAX works flawlessly with base 2 and base 16:

11111111111111111111111100000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000 (base 2)

ffffff00000000000000000000000000 (base 16)

However, printing in base 10 results in errors after the first 7 digits:

340282368002860660002286082464244022240 (my own function)
340282346638528859811704183484516925440 (printf)

I assume this is a result of the division by 10. It gets better if I use double instead of float:

340282346638528986604286022844204804240 (my own function)
340282346638528859811704183484516925440 (printf)

(If you don't believe printf, enter 2^128-2^104 into Wolfram Alpha. It is correct.)

Now, how does printf manage to print the correct result? Does it use some bigint facilities internally? Or is there some floating point trick I am missing?

Answer 1

I believe the problem lies in value /= base;. Do not forget that 1/10 is not a finite fraction in the binary system, so the quotient usually has to be rounded and the calculation is rarely exact. I also assume some error will occur in fmod for the same reason.

printf presumably first computes the integral part exactly and then converts it to decimal (if I understand correctly how you print the integral part).

Answer 2

Edit: Read Unni's answer first. These results come from http://codepad.org/TLqQzLO3

void print_integral_part(float value)
{
    printf("input : %f\n", value);
    char a[129]; // worst case is 128 digits for base 2 plus NUL
    char * p = a + 128;
    *p = 0;
    do
    {
        int digit = fmod(value, base);
        value /= base;
        printf("interm: %f\n", value);
        *--p = hex[digit];
    } while (value >= 1);
    printf("result: %s\n", p);
}

print_integral_part(3.40282347e+38F);

Run it to see how messed up your value gets from the value /= base operation:

input : 340282346638528859811704183484516925440.000000
interm: 34028234663852885981170418348451692544.000000
interm: 3402823466385288480057879763104038912.000000
interm: 340282359315034876851393457419190272.000000
interm: 34028234346940236846450271659753472.000000
interm: 3402823335658820218996583884128256.000000
interm: 340282327376181848531187106054144.000000
interm: 34028232737618183051678859657216.000000
interm: 3402823225404785588136713388032.000000
interm: 340282334629736780292710989824.000000
interm: 34028231951816403862828351488.000000
interm: 3402823242405304929106264064.000000
interm: 340282336046446683592065024.000000
interm: 34028232866774907300610048.000000
interm: 3402823378911210969759744.000000
interm: 340282332126513595416576.000000
interm: 34028233212651357863936.000000
interm: 3402823276229139890176.000000
interm: 340282333252413489152.000000
interm: 34028234732616232960.000000
interm: 3402823561222553600.000000
interm: 340282356122255360.000000
interm: 34028235612225536.000000
interm: 3402823561222553.500000
interm: 340282366859673.625000
interm: 34028237357056.000000
interm: 3402823735705.600098
interm: 340282363084.799988
interm: 34028237619.200001
interm: 3402823680.000000
interm: 340282368.000000
interm: 34028236.800000
interm: 3402823.600000
interm: 340282.350000
interm: 34028.234375
interm: 3402.823438
interm: 340.282349
interm: 34.028235
interm: 3.402824
interm: 0.340282
result: 340282368002860660002286082464244022240

When in doubt, throw more printfs at it ;)
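The trace above can also be checked mechanically. The following is only a sketch: count_inexact_divisions is a made-up helper name, and it assumes float is IEEE single precision and double is IEEE double precision, so that the product of a float and a small base is exact in double.

```c
#include <assert.h>

/* Hypothetical helper: repeats the loop from the question and counts
   how many times value / base had to be rounded. */
int count_inexact_divisions(float value, int base)
{
    assert(value >= 0);
    assert(base >= 2 && base <= 16);

    int inexact = 0;
    while (value >= 1) {
        float quotient = value / (float)base;
        /* quotient has 24 significant bits and base at most 5, so the
           product is exact in double. If it differs from value, the
           division was rounded. */
        if ((double)quotient * base != (double)value)
            ++inexact;
        value = quotient;
    }
    return inexact;
}
```

Under those assumptions, calling it with FLT_MAX should give 0 for base 2 and base 16 and a positive count for base 10, which matches the observation in the question that only the base 10 output goes wrong.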

Answer 3

According to the IEEE single precision format, a float variable stores only 24 bits of significand at any time. This means that at most about 7 decimal digits are stored in the float.

The rest of the number's magnitude is stored in the exponent. FLT_MAX is defined as 3.402823466e+38F. So which digit should get printed after the 10th significant digit is not defined anywhere.

With the Visual C++ 2010 compiler, I get the output 340282346638528860000000000000000000000.000000, which is the only valid output.

So, initially we have these valid digits: 3402823466. After the first division we have only 0402823466. The system then needs to get rid of the leading 0 and introduce a new digit on the right. In ideal integer division, that digit is 0. Because you are doing floating point division (value /= base;), the system is getting some other digit to fill that position.

So, in my opinion, printf could be assigning the significant digits above to an integer and working with that.
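To see which 24 bits and which exponent are actually stored, the value can be split with frexpf and ldexpf from <math.h>. This is a minimal sketch, assuming IEEE single precision; decompose_float is a made-up name used only for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: splits a positive finite float so that
   value == mantissa * 2^exponent, with mantissa an integer below 2^24. */
void decompose_float(float value, uint32_t *mantissa, int *exponent)
{
    assert(value > 0 && isfinite(value));

    int e;
    float fraction = frexpf(value, &e);           /* value == fraction * 2^e, fraction in [0.5, 1) */
    *mantissa = (uint32_t)ldexpf(fraction, 24);   /* scale the 24 significant bits to an integer */
    *exponent = e - 24;
}
```

For FLT_MAX this should yield the mantissa 0xffffff and the exponent 104, which agrees with the base 16 output in the question (ffffff followed by 26 zeros) and with the 2^128-2^104 the asker mentions.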

Answer 4

It appears that the workhorse for the float-to-string conversion is the dtoa() function. See dtoa.c in newlib for how they do it.

Now, how does printf manage to print the correct result?

I think it is close to magic. At least the source looks like some kind of dark incantation.

Does it use some bigint facilities internally?

Yes, search for _Bigint in the linked source file.

Or is there some floating point trick I am missing?

Likely.

Answer 5

Let's explain this one more time. After the integer part has been printed (exactly, without any rounding other than chopping towards 0), it's time for the fractional bits.

Start with a string of bytes (say 100 for starters) containing binary zeros. If the first bit to the right of the binary point in the fp value is set, that means that 0.5 (2^-1 or 1/(2^1)) is a component of the fraction. So add 5 to the first byte. If the next bit is set, 0.25 (2^-2 or 1/(2^2)) is part of the fraction: add 5 to the second byte and add 2 to the first (oh, don't forget the carry, they happen - lower school math). The next bit set means 0.125, so add 5 to the third byte, 2 to the second and 1 to the first. And so on:

      value          string of binary 0s
start 0              0000000000000000000 ...
bit 1 0.5            5000000000000000000 ...
bit 2 0.25           7500000000000000000 ...
bit 3 0.125          8750000000000000000 ...
bit 4 0.0625         9375000000000000000 ...
bit 5 0.03125        9687500000000000000 ...
bit 6 0.015625       9843750000000000000 ...
bit 7 0.0078125      9921875000000000000 ...
bit 8 0.00390625     9960937500000000000 ...
bit 9 0.001953125    9980468750000000000 ...
...

I did this by hand so I may have missed something but to implement this in code is trivial.

So for all those SO "can't get an exact result using float" people who don't know what they're talking about here is proof that floating point fraction values are perfectly exact. Excruciatingly exact. But binary.

For those who take the time to get their heads around how this works, better precision is well within reach. As for the others ... well, I guess they'll keep on not browsing the fora for the answer to a question which has been answered numerous times previously, honestly believing they have discovered "broken floating point" (or whatever they call it), and post a new variant of the same question every day.

"Close to magic," "dark incantation" - that's hilarious!

Answer 6

As Agent_L's answer says, you're suffering from the wrong result caused by dividing the value by 10. float, like any binary floating point type, cannot represent most decimal fractions exactly. After the division, in most cases the result does not fit into the binary format, so it is rounded. Hence the more you divide, the more error you accumulate.

If the number is not very large, a quick solution would be to multiply it by 10 or a power of 10, depending on how many digits after the decimal point you need.

Another way was described here

Answer 7

This program will work for you.

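#include <stdio.h>

int main(void)
{
    float num;
    int z;

    if (scanf("%f", &num) != 1)
        return 1;
    /* The cast truncates toward zero; it is only defined while the
       result fits in an int, so it cannot handle values like FLT_MAX. */
    z = (int)num;
    printf("the integral part of the floating point number is %d\n", z);
    return 0;
}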


solution1 2 2012-06-06 10:34:35
solution2 2 2012-06-06 10:45:49
solution3 2 2012-06-06 11:17:47
solution4 1 ACCEPTED 2012-06-06 10:50:33
solution5 1 2012-06-07 18:48:44
solution6 0 2013-10-05 11:57:17
solution7 0 2018-08-12 05:25:11
