
Slightly inaccurate results when converting floating point binary to decimal

What it should do: input 101.101, output 5.625.

This is how I wrote a floating-point binary-to-decimal converter, but there is a small error: the output is not accurate to the expected number of decimal places.

What my code does: input 101.101, output 5.624985.

What my code does after I changed the count limit from -16 to -32:

Input 101.101, output 5.625000. This is correct.

Input 101.111, output 5.875061. This is still off; it should be 5.875.

#include <stdio.h>
#include <math.h>   // needed for pow()

double decimal(double decpart);
long int integer(long int intpart);

int main(int argc, const char * argv[])
{

    double x;
    scanf("%lf", &x);
    long int intpart = (long int)x;
    double decpart = x-intpart;

    double finint = integer(intpart);
    double findec = decimal(decpart);

    double finnum = findec + finint;
    printf("%lf\n",finnum);

    return 0;
}

long int integer(long int intpart)
{
    double sum = 0;
    long int a, b, p= 0;

while(intpart>0)
{
    a = intpart % 10;
    b = a*(pow(2, p));
    sum = sum + b;
    p++;
    intpart = intpart / 10;
}
    return sum;
}

double decimal(double decpart)
{
    double sum = 0;
    int count = 0;
    while (decpart > 0 && count > -32)
    {
        count--;
        decpart = decpart*10;
        if (decpart >= 1)
        {
            decpart = decpart - 1;
            sum = sum + pow(2, count);
        }
    }

    return sum;
}

The inaccuracy is a rounding error accumulated from the pow function, which can carry a small error even for integer arguments. This is because pow(x, y) is often implemented via the mathematical identity exp(log(x) * y), where log and exp use the natural base 2.718281828... Thus, even when the base is 2, log(2) is only an approximation, so exp(log(2) * y) is an approximation of an approximation.

In your situation, rather than using count and pow, you can keep a double variable named value that starts at 0.5 and is multiplied by 0.5 after each iteration:

double decimal(double decpart)
{
    double sum = 0;
    double value = 0.5;
    while (decpart > 0 && value > 1.0e-5) // approx. 2 ^ -16
    {
        decpart = decpart*10;
        printf("%lf\n",decpart);
        if (decpart > 1)
        {
            decpart = decpart - 1;
            sum = sum + value;
        }
        value *= 0.5;
    }

    return sum;
}

In general, this will be more accurate than the pow alternative. On IEEE-754 compliant systems (as most modern systems are), value should always hold exactly the value you want.
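As a quick check of that claim, here is a minimal sketch that builds 2^-k by repeated halving and compares it with the same power obtained by dividing 1.0 by an integer power of two. The helper names halved_power and halving_is_exact are made up for illustration, and the comparison assumes binary (IEEE-754 style) doubles.

```c
/* Hypothetical helper: returns 2^-k by multiplying by 0.5 k times,
   the same way `value` is updated in the loop above. */
double halved_power(int k)
{
    double value = 1.0;
    for (int i = 0; i < k; i++)
        value *= 0.5;   /* only the exponent changes, so no rounding occurs */
    return value;
}

/* Returns 1 if repeated halving matches 1 / 2^k for every k in 1..max_k.
   max_k must be at most 62 so that the shift stays within 64 bits. */
int halving_is_exact(int max_k)
{
    for (int k = 1; k <= max_k; k++)
        if (halved_power(k) != 1.0 / (double)(1ULL << k))
            return 0;
    return 1;
}
```

Calling halving_is_exact(62) covers far more bits than the loop above ever uses.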


Further, as others and I have mentioned, using scanf to read the input as a double instead of a string also leads to inaccuracies, because numbers like 0.1 cannot be stored exactly in binary floating point. Instead, you should read the input into a char array and then parse the string.
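To see the effect on the question's own input, this small sketch splits a double the same way the question's main() does and prints the fractional part with 17 significant digits. The function names are hypothetical, and the exact digits printed depend on the platform's double format, so no output is quoted here.

```c
#include <stdio.h>

/* Hypothetical helper: the fractional part as computed in the question's main(). */
double fractional_part(double x)
{
    long int intpart = (long int)x;
    return x - intpart;
}

/* Prints the fractional part with enough digits to expose the stored value. */
void show_fraction(double x)
{
    printf("%.17g\n", fractional_part(x));
}
```

With typical 64-bit doubles, show_fraction(101.101) should print something close to, but not exactly, 0.101, whereas show_fraction(5.625) prints 0.625 because 5.625 is exactly representable in binary.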

The problem is -16, which gives a resolution of only 1 part in 65,536 (0.0000153...). The answer you get and the answer you want are within that range of each other. Instead, you need a more negative limit such as -32 or -53 (or about log2(DBL_EPSILON)).
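A rough way to pick the limit is to count how many halvings are needed before the bit weight drops to a chosen tolerance. The sketch below does that; fraction_bits_for is a made-up name, not a library function.

```c
#include <float.h>

/* Hypothetical helper: number of binary fraction digits needed so that the
   last digit's weight, 2^-n, is no larger than `tolerance`.
   Returns -1 for a non-positive tolerance. */
int fraction_bits_for(double tolerance)
{
    if (tolerance <= 0.0)
        return -1;

    int n = 0;
    double weight = 1.0;
    while (weight > tolerance) {
        weight *= 0.5;
        n++;
    }
    return n;
}
```

For example, fraction_bits_for(1.0 / 65536.0) gives 16, and on a platform with 64-bit IEEE-754 doubles fraction_bits_for(DBL_EPSILON) gives 52, which is where the figure of about -53 comes from.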

[Edit2] Values like -17, -18, etc. have additional problems. See below.

Also, change if (decpart > 1) to if (decpart >= 1).


[Edit]

Per the C spec, DBL_MIN_10_EXP is at most -37, so with typical binary floating point a reasonable pow(2, count) will provide exact answers for count in the range -80 to +80.

Your method of reading a decimal number and treating it like a binary FP number likely breaks down once N significant digits are entered ("101.101" has 6). Expect N to be something like log10(1/DBL_EPSILON), or at least 8 or 9 digits. To get beyond that limit, I suggest following @Drew McGowen's advice: read and process your input as a string.

[Edit2]

Given a typical double, the limit of N significant digits is about 16 or 17. Not only does this limit the input, it also limits the number of useful iterations of while (decpart > 0 && count > -16). Going much deeper than that, the string-to-FP conversion of "101.111" (which is really more like 101.111000000000004...) yields unexpected results: the loop treats it as the binary number 101.111000000000001111111...

(Mathematically correct: 101 (binary, i.e. 5) + 1*1/2 + 1*1/4 + 1*1/8 + 1*pow(2,-15) + 1*pow(2,-16) + ... = 5.875061)
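The stray low-order bits can be made visible by replaying the question's decimal() loop and recording each decision instead of summing. This is only an illustrative sketch; trace_fraction_bits is a made-up helper and the caller supplies the buffer.

```c
/* Replays the question's decimal() loop and records which bits it sets.
   `bits` must have room for max_bits + 1 characters. */
void trace_fraction_bits(double decpart, int max_bits, char *bits)
{
    int i = 0;
    while (decpart > 0 && i < max_bits) {
        decpart = decpart * 10;
        if (decpart >= 1) {
            decpart = decpart - 1;   /* same step as the question's code */
            bits[i] = '1';
        } else {
            bits[i] = '0';
        }
        i++;
    }
    bits[i] = '\0';
}
```

Passing 101.111 - 101 with max_bits set to 16 and a 17-character buffer yields a string that begins with the intended 111 and, on a typical double, picks up extra 1 digits near the end, matching the pow(2,-15) and pow(2,-16) terms above.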

So, iterating decimal() more than log10(1/DBL_EPSILON) times, or about 15 or 16, begins to generate garbage. Yet code iterating only 16 times provides a precision of just 1 part in 65,536 (0.000015...). Therefore, to get better answers than that, a new approach is needed, such as parsing a string (as suggested by @Drew McGowen; the code below is inspired by @BLUEPIXY).

double BinaryFloatinPoint(const char *s) {
  double sum = 0.0;
  double power = 1.0;
  char dp = '.';  // binary radix point
  while (*s) {
    if (*s == '0' || *s == '1') {
      sum *= 2.0;
      sum += *s - '0';
      power *= 0.5;
    } else if (*s == dp) {
      dp = '0';
      power = 1.0;
    } else {
      return 0.0; // Unexpected char, maybe return NAN instead
    }
    s++;
  }
  if (dp == '0') {  // If dp found ...
    sum *= power;
  }
  return sum;
}
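For comparison, here is a hedged variant of the same idea that is not taken from any of the answers: it collects all digits into an unsigned integer, counts the digits after the radix point, and divides once by the matching power of two. The name parse_binary_fixed and the 53-digit cap are assumptions made for this sketch; the cap keeps the integer exactly representable in a typical double.

```c
#include <math.h>     /* NAN */
#include <stdint.h>

/* Hypothetical variant: accumulate the binary digits in an integer, then
   scale once by 2^-frac_bits. Returns NAN on invalid or overlong input. */
double parse_binary_fixed(const char *s)
{
    uint64_t mantissa = 0;
    int total_bits = 0, frac_bits = 0, seen_point = 0;

    if (*s == '\0')
        return NAN;
    for (; *s; s++) {
        if (*s == '0' || *s == '1') {
            if (++total_bits > 53)      /* more digits than a double can hold exactly */
                return NAN;
            mantissa = mantissa * 2 + (uint64_t)(*s - '0');
            if (seen_point)
                frac_bits++;
        } else if (*s == '.' && !seen_point) {
            seen_point = 1;
        } else {
            return NAN;                 /* unexpected character or second '.' */
        }
    }
    /* mantissa < 2^53 and the divisor is a power of two, so the division is exact. */
    return (double)mantissa / (double)(UINT64_C(1) << frac_bits);
}
```

The single division at the end means no intermediate rounding can accumulate, at the cost of rejecting inputs longer than 53 binary digits.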

A number such as 0.1 (base 10) cannot be represented exactly, because it is an infinitely repeating fraction in binary.
So I suggest that you process the input as a string.

#include <stdio.h>
#include <string.h>

double bstrtod(const char *bstr){
    double x = 0, one = 1.0;
    char *p = strchr(bstr, '.');
    if(p){
        char *fp = p;
        while(*++fp){
            one /= 2.0;
            if(*fp=='1')
                x += one;
        }
    } else {
        p = strchr(bstr, '\0');
    }
    one = 1.0;
    while(p != bstr){   // walk left through the integer digits; also safe for ".101" or ""
        if(*--p == '1')
            x += one;
        one *= 2.0;
    }

    return x;
}

int main(void){
    double x = bstrtod("101.101");
    printf("%f\n", x);//5.625000
    return 0;
}

This inaccuracy comes from rounding errors. Even if you input a value such as 0.1, what is actually stored is only the nearest representable binary value, roughly 0.1000000000000000055... That is why the problem occurs.

Accurate Floating Point Binary to Decimal Converter

This code uses strings to convert a binary input to a decimal output.

#include <stdio.h>
#include <string.h>

double blembo(const char *blem);

int main(void)
{
    char input[50];
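    if (fgets(input, sizeof input, stdin) == NULL)  // gets() is unsafe and was removed in C11
        return 1;
    input[strcspn(input, "\n")] = '\0';             // drop the trailing newline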

    int t = 0, flag = 0;
    while (input[t] != '\0')  // This whole block of code invalidates "hello" or "921" or "1.1.0" inputs
    {
        if(input[t] == '0' || input[t] == '1' || input[t] == '.')
        {
            if(input[t] == '.')
            {
                if(flag != 1)
                {
                    flag = 1;
                    t++;
                    continue;
                }
                else
                {
                    printf("\nIncorrect input\n");
                    return 0;
                }
            }
            else
            {
                t++;
                continue;
            }
        }
        else
        {
            printf("\nIncorrect input\n");
            return 0;
        }
    }

    double output = blembo(input);
    printf("%lf\n", output);
    return 0;
}

double blembo (const char *blem)
{
    double x=0, one = 1.0;
    char *p , *fp;
    p = strchr(blem, '.');
    if(p)
    {
        fp = p;
        while(*++fp)
        {
            one = one / 2.0;
            if(*fp == '1')
            {
                x = x + one;
            }
        }
    }
    else
    {
        p = strchr(blem, '\0');
    }

    one = 1.0;
    while(p != blem)  // walk left through the integer digits; also safe for ".101" or ""
    {
        if(*--p == '1')
        {
            x = x + one;
        }

        one = one * 2.0;
    }

    return x;
}
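The validation loop in main above could also be written more compactly with the standard string functions. The following is one possible sketch under the same rules (only '0', '1' and at most one '.'), plus an extra assumption that at least one digit is required; is_binary_literal is a hypothetical name.

```c
#include <string.h>

/* Returns 1 if s contains only '0', '1' and at most one '.', with at least
   one digit; returns 0 otherwise. */
int is_binary_literal(const char *s)
{
    size_t len = strlen(s);
    const char *point = strchr(s, '.');

    if (len == 0 || strspn(s, "01.") != len)
        return 0;                       /* empty, or contains another character */
    if (point != NULL && strchr(point + 1, '.') != NULL)
        return 0;                       /* more than one radix point */
    return strpbrk(s, "01") != NULL;    /* require at least one digit */
}
```

With this helper, main would only need to call is_binary_literal(input) before passing the string to blembo.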
