Slightly inaccurate results when converting floating point binary to decimal
What it should do: input 101.101, output 5.625.
This is how I wrote a floating point binary-to-decimal converter, but there is a small error: the output is not accurate to the expected number of decimal places.
What my code does: input 101.101, output 5.624985.
What my code does after I changed the count limit from -16 to -32:
Input 101.101, output 5.625000. This is correct.
Input 101.111, output 5.875061. This is still off; it should be 5.875.
#include <stdio.h>
#include <math.h> /* for pow() */

double decimal(double decpart);
long int integer(long int intpart);

int main(int argc, const char * argv[])
{
    double x;
    scanf("%lf", &x);

    long int intpart = (long int)x;
    double decpart = x-intpart;
    double finint = integer(intpart);
    double findec = decimal(decpart);
    double finnum = findec + finint;

    printf("%lf\n",finnum);
    return 0;
}

long int integer(long int intpart)
{
    double sum = 0;
    long int a, b, p= 0;
    while(intpart>0)
    {
        a = intpart % 10;
        b = a*(pow(2, p));
        sum = sum + b;
        p++;
        intpart = intpart / 10;
    }
    return sum;
}

double decimal(double decpart)
{
    double sum = 0;
    int count = 0;
    while (decpart > 0 && count > -32)
    {
        count--;
        decpart = decpart*10;
        if (decpart >= 1)
        {
            decpart = decpart - 1;
            sum = sum + pow(2, count);
        }
    }
    return sum;
}
The inaccuracy is a rounding error built up from the pow function, which can have a small error even for integer arguments. This is because pow(x, y) is often implemented based on the mathematical identity exp(log(x) * y), where log and exp use the natural base 2.718281828... Thus, even when the base is 2, log(2) is an approximation, so exp(log(2)) is an approximation of an approximation.
In your situation, rather than using count and pow, you can keep a double variable, value, that starts at 0.5 and is multiplied by 0.5 after each iteration:
double decimal(double decpart)
{
    double sum = 0;
    double value = 0.5; // weight of the current binary digit: 1/2, 1/4, ...
    while (decpart > 0 && value > 1.0e-5) // approx. 2 ^ -16
    {
        decpart = decpart*10;
        if (decpart >= 1) // >= so that a digit that lands on exactly 1 is counted
        {
            decpart = decpart - 1;
            sum = sum + value;
        }
        value *= 0.5;
    }
    return sum;
}
In general, this will be more accurate than the pow alternative. On IEEE-754 compliant systems (most modern systems are), value should always be exactly the value you want, because halving a power of two only changes its exponent.
Further, as I and others have mentioned, using scanf to read the input as a double instead of a string also leads to inaccuracies, as numbers like 0.1 often cannot be stored exactly. Instead, you should read the input into a char array, then parse the string.
The problem is -16, which gives a resolution of only 1 part in 65,536 (0.0000153...). The answer you get and the answer you want agree to within that range. Instead, you need a more negative limit such as -32 or -53 (or about log2(DBL_EPSILON)).
[Edit2] Values like -17, -18, etc. have additional problems. See below.
Also, the test if (decpart > 1) should be if (decpart >= 1).
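To put numbers on those limits, here is a small sketch (the function names are illustrative, not from the answer) that computes the resolution of an n-bit binary fraction and the binary exponent of DBL_EPSILON. It assumes the usual IEEE-754 binary64 double.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Smallest step representable with `bits` binary fraction digits: 2^-bits. */
double fraction_resolution(int bits)
{
    assert(bits >= 0);
    return ldexp(1.0, -bits);
}

/* Binary exponent e such that DBL_EPSILON == 2^e. */
int epsilon_exponent(void)
{
    int e;
    double m = frexp(DBL_EPSILON, &e); /* DBL_EPSILON == m * 2^e, with 0.5 <= m < 1 */
    assert(m == 0.5);                  /* DBL_EPSILON is a power of two */
    return e - 1;
}
```

With 16 bits the step is 1/65536, about 0.0000153, which is coarser than the six decimal places that printf("%lf") prints. With 32 bits it is about 2.3e-10.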
[Edit]
Per the C spec, with DBL_MIN_10_EXP at most -37 and typical binary floating point, a reasonable pow(2, count) will provide exact answers for count in the range -80 to +80.
Your method of reading a decimal number and treating it like a binary FP number likely breaks down once N significant digits are entered ("101.101" has 6). Expect N to be something like log10(1/DBL_EPSILON), or at least 8 or 9 digits. To get beyond that limit, I suggest following @Drew McGowen's advice: read and process your input as a string.
[Edit2]
Given a typical double, the limit of N significant digits is about 16 or 17. Not only does this limit the input, it also limits the number of useful iterations in while (decpart > 0 && count > -16). Going much deeper than that, the string to FP conversion of "101.111" (which is stored as something more like 101.111000000000004...) yields unexpected results, and the input acts like 101.111000000000001111111...
(Mathematically correct: 101 (binary) + 1*1/2 + 1*1/4 + 1*1/8 + 1*pow(2,-15) + 1*pow(2,-16) + ... = 5.875061)
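The effect described above can be reproduced without pow at all. The sketch below re-implements the question's digit-peeling loop with ldexp in place of pow, so any stray bits can only come from the decimal input being stored inexactly. The name peel_fraction and the max_bits parameter are introduced here for illustration, and IEEE-754 double arithmetic is assumed.

```c
#include <assert.h>
#include <math.h>

/* Reads the decimal digits of decpart as binary fraction digits,
   looking at no more than max_bits of them. */
double peel_fraction(double decpart, int max_bits)
{
    assert(max_bits >= 0);
    double sum = 0.0;
    for (int bit = 1; decpart > 0 && bit <= max_bits; bit++) {
        decpart *= 10;               /* shift the next decimal digit above the point */
        if (decpart >= 1) {
            decpart -= 1;            /* treat it as a 1 bit, as the question's code does */
            sum += ldexp(1.0, -bit); /* exact 2^-bit */
        }
    }
    return sum;
}
```

On a typical IEEE-754 system, peel_fraction(101.111 - 101, 32) comes out slightly above 0.875 and peel_fraction(101.101 - 101, 16) slightly below 0.625, in line with the 5.875061 and 5.624985 reported in the question, even though every power of two here is exact.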
So, iterating in decimal() more than log10(1/DBL_EPSILON) times, or about 15 to 16 times, begins to generate garbage. Yet code iterating 16 times only provides a precision of 1 part in 65,536 (0.000015...). Therefore, to get better answers than that, a new approach is needed, such as parsing a string (suggested by @Drew McGowen; the code below was inspired by @BLUEPIXY).
double BinaryFloatinPoint(const char *s) {
    double sum = 0.0;
    double power = 1.0;
    char dp = '.'; // binary radix point
    while (*s) {
        if (*s == '0' || *s == '1') {
            sum *= 2.0;
            sum += *s - '0';
            power *= 0.5;
        } else if (*s == dp) {
            dp = '0';
            power = 1.0;
        } else {
            return 0.0; // Unexpected char, maybe return NAN instead
        }
        s++;
    }
    if (dp == '0') { // If dp found ...
        sum *= power;
    }
    return sum;
}
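For comparison, here is another way to do the string-based conversion, offered as a sketch rather than as part of the answer above: collect the bits in an unsigned 64-bit integer and apply the binary point in one step with ldexp. The name parse_binary and the choice to return NAN on malformed input are assumptions, and the integer accumulator limits it to 64 binary digits in total.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts a string such as "101.101" (binary) to a double.
   Returns NAN for empty input, stray characters, a second '.', or more than 64 digits. */
double parse_binary(const char *s)
{
    assert(s != NULL);
    uint64_t bits = 0;       /* all digits, read as one binary integer */
    int total_digits = 0;
    int fraction_digits = 0; /* how many of them came after the point */
    int seen_point = 0;

    for (; *s != '\0'; s++) {
        if (*s == '0' || *s == '1') {
            if (++total_digits > 64) {
                return NAN;  /* would overflow the accumulator */
            }
            bits = (bits << 1) | (uint64_t)(*s - '0');
            if (seen_point) {
                fraction_digits++;
            }
        } else if (*s == '.' && !seen_point) {
            seen_point = 1;
        } else {
            return NAN;
        }
    }
    if (total_digits == 0) {
        return NAN;
    }
    /* Scaling by a power of two is exact; the only rounding happens in the
       conversion to double when more than 53 significant bits were given. */
    return ldexp((double)bits, -fraction_digits);
}
```

For "101.101" the accumulator holds 45 and there are 3 fraction digits, so the result is 45 / 8 = 5.625.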
A number such as 0.1 (base 10) cannot be represented exactly, because it is an infinitely repeating fraction in binary.
So I suggest that you read the input as a string and convert that.
#include <stdio.h>
#include <string.h>

double bstrtod(const char *bstr){
    double x = 0, one = 1.0;
    const char *p = strchr(bstr, '.');
    if(p){
        // fraction part: digits after '.' weigh 1/2, 1/4, ...
        const char *fp = p;
        while(*++fp){
            one /= 2.0;
            if(*fp=='1')
                x += one;
        }
    } else {
        p = strchr(bstr, '\0');
    }
    // integer part: walk right to left; the while form is safe when it is empty (e.g. ".101")
    one = 1.0;
    while(p!=bstr){
        if(*--p == '1')
            x += one;
        one *= 2.0;
    }
    return x;
}

int main(void){
    double x = bstrtod("101.101");
    printf("%f\n", x);//5.625000
    return 0;
}
This inaccuracy comes from rounding. A decimal input such as 101.111 cannot be stored exactly in a double, so a slightly different value is what actually gets stored. That is why the problem occurs.
Accurate Floating Point Binary to Decimal Converter
This code uses strings to convert a binary input to a decimal output.
#include <stdio.h>
#include <string.h>

double blembo(const char *blem);

int main(void)
{
    char input[50];
    if (fgets(input, sizeof input, stdin) == NULL) // fgets instead of gets, which cannot limit the input length
    {
        printf("\nIncorrect input\n");
        return 0;
    }
    input[strcspn(input, "\n")] = '\0'; // drop the trailing newline kept by fgets

    int t = 0, flag = 0;
    while (input[t] != '\0') // This whole block of code invalidates "hello" or "921" or "1.1.0" inputs
    {
        if(input[t] == '0' || input[t] == '1' || input[t] == '.')
        {
            if(input[t] == '.')
            {
                if(flag != 1)
                {
                    flag = 1;
                    t++;
                    continue;
                }
                else
                {
                    printf("\nIncorrect input\n");
                    return 0;
                }
            }
            else
            {
                t++;
                continue;
            }
        }
        else
        {
            printf("\nIncorrect input\n");
            return 0;
        }
    }

    double output = blembo(input);
    printf("%lf\n", output);
    return 0;
}

double blembo (const char *blem)
{
    double x=0, one = 1.0;
    const char *p , *fp;
    p = strchr(blem, '.');
    if(p)
    {
        // fraction part: digits after '.' weigh 1/2, 1/4, ...
        fp = p;
        while(*++fp)
        {
            one = one / 2.0;
            if(*fp == '1')
            {
                x = x + one;
            }
        }
    }
    else
    {
        p = strchr(blem, '\0');
    }
    // integer part: walk right to left; the while form is safe when it is empty (e.g. ".101")
    one = 1.0;
    while(p!=blem)
    {
        if(*--p == '1')
        {
            x = x + one;
        }
        one = one * 2.0;
    }
    return x;
}
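Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you reproduce this post, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.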