Printing float, preserving precision

I am writing a program that prints floating point literals to be used inside another program.

How many digits do I need to print in order to preserve the precision of the original float?

Since a float has 24 * (log(2) / log(10)) = 7.2247199 decimal digits of precision, my initial thought was that printing 8 digits should be enough. But if I'm unlucky, those 0.2247199 get distributed to the left and to the right of the 7 significant digits, so I should probably print 9 decimal digits.

Is my analysis correct? Are 9 decimal digits enough for all cases? Like printf("%.9g", x);?

Is there a standard function that converts a float to a string with the minimum number of decimal digits required for that value, in the cases where 7 or 8 are enough, so I don't print unnecessary digits?

Note: I cannot use hexadecimal floating point literals, because standard C++ does not support them.

In order to guarantee that a binary->decimal->binary roundtrip recovers the original binary value, IEEE 754 requires:


The original binary value will be preserved by converting to decimal and back again using:[10]

    5 decimal digits for binary16
    9 decimal digits for binary32
    17 decimal digits for binary64
    36 decimal digits for binary128

For other binary formats the required number of decimal digits is

    1 + ceiling(p*log10(2)) 

where p is the number of significant bits in the binary format, e.g. 24 bits for binary32.

In C, the functions you can use for these conversions are snprintf() and strtof/strtod/strtold().
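A minimal sketch of such a round trip for a single float, assuming float is IEEE 754 binary32 and that the C library's conversions are correctly rounded. The helper name roundtrips_with_9_digits is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Print x with 9 significant digits, parse the text back with strtof(),
 * and report whether the parsed value equals the original.
 * Returns 1 on success, 0 otherwise (NaN never compares equal). */
int roundtrips_with_9_digits(float x)
{
    char buf[32];                          /* "%.9g" output is far shorter than this */
    snprintf(buf, sizeof buf, "%.9g", x);  /* binary -> decimal */
    return strtof(buf, NULL) == x;         /* decimal -> binary */
}
```

Under those assumptions the function should return 1 for every finite float, which is what the 9-digit requirement promises.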

Of course, in some cases even more digits can be useful (no, they are not always "noise", depending on the implementation of the decimal conversion routines such as snprintf()). Consider e.g. printing dyadic fractions.

24 * (log(2) / log(10)) = 7.2247199

That's pretty representative of the problem. It makes no sense whatsoever to express the number of significant digits with an accuracy of 0.0000001 digits. You are converting numbers to text for the benefit of a human, not a machine. A human couldn't care less, and would much prefer it if you wrote

24 * (log(2) / log(10)) = 7

Trying to display 8 significant digits just generates random noise digits, and there are non-zero odds that 7 is already too many, because floating point error accumulates in calculations. Above all, print numbers using a reasonable unit of measure. People are interested in millimeters, grams, pounds, inches, et cetera. No architect will care about the size of a window expressed more accurately than 1 mm. No window manufacturing plant will promise a window sized as accurately as that.

Last but not least, you cannot ignore the accuracy of the numbers you feed into your program. Measuring the speed of an unladen European swallow down to 7 digits is not possible. It is roughly 11 meters per second, 2 digits at best. So performing calculations on that speed and printing a result that has more significant digits produces nonsensical results that promise accuracy that isn't there.

If the program is meant to be read by a computer, I would do the simple trick of using char* aliasing.

  • alias float* to char*
  • copy into an unsigned (or whatever unsigned type is sufficiently large) via char* aliasing
  • print the unsigned value

Decoding is just reversing the process (and on most platforms a direct reinterpret_cast can be used).
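A minimal C sketch of the idea, using memcpy() for the byte-wise copy rather than an explicit char* loop. It assumes float is 32 bits wide; the helper names float_to_bits and bits_to_float are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: sizeof(float) == sizeof(uint32_t), which holds on common
 * platforms but is not guaranteed by the C standard. */

/* Copy the object representation of a float into an unsigned integer. */
uint32_t float_to_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Reverse the process: rebuild the float from its stored bit pattern. */
float bits_to_float(uint32_t bits)
{
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

The integer can then be printed with something like printf("%" PRIu32, float_to_bits(x)) from <inttypes.h>. Note that the printed number is only meaningful to a reader that uses the same float representation and byte order.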

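If you have a C library that conforms to C99 (and if your float types have a base that is a power of 2 :), the printf format character %a can print floating point values in hexadecimal form without loss of precision, and utilities such as scanf and strtod will be able to read them.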
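A short sketch of that round trip, assuming a C99 library. The helper name roundtrips_with_hex is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Format x with %a (hexadecimal floating point), read it back with
 * strtof(), and report whether the value survived. Returns 1 on success. */
int roundtrips_with_hex(float x)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%a", x);   /* x is promoted to double, which is exact */
    return strtof(buf, NULL) == x;
}
```

Because the hexadecimal form writes the binary significand directly, no decimal rounding is involved in either direction.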

The floating-point-to-decimal conversion used in Java is guaranteed to produce the least number of decimal digits beyond the decimal point needed to distinguish the number from its neighbors (more or less).

You can copy the algorithm from here: http://www.docjar.com/html/api/sun/misc/FloatingDecimal.java.html (pay attention to the FloatingDecimal(float) constructor and the toJavaFormatString() method).

If you read these papers (see below), you'll find that there are some algorithms that print the minimum number of decimal digits such that the number can be re-interpreted unchanged (i.e. by scanf).

Since there might be several such numbers, the algorithms also pick the decimal fraction nearest to the original binary fraction (what I called the float value).

A pity that there's no such standard library in C.
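In the absence of such a library function, a brute-force approximation is to try increasing precisions until the text reads back unchanged. This is only a sketch, not one of the algorithms from the papers: it assumes correctly rounded snprintf() and strtof(), and the name shortest_float_string is invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write into buf the shortest "%.Ng" rendering of x (N = 1..9) that
 * strtof() converts back to exactly x, and return N.
 * Falls back to 9 digits, which also covers NaN (never equal to itself). */
int shortest_float_string(float x, char *buf, size_t size)
{
    int digits;
    for (digits = 1; digits < 9; digits++) {
        snprintf(buf, size, "%.*g", digits, x);
        if (strtof(buf, NULL) == x)
            return digits;
    }
    snprintf(buf, size, "%.9g", x);
    return 9;
}
```

With a 32-byte buffer, 0.5f should come out as "0.5" with one digit and 123.45f as "123.45" with five. The loop costs up to nine conversions per value, so the dedicated algorithms are the better choice when speed matters.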

You can use sprintf. I am not sure whether this answers your question exactly, but anyway, here is the sample code:

#include <stdio.h>
int main( void )
{
    float d_n = 123.45f;
    char s_cp[13] = { '\0' };
    char s_cnp[4] = { '\0' };
    /*
    * with sprintf you need to make sure there's enough space
    * declared in the array
    */
    sprintf( s_cp, "%.2f", d_n );
    printf( "%s\n", s_cp );
    /*
    * snprintf lets you control how much is written to the array:
    * at most size - 1 characters plus the terminating '\0'.
    * it might have portability issues if you are not using C99
    */
    snprintf( s_cnp, sizeof s_cnp, "%f", d_n );
    printf( "%s\n", s_cnp );
    getchar(); /* wait for input so a console window stays open */
    return 0;
}
/* output :
* 123.45
* 123
*/

With something like

def f(a):
    b=0
    while a != int(a): a*=2; b+=1
    return a, b

(which is Python) you should be able to get mantissa and exponent in a loss-free way.

In C, this would probably be

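#include <math.h>

struct float_decomp {
    float mantissa;
    int exponent;
};

/* Double x until it is an integer, so that x == mantissa / 2^exponent. */
struct float_decomp decomp(float x)
{
    struct float_decomp ret = { .mantissa = x, .exponent = 0 };
    /* isfinite() stops the loop for NaN, which never equals its floor */
    while (isfinite(ret.mantissa) && ret.mantissa != floorf(ret.mantissa)) {
        ret.mantissa *= 2;
        ret.exponent += 1;
    }
    return ret;
}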

But be aware that not all values can be represented in that way. It is just a quick shot that should give the idea, and it probably needs improvement.
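As an alternative to the hand-written loop, the standard C library already provides a loss-free split in <math.h>: frexpf() returns a fraction and a power-of-two exponent, and ldexpf() puts them back together. The sketch below uses those functions instead of the answer's own method, and the helper name frexp_roundtrips is hypothetical.

```c
#include <math.h>

/* Split x into fraction and exponent with frexpf(), rebuild it with
 * ldexpf(), and report whether the original value comes back.
 * For finite non-zero x: x == fraction * 2^exponent, 0.5 <= |fraction| < 1. */
int frexp_roundtrips(float x)
{
    int exponent;
    float fraction = frexpf(x, &exponent);
    return ldexpf(fraction, exponent) == x;
}
```

For example, frexpf(8.0f, &e) yields 0.5 with e set to 4. Both steps only adjust the binary exponent, so no rounding takes place for finite values.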
