What data is written out when a higher precision is used to display a number than the one supported by the format?

The IEEE 754 double-precision floating-point format has a binary precision of 53 bits, which translates into log10(2^53) ≈ 15.95, or about 16 significant decimal digits.

If the double-precision format is used to store a floating-point number in a 64-bit word in memory, with 52 bits for the significand plus 1 hidden bit, but a larger precision is used to output the number to the screen, what data is actually read from memory and written to the output?

How can it even be read when the total length of the word is 64 bits? Does the machine's read-from-memory operation simply read more bits and interpret them as an extension of the significand?

For example, take the number 0.1. It does not have an exact binary floating-point representation regardless of the precision used, because its binary expansion repeats indefinitely in the significand.

If 0.1 is stored in double precision and printed to the screen with a precision greater than 16, like this in C++:

#include <iostream>
#include <iomanip>

using namespace std;

int main()
{
    double x = 0.1;
    // Request 50 significant digits, far more than the ~16 a double carries.
    cout << setprecision(50) << "x = " << x << endl;
}

The output (on my machine at the time of execution) is:

x = 0.1000000000000000055511151231257827021181583404541

If correct rounding is used with 2 guard bits and 1 sticky bit, can I trust the decimal values given by the first three non-zero binary floating-point digits in the error 5.551115123125783e-18?

Every binary fraction is exactly equal to some decimal fraction. If, as is usually the case, double is a binary floating-point type, each double has an exactly equal decimal representation.

For what follows, I am assuming your system uses IEEE 754 64-bit binary floating point to represent double. That is not required by the standard, but is very common. The closest number to 0.1 in that format has the exact value 0.1000000000000000055511151231257827021181583404541015625.

Although this number has a lot of digits, it is exactly equal to 3602879701896397/2^55. Multiplying both numerator and denominator by 5^55 converts it to a decimal fraction with denominator 10^55, while increasing the number of digits in the numerator.

One common approach, consistent with the result in the question, is to round to nearest at the number of digits requested by the format. That will indeed give useful information about the rounding error on conversion of a string to double.

