What Are the Maximum Number of Base-10 Digits in the Fractional Part of a Floating Point Number

If a floating point number could be output with no truncation of its value (say with setprecision) and in fixed notation (say with fixed), what buffer size would be required to guarantee that the entire fractional part of the number could be stored in the buffer?

I'm hoping there is something in the standard, like a #define or something in numeric_limits, which would tell me the maximum base-10 place value of the fractional part of a floating point type.

I asked about the maximum number of base-10 digits in the integral part of a floating point type here: What Are the Maximum Number of Base-10 Digits in the Integral Part of a Floating Point Number

But I realize this may be more complex. For example, 1.0 / 3.0 is an infinitely repeating series of digits. When I output that using fixed formatting I get this many places before repeating 0s:

0.333333333333333314829616256247390992939472198486328125

But I can't necessarily say that's the maximum precision, because I don't know how many of those trailing 0s were actually represented in the floating point's fraction, and it hasn't been shifted down by a negative exponent.

I know we have min_exponent10. Is that what I should be looking at for this?

If you consider the 32-bit and 64-bit IEEE 754 formats, it can be calculated as described below.

It is all about negative powers of 2. So let's see how each exponent contributes:

2^-1 = 0.5         i.e. 1 digit
2^-2 = 0.25        i.e. 2 digits
2^-3 = 0.125       i.e. 3 digits
2^-4 = 0.0625      i.e. 4 digits
....
2^-N = 0.0000..    i.e. N digits

Because the base-10 expansion of a negative power of 2 always ends in 5, halving it always adds one more digit: the number of base-10 digits increases by 1 each time the exponent decreases by 1. So 2^(-N) requires N digits after the decimal point.
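The one-digit-per-halving pattern can be checked mechanically. The following is a minimal sketch, not part of the original answer; the helper name fraction_digits_of_pow2 is made up for illustration. It halves a decimal digit string repeatedly, so no floating point rounding is involved.

```cpp
#include <string>

// Returns the digits after the decimal point of 2^-n (for n >= 1), exactly.
// Hypothetical helper for illustration only.
std::string fraction_digits_of_pow2(int n) {
    std::string digits = "5";                 // 2^-1 = 0.5
    for (int i = 1; i < n; ++i) {
        std::string half;
        int carry = 0;
        for (char c : digits) {               // schoolbook division by 2
            int cur = carry * 10 + (c - '0');
            half.push_back(static_cast<char>('0' + cur / 2));
            carry = cur % 2;
        }
        if (carry) half.push_back('5');       // remainder 1 becomes a new final 5
        digits = half;
    }
    return digits;
}
```

Calling fraction_digits_of_pow2(4) yields "0625", and the returned string has length n for every n, which is the claim made above.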

Also notice that when adding those contributions, the number of resulting digits is determined by the smallest term. So what you need to find is the smallest exponent that can contribute.

For 32-bit IEEE 754 you have:

Smallest exponent (normal numbers): -126

Fraction bits: 23

So the smallest exponent is -126 + -23 = -149, and the smallest contribution comes from 2^-149, i.e.:

A 32-bit IEEE 754 number printed in base 10 can have up to 149 fractional digits.

For 64-bit IEEE 754 you have:

Smallest exponent (normal numbers): -1022

Fraction bits: 52

So the smallest exponent is -1022 + -52 = -1074, and the smallest contribution comes from 2^-1074, i.e.:

A 64-bit IEEE 754 number printed in base 10 can have up to 1074 fractional digits.
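These two figures can be read off the implementation instead of being hard-coded. The sketch below is an illustration, not part of the original answer: the function names are invented, and it assumes a radix-2 type with subnormals enabled (no flush-to-zero mode), as on typical IEEE 754 platforms.

```cpp
#include <cmath>
#include <limits>

// Exponent e such that the smallest positive value of T equals 2^e.
// Assumption: T is a binary type whose subnormals are not flushed to zero.
template <typename T>
int smallest_bit_exponent() {
    static_assert(std::numeric_limits<T>::radix == 2, "sketch assumes radix 2");
    return std::ilogb(std::numeric_limits<T>::denorm_min());
}

// Digits after the decimal point needed to print any value of T exactly,
// using the rule that 2^-N needs N fractional digits.
template <typename T>
int max_fraction_digits_binary() {
    return -smallest_bit_exponent<T>();
}
```

On an IEEE 754 platform, max_fraction_digits_binary<float>() should give 149 and max_fraction_digits_binary<double>() should give 1074.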

I'm reasonably certain the standard doesn't (and can't, without imposing other restrictions) provide a pre-defined constant to specify the number you're asking for.

Floating point is most often represented in base 2, but base 16 and base 10 are also in reasonably wide use.

In all of these cases, the only factors in the base (2 and possibly 5) are also factors of 10. As a result, we never get an infinitely repeating number when converting from them to base 10 (decimal).

The standards don't restrict floating point to such representations though. In theory, if somebody really wanted to, they could use (for example) base 3 or base 7 for their floating point representation. If they did so, it would be trivial to store a number that would repeat indefinitely when converted to decimal. For example, 0.1 in base 3 would represent 1/3, which repeats infinitely when converted to base 10. Although I've never heard of anybody doing it, I believe such an implementation could meet the requirements of the standard.

For a typical binary representation, min_exponent should probably be a reasonable proxy for the value you want. Unfortunately, it's probably not possible to state things much more precisely than that.

For example, an implementation is allowed to hold intermediate values at greater precision than it stores in memory, so it's possible that (for example) if you give 1.0/3.0 literally in your source code, the result could actually differ from the value produced by reading a pair of inputs at run time, entering 1 and 3, and dividing them. In the former case, the division might be carried out at compile time, so the result you printed out would be exactly the size of a double, with no extra. When you enter the two values at run time, the division would be carried out at run time, and you might get a result with higher precision.
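Whether an implementation evaluates intermediates at extra precision is reported by the FLT_EVAL_METHOD macro in <cfloat>. The following is a small sketch of how one might inspect it; the function name describe_eval_method is hypothetical.

```cpp
#include <cfloat>
#include <string>

// Describes how the implementation evaluates floating point expressions,
// based on the standard FLT_EVAL_METHOD macro.
std::string describe_eval_method() {
    switch (FLT_EVAL_METHOD) {
        case 0:  return "operations are evaluated in the operand's own type";
        case 1:  return "float and double operations are evaluated as double";
        case 2:  return "all operations are evaluated as long double";
        default: return "indeterminate or implementation-defined";
    }
}
```

A result other than the first case suggests that run-time intermediates may carry more precision than a stored double, which is the situation described above.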

The standard does also require that the base of the floating point be documented as std::numeric_limits<T>::radix. Based on this, you could compute an approximation of the maximum number of places after the decimal point from radix and min_exponent, as long as the prime factors of the radix are shared with the prime factors of 10.
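One way such a computation might look is sketched below. This is an illustration rather than anything the standard provides: the function names are invented, and besides radix and min_exponent it also uses numeric_limits<T>::digits, on the assumption that the lowest digit position of the format is radix^(min_exponent - digits). It returns -1 when the radix has a prime factor other than 2 or 5, since the decimal expansion may then never terminate.

```cpp
#include <limits>

// How many times the prime p divides r (r >= 2).
constexpr int multiplicity(int r, int p) {
    int k = 0;
    while (r % p == 0) { r /= p; ++k; }
    return k;
}

// Fractional decimal digits needed to reach the lowest digit position of a
// format described by radix, digits and min_exponent (hypothetical helper).
constexpr int fraction_digits_for_format(int radix, int digits, int min_exponent) {
    int twos = multiplicity(radix, 2);
    int fives = multiplicity(radix, 5);
    int rest = radix;
    for (int i = 0; i < twos; ++i) rest /= 2;
    for (int i = 0; i < fives; ++i) rest /= 5;
    if (rest != 1) return -1;               // e.g. radix 3: may repeat forever
    int k = digits - min_exponent;          // lowest place is radix^-k
    // radix^-k = 2^-(twos*k) * 5^-(fives*k), which needs max(twos, fives)*k digits
    return k * (twos > fives ? twos : fives);
}

template <typename T>
constexpr int fraction_digits_for_type() {
    using L = std::numeric_limits<T>;
    return fraction_digits_for_format(L::radix, L::digits, L::min_exponent);
}
```

For IEEE 754 float and double this evaluates to 149 and 1074 respectively, and for a base-3 format it reports -1.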

For 64-bit IEEE double precision, the greatest number of significant digits in an exact decimal conversion is 767. This is the exact decimal representation of the value with the smallest normal exponent field (1) and all 53 significand bits set (the 52 stored fraction bits plus the implicit leading 1). (The largest subnormal value has the same number of significant decimal digits.)

0x1fffffffff: 6.-313
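The 767 figure can be reproduced with exact integer arithmetic. The sketch below is an illustration under the assumption of IEEE 754 binary64 doubles; the function name exact_significant_digits is made up. It writes the value as m * 2^e with m odd, and for negative e uses m * 2^e = m * 5^(-e) / 10^(-e), so the significant digits are exactly the digits of the integer m * 5^(-e).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts the significant decimal digits in the exact expansion of a positive,
// finite double. Assumes a 53-bit binary significand (IEEE 754 binary64).
int exact_significant_digits(double x) {
    assert(x > 0 && std::isfinite(x));
    int exp2 = 0;
    double frac = std::frexp(x, &exp2);                 // x = frac * 2^exp2
    auto m = static_cast<std::uint64_t>(std::ldexp(frac, 53));
    int e = exp2 - 53;                                  // x = m * 2^e
    while (m % 2 == 0) { m /= 2; ++e; }                 // make m odd

    std::vector<int> big;                               // decimal digits, lowest first
    for (; m != 0; m /= 10) big.push_back(static_cast<int>(m % 10));

    int factor = e < 0 ? 5 : 2;                         // multiply by 5^-e or 2^e
    int count = e < 0 ? -e : e;
    for (int i = 0; i < count; ++i) {
        int carry = 0;
        for (int& d : big) {
            int cur = d * factor + carry;
            d = cur % 10;
            carry = cur / 10;
        }
        if (carry) big.push_back(carry);
    }
    std::size_t low = 0;                                // skip low-order zeros
    while (low < big.size() && big[low] == 0) ++low;
    return static_cast<int>(big.size() - low);
}
```

The value described above is the largest double below 2^-1021, which can be obtained as std::nextafter(std::ldexp(1.0, -1021), 0.0); passing it to this function should yield 767, as should the largest subnormal.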

You don't really want to know how many digits are "in the fractional part"; that phrasing suggests you're not 100% clear on what is happening under the hood in a floating point representation. There is not a separate precision for the integer and the fractional part.

What you really want to know is the precision of the representation.

1) A 32-bit, single-precision IEEE 754 number has 24 mantissa bits, which gives about 24 * log10(2) = 7.2 digits of precision.

2) A 64-bit, double-precision IEEE 754 number has 53 mantissa bits, which gives about 53 * log10(2) = 16.0 digits of precision.
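The same estimate can be computed from numeric_limits rather than typed in by hand. This is a minimal sketch assuming a radix-2 type; the name approx_decimal_precision is invented for illustration.

```cpp
#include <cmath>
#include <limits>

// Approximate decimal digits of precision: significand bits times log10(2).
// Assumption: T uses radix 2, so numeric_limits<T>::digits counts bits.
template <typename T>
double approx_decimal_precision() {
    static_assert(std::numeric_limits<T>::radix == 2, "sketch assumes radix 2");
    return std::numeric_limits<T>::digits * std::log10(2.0);
}
```

The standard library also exposes related integer constants: numeric_limits<T>::digits10 (decimal digits that always survive a round trip through T) and numeric_limits<T>::max_digits10 (decimal digits needed to round-trip any T).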

Suppose you're working with double precision numbers. If you have a very small base-10 number, say between 0 and 1, then you will have about 16 decimal digits of precision after the decimal point. This is what your 1.0/3.0 example above shows: you know that the answer should be 0.3 repeating, but you have sixteen threes after the decimal point before the answer turns into nonsense.

If you have a very large number, say a billion divided by three (1000000000.0/3.0), then on my machine the answer will look something like this:

1000000000.0/3.0 = 333333333.333333313465118

In this case you still have about 16 digits of precision, but the precision is split across the integral and fractional parts. There are 9 precise digits in the integral part, and 7 precise digits in the fractional part. From the eighth digit onwards, the fractional part is garbage.

Likewise, suppose we divide one quintillion (18 zeroes) by three. On my machine:

1000000000000000000.0/3.0 = 333333333333333312.000000000000000

You still have sixteen digits of precision, but zero of those digits are after the decimal point.
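The way precision shifts between the integral and fractional parts shows up directly in the spacing between adjacent doubles. The sketch below assumes IEEE 754 doubles; the helper name spacing_above is hypothetical.

```cpp
#include <cmath>
#include <limits>

// Gap between x and the next representable double above it.
double spacing_above(double x) {
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}
```

With IEEE 754 doubles, the gap near 1.0/3.0 is 2^-54 (about 5.6e-17), near 1000000000.0/3.0 it is 2^-24 (about 6e-8, consistent with roughly seven trustworthy fractional digits), and near 1000000000000000000.0/3.0 it is 64, so no fractional digit carries information there.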

std::numeric_limits<double>::min_exponent

Minimum negative integer value such that radix raised to the power (min_exponent-1) generates a normalized floating-point number. Equivalent to FLT_MIN_EXP, DBL_MIN_EXP or LDBL_MIN_EXP for floating types.

min_exponent10 is also available.

Minimum negative integer value such that 10 raised to that power generates a normalized floating-point number. Equivalent to FLT_MIN_10_EXP, DBL_MIN_10_EXP or LDBL_MIN_10_EXP for floating types.
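The relationship between these constants and the smallest normalized value can be checked with a short sketch. It assumes a radix-2 type, and the function name smallest_normal_from_exponent is invented for illustration.

```cpp
#include <cfloat>
#include <cmath>
#include <limits>

// The numeric_limits members mirror the <cfloat> macros.
static_assert(std::numeric_limits<float>::min_exponent == FLT_MIN_EXP, "float");
static_assert(std::numeric_limits<double>::min_exponent == DBL_MIN_EXP, "double");
static_assert(std::numeric_limits<double>::min_exponent10 == DBL_MIN_10_EXP, "double");

// radix^(min_exponent - 1), computed for a radix-2 type.
template <typename T>
T smallest_normal_from_exponent() {
    static_assert(std::numeric_limits<T>::radix == 2, "sketch assumes radix 2");
    return std::ldexp(T(1), std::numeric_limits<T>::min_exponent - 1);
}
```

The returned value should equal std::numeric_limits<T>::min(). Note that both constants describe normalized values only: for an IEEE 754 double, min_exponent10 is -307, which is far from the 1074 fractional digits needed to print the smallest subnormal exactly.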
