Double precision - Max value

I have a pretty silly question on double precision. I have read that a double (in C, for example) is represented on 64 bits, but I have also read that the maximum value a double can represent is approximately 10^308. How can 10^308 be represented with only 64 bits?

It does not hold every decimal digit of the number 10^308. A double-precision number holds an exponent and a limited number of significant digits (roughly 15 to 17 decimal digits).
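To make the "limited number of digits" point concrete, here is a small sketch in Python. It assumes the interpreter's float is an IEEE 754 binary64 double, which is the usual case but not something the text above guarantees. The variable names are illustrative only.

```python
import sys

# Written out in full, 10**308 is an integer with 309 decimal digits.
exact = 10**308
digits_in_exact = len(str(exact))

# The double closest to 10**308 is not exactly 10**308:
# only the leading significant digits agree.
nearest = int(1e308)
is_exact = (nearest == exact)

# At this magnitude neighbouring doubles are so far apart that
# adding 1e290 is rounded away completely (assumes binary64).
absorbed = (1e308 + 1e290 == 1e308)

print(digits_in_exact)       # number of digits in the exact integer
print(is_exact)              # whether the double stores 10**308 exactly
print(absorbed)              # whether the small addend was lost
print(sys.float_info.dig)    # decimal digits a float can always preserve
```

So the 64 bits buy a huge range, not 309 exact digits: values near 10^308 are spaced enormously far apart.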

See https://en.wikipedia.org/wiki/IEEE_floating_point (English) or http://fr.wikipedia.org/wiki/IEEE_754 (French) for a detailed description of how floating-point numbers are encoded in memory.

According to the C standard, there are three floating-point types: float, double, and long double, and the value representation of all floating-point types is implementation-defined.

Most compilers, however, follow the binary64 format specified by the IEEE 754 standard.

This format has:

  • 1 sign bit
  • 11 bits for the exponent
  • 52 bits for the mantissa
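The three fields can be pulled out of an actual value with a few bit operations. The following is a minimal sketch in Python, not part of the original answer; decode_double is a hypothetical helper name, and the reconstruction formula in the comment applies to normal numbers only.

```python
import struct

def decode_double(x):
    """Split a float into its binary64 sign, exponent, and mantissa fields."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                    # top 1 bit
    exponent = (bits >> 52) & 0x7FF      # next 11 bits (biased by 1023)
    mantissa = bits & ((1 << 52) - 1)    # low 52 bits
    return sign, exponent, mantissa

sign, exponent, mantissa = decode_double(1.7976931348623157e308)

# Normal numbers: value = (-1)**sign * (1 + mantissa / 2**52) * 2**(exponent - 1023)
# Rebuild with exact integer arithmetic so nothing overflows on the way.
rebuilt = float((2**52 + mantissa) * 2**(exponent - 1023 - 52))

print(sign, exponent, mantissa == 2**52 - 1)
print(rebuilt == 1.7976931348623157e308)
```

For the largest finite double, the exponent field is 2046 (the all-ones pattern 2047 is reserved for infinities and NaNs) and every mantissa bit is set.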

To find the largest value a double can hold, check DBL_MAX, defined in the header <float.h>. It is approximately 1.8 × 10^308 for implementations using the IEEE 754 binary64 format.
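For readers without a C compiler at hand, Python exposes the same limits through sys.float_info, whose fields mirror the <float.h> constants. This is only an illustrative sketch and assumes the platform's double is binary64.

```python
import sys

largest = sys.float_info.max      # counterpart of DBL_MAX
max_exp = sys.float_info.max_exp  # counterpart of DBL_MAX_EXP
mant_dig = sys.float_info.mant_dig  # counterpart of DBL_MANT_DIG

print(largest)
print(max_exp, mant_dig)

# Multiplying past the largest finite value overflows to infinity.
overflowed = largest * 2
print(overflowed == float("inf"))
```

Note that mant_dig counts 53 bits, not 52, because normal numbers carry an implicit leading 1 bit in addition to the 52 stored mantissa bits.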

The bit pattern of a 64-bit IEEE floating-point number contains an exponent. In Python I compute the following:

>>> import numpy as np
>>> 2**(-52) == np.finfo(np.float64).eps
True
>>> np.finfo(np.float64).max
1.7976931348623157e+308
>>> (2-2**(-52)) * 2**(2**10-1)
1.7976931348623157e+308
>>> (2-2**(-52)) * 2**(2**10-1) == np.finfo(np.float64).max
True

So it is a bit more than 10^308. The factor 2**(2**10-1), that is 2^1023, is the exponent part, and (2-2**(-52)) is the largest value the mantissa can take. See also https://en.wikipedia.org/wiki/Double-precision_floating-point_format
