
Maximum absolute and relative error of IEEE-754 single precision representation?

I'm looking to find the maximum overall absolute and relative error of the IEEE-754 single precision representation. Sign: 1 bit, exponent: 8 bits, significand: 23 bits.

I understand that, when normalised, the significand holds at most 23 digits (assuming a sign bit and an 8-bit exponent). Hence, if any extra digits turned up, the error would come from the positions 2^-24 onwards, i.e. 2^-24, 2^-25, 2^-26, ... I summed this infinite geometric series to find the error and got 2^-23. However, I'm unsure whether this is correct for the relative error. Relative error would be ((true value - given value) / true value) * 100. I'm not sure if this is the wrong approach.

Additionally, I'm confused about how to find the absolute error. Could anyone assist, please? Thanks in advance.

All finite IEEE-754 single precision values are exact. There is no error in the value itself.

A calculation or conversion may incur an error, as there are only about 2^32 different IEEE-754 single precision values and infinitely many possible calculation results. Typically a nearby single precision value is selected when the true result is not encodable.
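As a minimal sketch of "a nearby single precision value is selected", the snippet below rounds a Python float (a double) to single precision with the standard `struct` module. The helper name `to_float32` is hypothetical, and the sketch assumes the platform converts with round-to-nearest, which is the usual default.

```python
import struct

def to_float32(x: float) -> float:
    """Round a double to single precision and return it as a double.

    Hypothetical helper: packing with the 'f' format stores the value
    as an IEEE-754 single, so unpacking yields the selected neighbour.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]

exact = to_float32(0.5)    # 0.5 is encodable, so nothing changes
nearby = to_float32(0.1)   # 0.1 is not encodable in single precision

print(exact == 0.5)        # True
print(nearby == 0.1)       # False
print(abs(nearby - 0.1))   # size of the conversion error
```

The stored value itself is exact; the error only appears when it is compared with the true result it stands in for.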

If we limit the discussion to calculation results that lie between a pair of finite single precision values, then the error is at most 1.0 ULP *1.

Note: the finite range is +/-3.4028235... × 10^38, or FLT_MAX.

Within that range, the absolute difference between the true result and the encoded single precision value is then at most FLT_MAX - next_smallest_float(FLT_MAX). This is close to FLT_MAX * pow(2,-24) (about 2.03 * 10^31). Single precision has a 24-bit significand (23 bits explicitly encoded, 1 implied).
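The gap at the top of the range can be checked by decoding the two largest finite bit patterns. This is a rough sketch: `bits_to_float32` is a hypothetical helper, and `0x7F7FFFFF` / `0x7F7FFFFE` are the encodings of FLT_MAX and the single precision value just below it.

```python
import struct

def bits_to_float32(bits: int) -> float:
    """Hypothetical helper: reinterpret a 32-bit pattern as a single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

FLT_MAX = bits_to_float32(0x7F7FFFFF)   # largest finite single
below = bits_to_float32(0x7F7FFFFE)     # next smaller single

gap = FLT_MAX - below                   # 1 ULP at the top of the range
print(gap == 2.0 ** 104)                # True
print(gap)                              # about 2.03e31
print(FLT_MAX * 2.0 ** -24)             # the approximation quoted above
print(int(gap) // 2)                    # 10141204801825835211973625643008
```

The last line is half the gap, which is the bound that applies when the rounding mode is round-to-nearest.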

Outside that range, the absolute error can be infinite.

For many calculations, when the result is in the normal single precision range, the relative error is within 1.0 ULP of the correct answer *1. For transcendental functions like sine, the error is within 2.0 ULP of the correct answer. It can be much worse for weak implementations.

When the true result is small and the single precision value is a non-zero subnormal, the relative error grows as the true value nears 0.0, up to 0.5 * pow(2,0), or 1/2. Note that this considers the relative error as:

relative_error_IEEE = |true value - IEEE value|/IEEE value

When the IEEE value is zero, or when the relative error is instead defined as below, the relative error approaches infinity.

relative_error_true = |true value - IEEE value|/true value
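To make the two definitions concrete, the sketch below evaluates both for true values a little above half of the smallest subnormal, 2^-149. It assumes round-to-nearest conversion (under which the error never exceeds half a step) and uses a hypothetical `to_float32` helper; the sample points are illustrative, not exhaustive.

```python
import math
import struct

def to_float32(x: float) -> float:
    """Hypothetical helper: round a double to the nearest single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

SMALLEST_SUBNORMAL = math.ldexp(1.0, -149)   # 2^-149

results = []
for k in (2, 10, 20):
    # True value just above half of the smallest subnormal.
    true_value = math.ldexp(0.5 + 2.0 ** -k, -149)
    ieee_value = to_float32(true_value)       # rounds up to 2^-149
    abs_err = abs(true_value - ieee_value)
    rel_ieee = abs_err / ieee_value           # relative_error_IEEE
    rel_true = abs_err / true_value           # relative_error_true
    results.append((k, ieee_value, rel_ieee, rel_true))
    print(k, rel_ieee, rel_true)
```

In this sample, relative_error_IEEE climbs toward 1/2 as k grows, while relative_error_true is noticeably larger for the same inputs.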

*1 Common calculations like +, -, *, / should be within 0.5 ULP when the rounding mode is round-to-nearest.
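The 0.5 ULP figure can be spot-checked empirically. The sketch below uses conversion from double to single as a stand-in for a correctly rounded operation, which is an assumption made for illustration. It samples values in [1, 2], where the spacing of single precision values is 2^-23, and records the worst error in ULPs. The helper `to_float32` and the fixed seed are illustrative choices.

```python
import random
import struct

def to_float32(x: float) -> float:
    """Hypothetical helper: round a double to the nearest single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

ULP = 2.0 ** -23          # spacing of singles in [1, 2)
random.seed(0)            # fixed seed so the run is repeatable

worst = 0.0
for _ in range(100_000):
    true_value = random.uniform(1.0, 2.0)
    err_in_ulps = abs(to_float32(true_value) - true_value) / ULP
    worst = max(worst, err_in_ulps)

print(worst)              # stays at or below 0.5 with round-to-nearest
```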

The largest error is 10141204801825835211973625643008, and the largest relative error is 0.5:

>>> (2**(0xfe-150)* 0xffffff - 2**(0xfe-150)* 0xfffffe)/2 
10141204801825835211973625643008L
>>> 2**(0xfe-150)* 0xffffff
340282346638528859811704183484516925440L
>>> print ("%100.100f\n" % (10141204801825835211973625643008.0/340282346638528859811704183484516925440.0))
0.0000000298023241640522577793688714653530524856250849552452564239501953125000000000000000000000000000

>>> print("%151.151f\n" % ( ( 2**(0x0-150)* 0x000002 - 2**(0x0-150)* 0x000001 )/2 ))
0.0000000000000000000000000000000000000000000003503246160812042677309323958224790328200654854691289429392670709724477706714651503716595470905303955078125

>>> print("%151.151f\n" % (2**(0x0-150)* 0x00001))
0.0000000000000000000000000000000000000000000007006492321624085354618647916449580656401309709382578858785341419448955413429303007433190941810607910156250

>>> 3503246160812042677309323958224790328200654854691289429392670709724477706714651503716595470905303955078125.0/7006492321624085354618647916449580656401309709382578858785341419448955413429303007433190941810607910156250
0.5

