Does casting double to float always return same value?
Does casting double to float always produce the same result, or can there be some "rounding differences"?
For example, is x in
float x = (float)0.123456789d;
always the same value?
What about casting a float to double and then casting it back to float, i.e. (float)(double)someFloat?
Mostly interested in what the results are in C#, but feel free to share if you have knowledge about how this works in other languages.
The results should not be language dependent, unless the language deviates from the IEEE specification.
All floats can be exactly represented as doubles, so the round trip from float to double to float should yield the same value that you started with.
Similarly, casting any double value to float should always yield the same result, but, of course, there are many different double values that would round to the same float value.
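Both claims can be checked empirically. The following is a minimal sketch, assuming an IEEE 754 compliant runtime; the class name, the helper names, and the sampling stride are made up for illustration. It walks a sample of all 32-bit patterns, skips NaNs (whose payload bits are not guaranteed to survive conversions), and compares the round-tripped value bit for bit.

```csharp
using System;

static class RoundTripDemo
{
    // Reinterpret a 32-bit pattern as a float and back (hypothetical helpers).
    public static float FromBits(int bits) =>
        BitConverter.ToSingle(BitConverter.GetBytes(bits), 0);

    public static int ToBits(float f) =>
        BitConverter.ToInt32(BitConverter.GetBytes(f), 0);

    // True if f survives float -> double -> float bit for bit.
    public static bool SurvivesRoundTrip(float f)
    {
        double widened = f;          // widening conversion, exact
        float back = (float)widened; // narrowing conversion
        return ToBits(back) == ToBits(f);
    }

    // Samples every stride-th bit pattern and counts round-trip mismatches.
    public static int CountRoundTripFailures(int stride)
    {
        int failures = 0;
        for (long bits = int.MinValue; bits <= int.MaxValue; bits += stride)
        {
            float f = FromBits((int)bits);
            if (float.IsNaN(f)) continue; // NaN payloads are out of scope here
            if (!SurvivesRoundTrip(f)) failures++;
        }
        return failures;
    }

    static void Main()
    {
        Console.WriteLine(CountRoundTripFailures(9973)); // 0

        // Two different doubles that narrow to the same float:
        double a = 1.0;
        double b = 1.0 + 1e-10; // far less than half a float ulp away from 1.0
        Console.WriteLine(a != b);               // True
        Console.WriteLine((float)a == (float)b); // True
    }
}
```

The stride only keeps the run short; setting it to 1 checks every float.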
If you downcast a double to a float, you are losing precision and data. Upcasting a float to a double is a widening conversion; no data is lost if it is then round-tripped... that is, unless you do something to the value prior to downcasting it back to a float.
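A small sketch of the three cases, using the question's own value; the class and helper names are hypothetical. Narrowing 0.123456789d drops bits, widening and narrowing again is lossless, and arithmetic on the widened value breaks the round trip.

```csharp
using System;

static class NarrowingDemo
{
    // True if d is exactly representable as a float (hypothetical helper).
    public static bool SurvivesNarrowing(double d) => (double)(float)d == d;

    static void Main()
    {
        double d = 0.123456789d;
        float f = (float)d;

        // Narrowing: the float keeps only the nearest 24-bit significand.
        Console.WriteLine(SurvivesNarrowing(d));  // False

        // Widening then narrowing an untouched value is lossless.
        Console.WriteLine((float)(double)f == f); // True

        // Doing something to the widened value before narrowing changes the result.
        double changed = (double)f + 1e-3;
        Console.WriteLine((float)changed == f);   // False
    }
}
```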
Floating-point numbers sacrifice precision and accuracy for range. Single-precision floats are 32 bits wide; double-precision values are 64 bits wide. But they can represent values way outside the bounds that the underlying precision would indicate.
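To make the trade-off concrete, here is a minimal sketch (class and method names are illustrative only): a float can hold magnitudes beyond 10^38, yet it cannot tell apart two neighbouring whole numbers just above 2^24.

```csharp
using System;

static class RangeVsPrecisionDemo
{
    // Explicit narrowing conversion, wrapped so the comparison happens at run time.
    public static float Narrow(double d) => (float)d;

    static void Main()
    {
        // Range: float reaches roughly 3.4e38.
        Console.WriteLine(float.MaxValue > 1e38f); // True

        // Precision: 2^24 and 2^24 + 1 are distinct doubles...
        Console.WriteLine(16777216d == 16777217d); // False

        // ...but they narrow to the same float.
        Console.WriteLine(Narrow(16777216d) == Narrow(16777217d)); // True
    }
}
```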
C# float and double are IEEE 754 floating point values.
float is a single-precision IEEE 754 value (32 bits) and consists of a 1-bit sign, an 8-bit exponent, and a 23-bit mantissa.
double is a double-precision IEEE 754 value (64 bits) and consists of a 1-bit sign, an 11-bit exponent, and a 52-bit mantissa.
The effective precision of the mantissa is 1 bit more than its apparent size (floating point magick: normalized values carry an implicit leading 1 bit).
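The fields can be pulled out of a value with a few shifts and masks. This is a sketch assuming the standard IEEE 754 binary32 and binary64 layouts (1 sign bit, 8 exponent bits and 23 fraction bits for float; 1, 11 and 52 for double); the class and method names are made up for the example.

```csharp
using System;

static class FloatFields
{
    // Splits a float into sign, biased exponent and stored fraction bits.
    public static string Describe(float f)
    {
        int bits = BitConverter.ToInt32(BitConverter.GetBytes(f), 0);
        int sign = (bits >> 31) & 1;
        int exponent = (bits >> 23) & 0xFF; // biased by 127
        int fraction = bits & 0x7FFFFF;     // 23 stored bits, leading 1 is implicit
        return $"sign={sign} exponent={exponent} fraction=0x{fraction:X6}";
    }

    // Same decomposition for a double.
    public static string Describe(double d)
    {
        long bits = BitConverter.DoubleToInt64Bits(d);
        long sign = (bits >> 63) & 1;
        long exponent = (bits >> 52) & 0x7FF;   // biased by 1023
        long fraction = bits & 0xFFFFFFFFFFFFFL; // 52 stored bits
        return $"sign={sign} exponent={exponent} fraction=0x{fraction:X13}";
    }

    static void Main()
    {
        Console.WriteLine(Describe(1.5f)); // sign=0 exponent=127 fraction=0x400000
        Console.WriteLine(Describe(1.5d)); // sign=0 exponent=1023 fraction=0x8000000000000
    }
}
```

For normalized values the represented number is (1 + fraction / 2^23) × 2^(exponent − 127) for float, which is where the extra bit of precision comes from.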
Some CLR floating point resources for you:
This paper is probably the canonical paper on the perils and pitfalls of floating point arithmetic. If you're not a member of the ACM, click the link on the title to find public downloads of the article:
Abstract
Floating-point arithmetic is considered an esoteric subject by many people. This is rather surprising, because floating-point is ubiquitous in computer systems: Almost every language has a floating-point datatype; computers from PCs to supercomputers have floating-point accelerators; most compilers will be called upon to compile floating-point algorithms from time to time; and virtually every operating system must respond to floating-point exceptions such as overflow. This paper presents a tutorial on the aspects of floating-point that have a direct impact on designers of computer systems. It begins with background on floating-point representation and rounding error, continues with a discussion of the IEEE floating point standard, and concludes with examples of how computer system builders can better support floating point.
In some cases, the closest float representation to a numeric quantity may differ from the value obtained by rounding the closest double representation to a float. Two such quantities are 12,344,321.4999999991 and 12,345,678.50000000093. The integers above and below both those quantities are precisely representable as float, but the nearest double to each of them has a fractional part of precisely 0.5. Because converting such double values (between 2^23 and 2^24, with a fraction of precisely 0.5) to float will round to the nearest even integer, the compiler will in each case end up rounding away from the value which would have been closer to the original number.
Note that in practice, the compiler seems to parse numbers as double, and then convert to float, so even though 12344321.4999999991f should round to 12344321f, it instead rounds to 12344322f. Likewise 12345678.50000000093f should round to 12345679f but rounds to 12345678f, so even in cases where conversion to double and then float loses precision, such conversion loss cannot be avoided by specifying numbers directly as float.
Incidentally, the values 12344321.4999999992f and 12345678.50000000094f are rounded correctly.
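The double-rounding path can be reproduced explicitly by writing the quantity as a double literal and then casting. This sketch assumes the compiler rounds double literals to the nearest double; the class and method names are illustrative. It deliberately avoids float literals with the f suffix, since how those are parsed may vary between compiler versions. The decimal type is used only to confirm, in exact arithmetic, which neighbouring integer is closer.

```csharp
using System;

static class DoubleRoundingDemo
{
    // Narrow a double to float; ties round to the even neighbour.
    public static float ViaDouble(double d) => (float)d;

    static void Main()
    {
        // First rounding: the nearest double to each literal is exactly x.5.
        double low = 12344321.4999999991;
        double high = 12345678.50000000093;
        Console.WriteLine(low == 12344321.5);  // True
        Console.WriteLine(high == 12345678.5); // True

        // Second rounding: the exact tie goes to the even integer.
        Console.WriteLine(ViaDouble(low) == 12344322f);  // True
        Console.WriteLine(ViaDouble(high) == 12345678f); // True

        // Exact decimal arithmetic shows the other neighbour was closer.
        decimal exactLow = 12344321.4999999991m;
        decimal exactHigh = 12345678.50000000093m;
        Console.WriteLine(exactLow - 12344321m < 12344322m - exactLow);   // True
        Console.WriteLine(12345679m - exactHigh < exactHigh - 12345678m); // True
    }
}
```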
Considering that they have different precision, even if you're casting from the less precise type to the wider one (I suppose that is actually your doubt), the result cannot always be the same.
Floating point operations, especially casting, are always subject to truncation/rounding and other kinds of approximation.
A double should be able to exactly hold every possible value of a float. Casting a float to a double should not change the value, and casting back to a float should return the original value, as long as you didn't perform any calculations on the double in the meantime.
Floating-point numbers in C# are stored using the IEEE 754 format (http://en.wikipedia.org/wiki/IEEE_754). This format has two parts: the digits and the exponent. Doubles hold 52 digits, and floats hold 23 digits. The base is 2, not 10. So for your example above (0.123456789), the digits would be 111010110111100110100010101 (the binary representation of 123456789). That's 27 digits, which fits comfortably in a double, but not in a float, so yes, precision would be lost in the round-trip conversion.
On the other hand, if your number was 0.123456, the digits would be 11110001001000000 (17 digits), which fits comfortably in either a float or a double, so you would lose no precision in a round-trip cast.
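The bit-counting argument can be tested directly on whole numbers, where the binary digits quoted above are the digits that actually get stored. The sketch below sticks to the integers 123456789 and 123456 for that reason: most decimal fractions, 0.123456 included, have no finite binary expansion, so their stored digits are not those of the integer with the same decimal digits. The helper names are hypothetical.

```csharp
using System;

static class SignificandBitsDemo
{
    // Significand bits a positive whole number needs:
    // its bit length minus trailing zero bits (those go into the exponent).
    public static int SignificandBitsNeeded(long n)
    {
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));
        while ((n & 1) == 0) n >>= 1; // drop trailing zeros
        int bits = 0;
        while (n != 0) { bits++; n >>= 1; }
        return bits;
    }

    // True if d is exactly representable as a float.
    public static bool SurvivesFloat(double d) => (double)(float)d == d;

    static void Main()
    {
        Console.WriteLine(SignificandBitsNeeded(123456789)); // 27, more than float's 24
        Console.WriteLine(SurvivesFloat(123456789d));        // False
        Console.WriteLine((float)123456789d == 123456792f);  // True: nearest multiple of 8

        Console.WriteLine(SignificandBitsNeeded(123456));    // 11
        Console.WriteLine(SurvivesFloat(123456d));           // True
    }
}
```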