
Error in astype float32 vs float64 for integer

I'm sure this comes from a gap in my understanding of how casting between different float precisions works, but can someone explain why the value comes out 3 less than its true value in the 32-bit representation but not in the 64-bit one?

>>> import numpy as np
>>> a = np.array([83734315])
>>> a.astype('f')
array([ 83734312.], dtype=float32)
>>> a.astype('float64')
array([ 83734315.])

A 32-bit float can exactly represent only about 7 decimal digits of mantissa. Your number requires more, and therefore cannot be represented exactly.

The mechanics of what happens are as follows:

A 32-bit float has a 24-bit mantissa. Your number requires 27 bits to be represented exactly, so the last three bits cannot be stored and end up as zero. (Strictly speaking the value is rounded to the nearest representable float rather than truncated; here the nearest one happens to be the one below.) The three lowest bits of your number are 011₂; these are getting set to 000₂. Observe that 011₂ is 3₁₀, which is exactly the difference you see.
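The bit-level account can be checked directly. This is a minimal sketch, assuming NumPy is installed; the variable names (`n`, `bits`, `low3`, `as_f32`, `ulp`) are chosen here for illustration and are not from the original post.

```python
import numpy as np

n = 83734315

# Number of bits needed to write n exactly in binary.
bits = n.bit_length()            # 27, three more than float32's 24

# The three lowest bits of n, as an integer and as a bit string.
low3 = n & 0b111                 # 3
low3_str = format(low3, "03b")   # '011'

# Convert to float32 and back to a Python int to see what was stored.
as_f32 = int(np.float32(n))      # 83734312

# Gap between adjacent float32 values near n.
ulp = float(np.spacing(np.float32(n)))   # 8.0

print(bits, low3_str, as_f32, n - as_f32, ulp)
```

Near this magnitude float32 values are 8 apart, so only multiples of 8 are representable, and 83734315 lands on 83734312, the closest one.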

A float32 only has 24 bits of significand precision, which is roughly seven decimal digits (log10(2**24) = 7.22). You're expecting it to store an 8-digit number exactly, which in general is impossible.
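The digit count and the point where exactness stops can be sketched as follows. This assumes NumPy; names such as `digits` and `exact_limit` are illustrative only.

```python
import math
import numpy as np

# Decimal digits covered by a 24-bit significand.
digits = math.log10(2**24)                   # about 7.22

# finfo reports the explicitly stored mantissa bits; one more bit is implicit.
stored_bits = np.finfo(np.float32).nmant     # 23

# Every integer up to 2**24 survives a round trip through float32.
exact_limit = 2**24                          # 16777216
at_limit = int(np.float32(exact_limit))      # 16777216

# The next integer does not: it rounds back to 2**24.
past_limit = int(np.float32(exact_limit + 1))   # 16777216

# float64 has a 53-bit significand, so the value from the question is exact.
as_f64 = int(np.float64(83734315))           # 83734315

print(round(digits, 2), stored_bits, at_limit, past_limit, as_f64)
```

Some 8-digit integers (multiples of the local spacing) still convert exactly, which is why the claim holds only "in general".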
