astype float32 vs float64 gives a wrong value for an integer

Question

I'm sure this is due to a gap in my understanding of how casting between floats of different precision works, but can someone explain why the value comes out 3 less than its true value when cast to a 32-bit float, but not when cast to a 64-bit float?

>>> import numpy as np
>>> a = np.array([83734315])
>>> a.astype('f')
array([ 83734312.], dtype=float32)
>>> a.astype('float64')
array([ 83734315.])

Answer 1

A 32-bit float can represent only about 7 decimal digits exactly. Your number requires more, and therefore cannot be represented exactly.

The mechanics of what happens are as follows:

A 32-bit float has a 24-bit mantissa. Your number requires 27 bits to be represented exactly, so the last three bits are dropped (set to zero). The three lowest bits of your number are 011₂; these become 000₂. Observe that 011₂ is 3₁₀, which is exactly the difference you see. Strictly speaking, the conversion rounds to the nearest representable value rather than truncating; here 011₂ lies below the halfway point 100₂, so the value rounds down.
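The bit arithmetic described in this answer can be checked directly. The following is a minimal sketch, assuming NumPy is installed; the variable names are illustrative and not part of the original answer. The second value, 83734317, is an extra example chosen to show a case where the dropped bits round up instead of down.

```python
import numpy as np

n = 83734315

# The integer needs 27 bits, but float32 keeps only a 24-bit significand.
bits_needed = n.bit_length()
low_bits = bin(n)[-3:]          # the three bits that do not fit
print(bits_needed, low_bits)    # 27 011

# Clearing the three lowest bits reproduces the float32 result.
cleared = n & ~0b111
as_f32 = int(np.array([n]).astype(np.float32)[0])
print(cleared, as_f32)          # 83734312 83734312

# Extra example (not from the question): low bits 101 are above the
# halfway point 100, so this value rounds up to the next multiple of 8.
rounded_up = int(np.array([83734317]).astype(np.float32)[0])
print(rounded_up)               # 83734320
```

In this range float32 values are spaced 8 apart, so any integer between 2**26 and 2**27 is stored as a multiple of 8.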

Answer 2

A float32 has only 24 bits of significand precision, which is roughly seven decimal digits (log10(2**24) = 7.22). You're expecting it to store an 8-digit number exactly, which in general is impossible.
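The digit estimate can be reproduced with a short check. This is a sketch assuming NumPy is available; `np.finfo(np.float32).nmant` reports the 23 explicitly stored significand bits, and the implicit leading bit brings the total to 24. The names `limit` and `digits` are illustrative.

```python
import math
import numpy as np

# 23 stored bits plus one implicit leading bit give 24 significand bits.
significand_bits = int(np.finfo(np.float32).nmant) + 1
digits = math.log10(2**significand_bits)
print(significand_bits, round(digits, 2))   # 24 7.22

# 2**24 is still stored exactly, but 2**24 + 1 is not.
limit = 2**24
exact_at_limit = int(np.float32(limit)) == limit
exact_past_limit = int(np.float32(limit + 1)) == limit + 1
print(exact_at_limit, exact_past_limit)     # True False
```

Some 8-digit integers (for example, multiples of 8 in the range of the question) do survive the cast, which is why the answer says storing one exactly is impossible "in general" rather than always.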

2 answers

Answer 1: score 4, accepted, 2013-11-08 15:35:39

Answer 2: score 3, 2013-11-08 15:35:46