Difference in rounding - float64 vs. float32
This scenario is a simplification of an ETL scenario involving multiple sets of data pulled from MySQL tables. I have a merged dataframe where one price column is type float64 and the other is type object.
import pandas as pd

df = pd.DataFrame({
    'price1': [0.066055],
    'price2': ['0.066055'],
})
price_cols = ['price1', 'price2']  # the columns converted below
>>> df.dtypes
price1 float64
price2 object
dtype: object
When these two columns are converted to float64, the column price2 is rounded incorrectly when rounded to 5 digits.
float64_df = df[price_cols].apply(lambda x: pd.to_numeric(x))
>>> float64_df.dtypes
price1 float64
price2 float64
dtype: object
>>> float64_df[price_cols].apply(lambda x: x.round(5))
price1 price2
0 0.06606 0.06605
However, when the columns are converted to float32 using downcast='float', the rounding works as expected.
float32_df = df[price_cols].apply(lambda x: pd.to_numeric(x, downcast='float'))
>>> float32_df.dtypes
price1 float32
price2 float32
dtype: object
>>> float32_df[price_cols].apply(lambda x: x.round(5))
price1 price2
0 0.06606 0.06606
Any ideas why the rounding doesn't work properly when both columns are of type float64?
Printing the floats with higher precision shows that pd.to_numeric converted '0.066055' to 0.06605499999999998872.
with pd.option_context('display.float_format', '{:0.20f}'.format):
    print(float64_df)
Output:
price1 price2
0 0.06605500000000000260 0.06605499999999998872
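The printed digits are consistent with price2 holding the double immediately below the one stored for price1. The sketch below uses only the standard library (no pandas) to inspect both values exactly; it assumes Python 3.9+ for math.nextafter, and the variable names are illustrative.

```python
import math
from decimal import Decimal

literal = 0.066055                            # the value price1 holds
one_ulp_below = math.nextafter(literal, 0.0)  # adjacent double just below it

# Decimal(float) expands the exact binary value without any rounding.
print(Decimal(literal))
print(Decimal(one_ulp_below))

exact = Decimal("0.066055")
print(Decimal(literal) > exact)        # True: stored slightly above 0.066055
print(Decimal(one_ulp_below) < exact)  # True: stored slightly below 0.066055

# The two neighbours fall on opposite sides of the rounding midpoint.
print(round(literal, 5))        # 0.06606
print(round(one_ulp_below, 5))  # 0.06605
```

Neither double equals 0.066055 exactly, so a one-bit difference in the parsed value is enough to flip the result at 5 digits.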
The short answer is that pd.to_numeric outputs different values for the two:
pd.to_numeric(0.066055)    # 0.066055
pd.to_numeric('0.066055')  # 0.06605499999999999
In the case of 0.066055, it simply returns the value.
In the case of '0.066055', I believe it uses this function for converting the string to a float.
This answer may also be helpful.
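If the difference does come from the string parser, one possible workaround is to parse the strings with Python's built-in float(), which yields the same double as the literal. This is a minimal sketch, not a confirmed fix for every pandas version; parse_prices is a hypothetical helper, and in pandas the same idea could presumably be applied with something like df['price2'].map(float).

```python
def parse_prices(values):
    """Parse numeric strings with Python's built-in float()."""
    return [float(v) for v in values]

parsed = parse_prices(["0.066055"])

# The parsed string and the literal are the same double...
print(parsed[0] == 0.066055)  # True
# ...so they round identically.
print(round(parsed[0], 5))    # 0.06606
```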
Getting exact values with floats is generally not possible, because most decimal fractions have no exact binary representation. My guess is that converting the object column yields a float64 slightly smaller than the original number, e.g. 0.066054999999999999 or something similar, which produces the unexpected rounding result.
Python has some documentation about this.
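When the rounding of prices has to follow decimal rules exactly, one alternative (not something the original question used) is to keep the values as strings and round them with the standard-library decimal module. A rough sketch, where round_price and its default step are illustrative assumptions:

```python
from decimal import Decimal, ROUND_HALF_UP

def round_price(text, step="0.00001"):
    """Round a decimal string to the given step using half-up rounding."""
    return Decimal(text).quantize(Decimal(step), rounding=ROUND_HALF_UP)

print(round_price("0.066055"))  # 0.06606
```

Because the string is never converted to binary floating point, the midpoint case 0.066055 is handled exactly as written.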