Difference in rounding - float64 vs. float32

This is a simplified version of an ETL scenario involving multiple sets of data pulled from MySQL tables. I have a merged DataFrame where one price column is of type float64 and the other is of type object.

import pandas as pd

df = pd.DataFrame({
    'price1': [0.066055],
    'price2': ['0.066055'],
})

>>> df.dtypes
price1    float64
price2     object
dtype: object

When these two columns are converted to float64 and then rounded to 5 digits, the column price2 is rounded incorrectly.

price_cols = ['price1', 'price2']
float64_df = df[price_cols].apply(lambda x: pd.to_numeric(x))

>>> float64_df.dtypes
price1    float64
price2    float64
dtype: object

>>> float64_df[price_cols].apply(lambda x: x.round(5))
    price1   price2
0  0.06606  0.06605

However, when the columns are converted to float32 using downcast='float', the rounding works as expected.

float32_df = df[price_cols].apply(lambda x: pd.to_numeric(x, downcast='float'))

>>> float32_df.dtypes
price1    float32
price2    float32
dtype: object

>>> float32_df[price_cols].apply(lambda x: x.round(5))
    price1   price2
0  0.06606  0.06606

Any ideas why the rounding doesn't work properly when both columns are of type float64?

Printing the floats with higher precision shows that pd.to_numeric converted '0.066055' to 0.06605499999999998872.

with pd.option_context('display.float_format', '{:0.20f}'.format):
    print(float64_df)

Output:

                  price1                 price2
0 0.06605500000000000260 0.06605499999999998872
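The same check can be made without pandas. The sketch below uses the standard-library decimal module to show the exact binary value behind the Python literal 0.066055. It assumes price1 holds that same float64, which the output above suggests.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, with no rounding.
literal = 0.066055
exact = Decimal(literal)

print(exact)                          # full decimal expansion of the float64
print(exact > Decimal("0.066055"))    # True: the stored value sits slightly above 0.066055
print(float("0.066055") == literal)   # True: Python's own float() parses the string to the same float64
```

So the built-in float() parser and the literal agree. The smaller value in price2 comes from the string conversion path, not from float64 itself.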

The short answer is that pd.to_numeric outputs different values for the two:

pd.to_numeric(0.066055)
pd.to_numeric('0.066055')

# 0.066055
# 0.06605499999999999

In the case of 0.066055, it simply returns the value.

In the case of '0.066055', I believe it uses this function for converting the string to a float.

This answer may also be helpful.
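To see how little the two results differ, the sketch below steps to the next representable float64 below 0.066055 with math.nextafter (Python 3.9+). Treating that neighbour as the value pd.to_numeric produced from the string is an assumption, inferred from the 20-digit output in the question and not taken from the pandas source.

```python
import math

from_float = 0.066055
# Assumption: the string parse landed on the next float64 below the literal.
from_string = math.nextafter(from_float, 0.0)

print(repr(from_string))
print(from_float - from_string == 2.0 ** -56)  # True: exactly one unit in the last place

# Built-in round() is used here only to show the two neighbours straddle the tie.
print(round(from_float, 5))    # 0.06606
print(round(from_string, 5))   # 0.06605
```

A difference of one unit in the last place is enough to move the value across the 0.066055 midpoint, which is why the two columns round in opposite directions.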

Floats cannot represent most decimal fractions exactly, so results like this are always somewhat unpredictable. My guess is that the object column is converted to a float64 slightly smaller than the original number, e.g. 0.066054999999999999 or something similar, which produces the unexpected rounding result.

Python has some documentation about this.
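As for why float32 looks right: the NumPy documentation describes np.round as scaling by a power of ten, rounding to the nearest integer, and scaling back. The pure-Python sketch below imitates that for float64 and uses struct to emulate float32 storage. The helpers to_f32 and numpy_style_round are illustrative names, not pandas or NumPy internals. It also assumes that pandas hands rounding to NumPy and that the string was parsed to the float64 just below 0.066055.

```python
import math
import struct

def to_f32(x):
    """Round a Python float to the nearest float32 and return it as a float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def numpy_style_round(x, decimals):
    """Scale, round half to even, scale back (float64 arithmetic only)."""
    scale = 10 ** decimals
    return round(x * scale) / scale  # one-argument round() rounds ties to even

from_float = 0.066055
from_string = math.nextafter(from_float, 0.0)  # assumed result of the string parse

# In float64 the two neighbours land on opposite sides of the tie at 6605.5.
print(numpy_style_round(from_float, 5))    # 0.06606
print(numpy_style_round(from_string, 5))   # 0.06605

# In float32 both neighbours collapse to one stored value, so they must round alike.
print(to_f32(from_float) == to_f32(from_string))  # True
```

Under these assumptions, downcasting does not make the rounding more accurate. It discards the last-place difference between the two columns, so they can no longer disagree.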
