简体   繁体   中英

Difference in rounding - float64 vs. float32

This scenario is a simplification of an ETL scenario involving multiple sets of data pulled from MySQL tables. I have a merged dataframe where one price column is type float64 and the other is type object .

import pandas as pd

df = pd.DataFrame({
    'price1': [0.066055],
    'price2': ['0.066055'],
})

>>> df.dtypes
price1    float64
price2     object
dtype: object

When these two columns are converted to float64 , the column price1 is rounded incorrectly when rounded to 5 digits.

float64_df = df[price_cols].apply(lambda x: pd.to_numeric(x))

>>> float64_df.dtypes
price1    float64
price2    float64
dtype: object

>>> float64_df[price_cols].apply(lambda x: x.round(5))
    price1   price2
0  0.06606  0.06605

However, when the columns are converted to float32 using downcast='float' , the rounding works as expected.

float32_df = df[price_cols].apply(lambda x: pd.to_numeric(x, downcast='float'))

>>> float32_df.dtypes
price1    float32
price2    float32
dtype: object

>>> float32_df[price_cols].apply(lambda x: x.round(5))
    price1   price2
0  0.06606  0.06606

Any ideas why the rounding doesn't work properly when both columns are of type float64 ?

Printing the floats with higher precision shows that pd.to_numeric converted '.066055' to 0.06605499999999998872 .

with pd.option_context('display.float_format', '{:0.20f}'.format):
    print(float64_df)

Output:

                  price1                 price2
0 0.06605500000000000260 0.06605499999999998872

The short answer is pd.to_numeric outputs different values for the two:

pd.to_numeric(0.066055)
pd.to_numeric('0.066055')

# 0.066055
# 0.06605499999999999

In the case of 0.066055 , it simply returns the value .

In the case of '0.066055' , I believe it uses this function for converting the string to a float.

This answer may also be helpful.

Getting exact numbers with floats is somewhat impossible and floats are always somewhat unpredictable. My guess is that the object results in a float64 a little bit smaller than the original number eg 0.066054999999999999 or something similar, resulting in the unexpected rounding result.

Python has some documentation about this.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM