Pandas gives the wrong result when adding and then comparing numbers (Python)

I need to drop the rows of my DataFrame in which the sum of all probability values (floats with at most 10 digits) is greater than 1, but pandas gives me wrong results.

My code:

# drop wrong probability row
data.at[data[data.p1 + data.p2 + data.p3 > 1.001].index, 'h1'] = 'dropped by pandas'

The results:

result            | p1          | p2          | p3          | sump
correct result    | 0.743088844 | 0.24208727  | 0.014823886 | 1            << correct
correct result    | 0.647239626 | 0.346835025 | 0.00592535  | 1            << correct
correct result    | 0.65043824  | 0.34372226  | 0.0058395   | 1            << correct
correct result    | 0.75111312  | 0.221604341 | 0.027282539 | 1            << correct
dropped by pandas | 0.670277591 | 0.324265434 | 0.005456975 | 1            << wrong
dropped by pandas | 0.672221755 | 0.322438072 | 0.005340173 | 1            << wrong
dropped by pandas | 0.670053332 | 0.324742569 | 0.005204099 | 1            << wrong
dropped by pandas | 0.667690433 | 0.327033634 | 0.005275932 | 1            << wrong
dropped by pandas | 0.237037933 | 0.823248091 | 0.05335034  | 1.113636364  << correct
dropped by pandas | 0.242720919 | 0.818282268 | 0.052633177 | 1.113636364  << correct
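
As a quick check, here is a minimal sketch that applies the same comparison to three rows copied from the table above, exactly as displayed. The sample frame is built here for illustration only, and it uses a boolean mask with .loc instead of the original .at call, so it is not the asker's exact code.

```python
import pandas as pd

# Three rows copied from the table as displayed (illustrative sample only).
data = pd.DataFrame({
    "p1": [0.743088844, 0.670277591, 0.237037933],
    "p2": [0.24208727, 0.324265434, 0.823248091],
    "p3": [0.014823886, 0.005456975, 0.05335034],
    "h1": ["", "", ""],
})

# Row-wise sum of the three probability columns.
total = data.p1 + data.p2 + data.p3

# Boolean mask: True only where the sum exceeds the threshold.
mask = total > 1.001

# Assign through the mask, so no index labels are looked up.
data.loc[mask, "h1"] = "dropped by pandas"

print(data.assign(sump=total))
```

With these displayed values only the last row is flagged. That suggests the unexpected rows may depend on something the table does not show, such as extra digits in the stored values or the labels in the index.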

A clearer view of the results:

[image: results in Excel]

It seems to work sometimes and fail at other times, which is driving me crazy...

(I tried setting the precision to 16, but found that this only affects how the numbers are displayed.)
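
The point about precision can be illustrated with a small sketch. It assumes "precision" refers to the pandas display.precision option, and the values in the Series are hypothetical. The display option changes only the printed text, while Series.round changes the stored values.

```python
import pandas as pd

# Hypothetical values carrying more digits than a 9-digit display would show.
s = pd.Series([0.670277591123, 0.324265434456])

# display.precision only controls how floats are printed.
with pd.option_context("display.precision", 4):
    shown = repr(s)

# The stored value is untouched by the display option.
stored_unchanged = s.iloc[0] == 0.670277591123

# Rounding, by contrast, returns a Series with changed values.
rounded = s.round(9)

print(shown)
print(stored_unchanged)
print(rounded.iloc[0])
```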

You're adding all the results and then comparing them with one, without dividing them by 3 or comparing them with 3. Simply change data.at[data[data.p1 + data.p2 + data.p3 > 1.001].index, 'h1'] = 'dropped by pandas' to data.at[data[data.p1 + data.p2 + data.p3 > 3].index, 'h1'] = 'dropped by pandas' or data.at[data[(data.p1 + data.p2 + data.p3) / 3 > 1].index, 'h1'] = 'dropped by pandas'. Also, you don't need to compare them with 1.001. You can compare them with 1, because > means strictly greater than, whereas >= means greater than or equal to.
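
The difference between > and >= mentioned above can be seen on a small Series. This is only a sketch with made-up values, not data from the question.

```python
import pandas as pd

# Made-up sums just below, exactly at, and just above 1.
sums = pd.Series([0.999, 1.0, 1.001])

# Strict comparison: a sum of exactly 1 is not selected.
strictly_greater = sums > 1

# Non-strict comparison: a sum of exactly 1 is selected as well.
greater_or_equal = sums >= 1

print(strictly_greater.tolist())
print(greater_or_equal.tolist())
```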
