
Instability of pandas dataframe calculations

I'm wondering whether anyone has seen this problem with Pandas before. Basically, I'm trying to add, multiply, and divide DataFrames element-by-element (all the frames have identical indexes and columns), but Pandas is spitting out different results for the same calculation performed successively.

An image of some example output is shown below. I've used .values in that code for display purposes, but the instability also happens when using .add(), .mul(), or .div(). For example, if I repeatedly enter N11.add(N00), I usually get the correct answer, but occasionally (every 4th or 5th time) I get a DataFrame filled with 0s.
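To make the "every 4th or 5th time" observation measurable, one option is to repeat the calculation and count how often the result differs from the first run. The sketch below is only an illustration: `count_mismatches` is a hypothetical helper, not part of pandas, and it uses a plain list computation as a stand-in for the real frames.

```python
def count_mismatches(compute, trials=20, equal=lambda a, b: a == b):
    """Run compute() `trials` times and count results that differ from the first.

    `compute` is any zero-argument callable; `equal` decides whether two
    results match. Both names are hypothetical, chosen for this sketch.
    """
    reference = compute()
    mismatches = 0
    for _ in range(trials - 1):
        if not equal(compute(), reference):
            mismatches += 1
    return mismatches


# Deterministic stand-in for the real calculation: always the same result.
print(count_mismatches(lambda: [x + y for x, y in zip([1, 2], [3, 4])]))
```

With the frames from the question, the call would look like `count_mismatches(lambda: N11.add(N00), equal=lambda a, b: a.equals(b))`. Any nonzero count means the same expression produced different results across runs (note that the first run is taken as the reference, so it may itself be the bad one).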

[Image: example output (image not available)]

If it matters, I'm on Windows 10 using an Anaconda distribution of Pandas 0.17.0 (with Python 2.7.10 on Spyder 2.3.7). The frames that I am working with are large (6856 by 12511). Has anyone else encountered this problem? Is this a known issue or am I doing something wrong?

I encountered a similar issue today and it was caused by a bug in numexpr 2.4.4 . It seems to be biting other pandas users in various ways, as reported in this pandas ticket and others linked to it.

Upgrading numexpr to 2.4.6 solved the problem for us, but it looks like any version that's not 2.4.4 should be fine!
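A quick way to check whether an environment has the affected release is to look at `numexpr.__version__`. The sketch below does that, and, as a stopgap that is my own assumption rather than something from the ticket, turns off numexpr acceleration through the `compute.use_numexpr` option found in recent pandas versions (older releases such as 0.17 may not have this option). `is_affected` is a hypothetical helper name.

```python
def is_affected(version):
    """Return True only for the numexpr release identified above as buggy."""
    return version.strip() == "2.4.4"


try:
    import numexpr
    installed = numexpr.__version__
except ImportError:
    installed = None  # numexpr absent, so pandas cannot be using it

if installed is None:
    print("numexpr is not installed")
elif is_affected(installed):
    print("numexpr 2.4.4 detected; upgrade it, e.g. `conda update numexpr`")
    try:
        import pandas as pd
        # Assumption: the option exists in this pandas version.
        pd.set_option("compute.use_numexpr", False)
    except Exception:
        print("could not disable numexpr in pandas; upgrade instead")
else:
    print("numexpr %s is not the affected release" % installed)
```

Upgrading remains the actual fix; disabling numexpr only sidesteps the accelerated code path and may slow down arithmetic on frames as large as the ones in the question.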


 