
Use the comparison result as the index of pandas.DataFrame

import pandas as pd
import numpy as np

df = pd.DataFrame([[1, 2, 3], [4, np.nan, 6]])
whereNans = np.isnan(df)  # boolean mask of the NaN positions
print(whereNans)
print(df[whereNans])

print("--" * 30)

print(df > 3)  # boolean mask of the values greater than 3
print(df[df > 3])

As above, whereNans is correct, but df[whereNans] doesn't give me what I want. However, df[df>3] does.

Actually, the index stored in whereNans is of the same kind as df>3. What is the problem?

You seem to be confused by this, but it is correct behaviour: where the mask is True the result shows the value at that position, and where it is False it shows NaN. In effect you are going to display a df containing all NaNs.

Your single NaN value is the only position where the mask is True, so that position returns the value stored there, which is NaN. Everywhere else the mask is False, so you just get NaN as well.
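The masking rule described above can be checked directly. The following is a minimal sketch, assuming a reasonably recent pandas; the names mask, masked and same_as_where are illustrative only. It relies on the fact that indexing a DataFrame with a boolean DataFrame behaves like DataFrame.where.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, np.nan, 6]])
mask = df.isnull()  # True only at row 1, column 1

# Indexing with a boolean DataFrame keeps the values where the mask is True
# and puts NaN in every other cell; DataFrame.where applies the same rule.
masked = df[mask]
same_as_where = df.where(mask)

# equals() treats NaNs in matching positions as equal.
print(masked.equals(same_as_where))

# The only True cell in the mask holds NaN itself, so no value survives.
print(masked.isnull().all().all())
```

With the df > 3 mask the same rule applies, but the True cells hold real numbers (4 and 6), so those remain visible.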

If you compare with the df>3 version, you observe the same behaviour:

In[49]:
df[df>3]

Out[49]: 
     0   1    2
0  NaN NaN  NaN
1  4.0 NaN  6.0

Also, just to show that this has nothing to do with numpy, using pandas isnull gives the same result:

In[50]:
df[df.isnull()]

Out[50]: 
    0   1   2
0 NaN NaN NaN
1 NaN NaN NaN
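The question does not say what output was expected. If the goal was to see where the NaNs are rather than to mask the values (an assumption), the boolean mask can be reduced to rows or to label pairs instead of being used as a cell-wise index. This is a sketch only; rows_with_nan and nan_positions are hypothetical names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, np.nan, 6]])
mask = df.isnull()

# A 1-D boolean Series selects whole rows, so this keeps every row
# that contains at least one NaN.
rows_with_nan = df[mask.any(axis=1)]

# Positional (row, column) pairs of the True cells, mapped back to labels.
nan_positions = [
    (df.index[r], df.columns[c])
    for r, c in np.argwhere(mask.to_numpy())
]

print(rows_with_nan)
print(nan_positions)
```

The difference from df[mask] is that a one-dimensional mask filters rows, while a two-dimensional mask keeps the shape and blanks out cells.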
