
Use the comparison result as the index of a pandas.DataFrame

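import pandas as pd
import numpy as np

df = pd.DataFrame([[1, 2, 3], [4, np.nan, 6]])
whereNans = np.isnan(df)   # boolean DataFrame, True only at row 1, column 1
print(whereNans)
print(df[whereNans])       # every cell comes out as NaN

print("--" * 30)

print(df > 3)              # boolean DataFrame, True where the value exceeds 3
print(df[df > 3])          # keeps 4.0 and 6.0, NaN everywhere else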

As shown above, whereNans is correct, but df[whereNans] does not give me what I want, whereas df[df>3] does.

whereNans is the same kind of boolean mask as df>3, so what is the problem?

This is the correct behaviour. Indexing a df with a boolean mask displays the original value wherever the mask is True and NaN wherever the mask is False.

Your mask is True at a single position, and that position already holds NaN, so it displays NaN. Every other position is False, so it displays NaN as well. In effect you get a df made up entirely of NaN values.
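To make the masking rule concrete, here is a minimal sketch that rebuilds the df from the question and compares df[mask] with df.where(mask), which applies the same keep-where-True rule explicitly. The names mask, masked, same and all_nan are only illustrative and do not appear in the question.

```python
import numpy as np
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame([[1, 2, 3], [4, np.nan, 6]])
mask = df.isnull()           # True only at row 1, column 1

masked = df[mask]            # boolean-DataFrame indexing
same = df.where(mask)        # keep values where mask is True, NaN elsewhere

# The single True cell holds NaN itself, so nothing non-NaN survives.
all_nan = masked.isnull().all().all()

print(masked.equals(same))
print(all_nan)
```

Both calls return a frame with the same shape as df, so no rows or columns are dropped; only the values are blanked out.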

If you compare with the df>3 version, you can observe the same behaviour:

In[49]:
df[df>3]

Out[49]: 
     0   1    2
0  NaN NaN  NaN
1  4.0 NaN  6.0

Also, just to show that this has nothing to do with numpy, using the pandas isnull method gives the same result:

In[50]:
df[df.isnull()]

Out[50]: 
    0   1   2
0 NaN NaN NaN
1 NaN NaN NaN
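The question does not say what output was expected. Assuming the goal was to find where the NaN values are rather than to display them, one possible approach is to reduce the mask to one boolean per row, or to ask numpy for the positions of the True cells. This is only a sketch under that assumption, and rows_with_nan and positions are illustrative names.

```python
import numpy as np
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame([[1, 2, 3], [4, np.nan, 6]])
mask = df.isnull()

# A 1-D boolean Series selects whole rows instead of masking cells.
rows_with_nan = df[mask.any(axis=1)]

# Integer (row, column) positions of every True cell in the mask.
positions = np.argwhere(mask.to_numpy())

print(rows_with_nan)
print(positions)
```

The difference from df[mask] is the shape of the indexer: a Series with one boolean per row filters rows, while a boolean frame of the same shape as df masks individual cells.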


 