
Compare one row of a DataFrame with the rows of another DataFrame?

I have two DataFrames, df and thresh_df. The shape of df is, say, 1000*200 and the shape of thresh_df is 1*200.

I need to compare the single row of thresh_df element-wise with each row of df, and fetch the column numbers where the value in df is less than the corresponding value in thresh_df.

I tried the following:

compared_df = df.apply(lambda x : np.where(x < thresh_df.values))

But I get an empty DataFrame! If the question is unclear or needs any explanation, please let me know in the comments.

I think apply is not necessary here. Convert the one-row DataFrame to a Series by selecting its first row, then compare it with df directly:

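import pandas as pd

# Small sample standing in for the 1000*200 df
df = pd.DataFrame({
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 0],
    'E': [5, 3, 6, 9, 2, 4],
})

# One-row DataFrame holding one threshold per column
thresh_df = pd.DataFrame({
    'B': [4],
    'C': [7],
    'D': [4],
    'E': [5],
})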

compared_df  = df < thresh_df.iloc[0]
print (compared_df)
       B      C      D      E
0  False  False   True  False
1  False  False   True   True
2  False  False  False  False
3  False   True  False  False
4  False   True   True   True
5  False   True   True   True
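
Since the question asks for column numbers, the boolean mask above can be turned into per-row column positions. The following is only a minimal sketch on the same sample data; the names col_positions and col_labels are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 0],
    'E': [5, 3, 6, 9, 2, 4],
})
thresh_df = pd.DataFrame({'B': [4], 'C': [7], 'D': [4], 'E': [5]})

compared_df = df < thresh_df.iloc[0]

# Work on the underlying boolean array, one row at a time.
mask = compared_df.to_numpy()

# Integer positions of the columns that are below the threshold (hypothetical name).
col_positions = [np.flatnonzero(row).tolist() for row in mask]

# The same information expressed as column labels (hypothetical name).
col_labels = [compared_df.columns[row].tolist() for row in mask]

print(col_positions)
print(col_labels)
```

Rows with no value below the threshold (row 2 in the sample) simply get an empty list.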

Then use DataFrame.any with axis=1 to test whether each row has at least one True, and use the result to filter the index values:

idx = df.index[compared_df.any(axis=1)]
print (idx)
Int64Index([0, 1, 3, 4, 5], dtype='int64')
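
If the matching rows themselves are wanted rather than only their index labels, the same mask can be used for selection. This is a small sketch on the sample data; row_mask and below_rows are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 0],
    'E': [5, 3, 6, 9, 2, 4],
})
thresh_df = pd.DataFrame({'B': [4], 'C': [7], 'D': [4], 'E': [5]})

compared_df = df < thresh_df.iloc[0]

# True for rows that have at least one value below its column threshold.
row_mask = compared_df.any(axis=1)
idx = df.index[row_mask]

# Select the rows by label; df[row_mask] gives the same result.
below_rows = df.loc[idx]
print(below_rows)
```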

Details:

print (compared_df.any(axis=1))
0     True
1     True
2    False
3     True
4     True
5     True
dtype: bool
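
For a larger frame such as the 1000*200 one in the question, a flat list of (row, column) hits may be easier to work with than one list per row. The sketch below uses np.nonzero on the boolean array, which is one possible approach and not something the original answer proposes; the names rows, cols and pairs are assumptions for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 0],
    'E': [5, 3, 6, 9, 2, 4],
})
thresh_df = pd.DataFrame({'B': [4], 'C': [7], 'D': [4], 'E': [5]})

compared_df = df < thresh_df.iloc[0]

# np.nonzero returns the row positions and column positions of every True cell.
rows, cols = np.nonzero(compared_df.to_numpy())

# Long-form table: one line per value that is below its column threshold.
pairs = pd.DataFrame({
    'row': df.index[rows],
    'column_number': cols,
    'column': df.columns[cols],
})
print(pairs)
```

Each line of pairs names one cell of df that is below its threshold, so grouping by 'row' recovers the column numbers for any given row.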
