
Select columns based on row values of another DataFrame

I have two DataFrames, dfA and dfB. I want to select the columns of dfA whose names appear in the keywords column of dfB. This is what dfA looks like:

index    abandoned     dismiss     yes      train    tram    go  
0          0.5           9.1       1.4       2.5      2.5    5.6
1          2.4           3.2       1.8       4.9      9.3    3.2
2          1.5           5.7       3.9       2.1      1.1    0.9

And this is what dfB looks like:

index   keywords
0       abandoned
1       wanted
2       goes
3       train
4       bold
5       go
6       images
7       links

I want dfC to look like this:

index   abandoned   train    go
0        0.5         2.5     5.6
1        2.4         4.9     3.2 
2        1.5         2.1     0.9

This was my attempt, but it gave me an empty DataFrame:

dfC= dfB[~dfB["keywords"].isin(dfA)]

Can anyone help me? Thank you.

Use DataFrame.loc and filter the column names with Index.isin:

dfC = dfA.loc[:, dfA.columns.isin(dfB['keywords'])]

Or select the common column names with Index.intersection:

dfC = dfA[dfA.columns.intersection(dfB['keywords'])]

print(dfC)
       abandoned  train   go
index                       
0            0.5    2.5  5.6
1            2.4    4.9  3.2
2            1.5    2.1  0.9
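
For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the two frames from the question and runs both variants. The construction of dfA and dfB and the names mask, common, dfC_isin and dfC_inter are my own assumptions for illustration; the question does not show how the frames were created. Note also that the original attempt indexes dfB, so it can only ever return rows of dfB, not columns of dfA.

```python
import pandas as pd

# Rebuild the sample data from the question (construction is assumed).
dfA = pd.DataFrame({
    "abandoned": [0.5, 2.4, 1.5],
    "dismiss":   [9.1, 3.2, 5.7],
    "yes":       [1.4, 1.8, 3.9],
    "train":     [2.5, 4.9, 2.1],
    "tram":      [2.5, 9.3, 1.1],
    "go":        [5.6, 3.2, 0.9],
})
dfB = pd.DataFrame({
    "keywords": ["abandoned", "wanted", "goes", "train",
                 "bold", "go", "images", "links"],
})

# Variant 1: boolean mask over dfA's column labels.
# True where the label appears in dfB["keywords"].
mask = dfA.columns.isin(dfB["keywords"])
dfC_isin = dfA.loc[:, mask]

# Variant 2: labels present in both dfA.columns and dfB["keywords"].
common = dfA.columns.intersection(dfB["keywords"])
dfC_inter = dfA[common]

print(dfC_isin)
print(dfC_inter)
```

Both variants match on exact label equality, so "goes" in dfB does not select the "go" column, and keywords that are missing from dfA (such as "wanted") are simply ignored rather than raising a KeyError.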
