
Selecting the same value from multiple columns in a pandas DataFrame

I have the following data:

name    marks   team1   team2
John    30      kk      vv
Sera    56      gg      ww
Saara   66      dd      gg
Dirg    33      rr      dd
maget   34      ff      rr
fared   56      vv      ff
juile   32      ww      kk

I need a general formula to get the rows that have 'kk' in either team1 or team2. This is just sample data; my actual data has more than 100k rows.

Use boolean indexing with a mask: select every column whose name contains 'team' with filter, compare the values to 'kk' with eq (the method form of ==), and use any(axis=1) to flag the rows that have at least one True:

df = df[df.filter(like='team').eq('kk').any(axis=1)]
# Alternatively, select the columns explicitly by name:
# df = df[df[['team1', 'team2']].eq('kk').any(axis=1)]
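
To try this on the sample data, the frame can be rebuilt by hand. This is only a minimal sketch: the names mask and result are illustrative and do not come from the original answer.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "name": ["John", "Sera", "Saara", "Dirg", "maget", "fared", "juile"],
    "marks": [30, 56, 66, 33, 34, 56, 32],
    "team1": ["kk", "gg", "dd", "rr", "ff", "vv", "ww"],
    "team2": ["vv", "ww", "gg", "dd", "rr", "ff", "kk"],
})

# Boolean mask: True for rows where any column named like 'team' equals 'kk'
mask = df.filter(like="team").eq("kk").any(axis=1)

# Keep only the matching rows; the original index labels are preserved
result = df[mask]
print(result)
```

Keeping the mask in its own variable makes it easy to inspect or to combine with other conditions using & and |.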

On large frames it can be faster to compare the underlying NumPy array and reduce it with numpy.any:

import numpy as np
df = df[np.any(df.filter(like='team').values == 'kk', axis=1)]

print(df)
    name  marks team1 team2
0   John     30    kk    vv
6  juile     32    ww    kk
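
Since the question asks for a general formula, the NumPy variant could be wrapped in a small helper. This is a sketch only: the function name rows_with_value and its parameters are hypothetical, and it assumes at least one column name contains the given substring.

```python
import numpy as np
import pandas as pd

def rows_with_value(frame, value, like="team"):
    """Return the rows where any column whose name contains `like` equals `value`."""
    cols = frame.filter(like=like)
    # Compare on the underlying NumPy array, then reduce across the columns
    mask = np.any(cols.to_numpy() == value, axis=1)
    return frame[mask]

# Sample data from the question
df = pd.DataFrame({
    "name": ["John", "Sera", "Saara", "Dirg", "maget", "fared", "juile"],
    "marks": [30, 56, 66, 33, 34, 56, 32],
    "team1": ["kk", "gg", "dd", "rr", "ff", "vv", "ww"],
    "team2": ["vv", "ww", "gg", "dd", "rr", "ff", "kk"],
})

matches = rows_with_value(df, "kk")
print(matches)
```

The helper returns a new frame and leaves the input unchanged, so it can be called repeatedly with different values.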

Details:

print(df.filter(like='team').eq('kk'))
   team1  team2
0   True  False
1  False  False
2  False  False
3  False  False
4  False  False
5  False  False
6  False   True

print(df.filter(like='team').eq('kk').any(axis=1))
0     True
1    False
2    False
3    False
4    False
5    False
6     True
dtype: bool
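
If the requirement is instead that every team column holds the value (another possible reading of "team 1 and team 2"), any can be swapped for all. The sketch below uses a small made-up frame, not the data from the question, so that the difference is visible.

```python
import pandas as pd

# Hypothetical data chosen so that one row has 'kk' in both columns
demo = pd.DataFrame({
    "name": ["a", "b", "c"],
    "team1": ["kk", "kk", "gg"],
    "team2": ["vv", "kk", "kk"],
})

hits = demo.filter(like="team").eq("kk")

# any(axis=1): at least one team column equals 'kk'
either = demo[hits.any(axis=1)]

# all(axis=1): every team column equals 'kk'
both = demo[hits.all(axis=1)]

print(either.index.tolist())  # [0, 1, 2]
print(both.index.tolist())    # [1]
```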


 