
Filter a dataframe by group based on multiple conditions

I have a dataframe as shown below.

name        skill          score      percentage
messi       attack         160        80
messi       fitness        10         5
messi       pass           30         15
neymar      attack         48         60
neymar      fitness        20         25
neymar      pass           12         15
ronaldo     attack         60         60
ronaldo     fitness        30         30
ronaldo     pass           10         10
casilas     attack         10         25
casilas     fitness        20         50
casilas     pass           10         25
owen        attack         20         20
owen        fitness        70         70
owen        pass           10         10

From the above dataframe, I want to keep every row for the names whose attack score is more than 50 and whose attack percentage is more than 50.

Expected output:

name        skill          score      percentage
messi       attack         160        80
messi       fitness        10         5
messi       pass           30         15
ronaldo     attack         60         60
ronaldo     fitness        30         30
ronaldo     pass           10         10

You don't need groupby; you can use a boolean mask:

mask = df['skill'].eq('attack') & df['score'].gt(50) & df['percentage'].gt(50)
out = df[df['name'].isin(df.loc[mask, 'name'])]

print(out)

      name    skill  score  percentage
0    messi   attack    160          80
1    messi  fitness     10           5
2    messi     pass     30          15
6  ronaldo   attack     60          60
7  ronaldo  fitness     30          30
8  ronaldo     pass     10          10
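
To see why this works, it can help to look at the intermediate step. The sketch below rebuilds the question's data (assuming the same column names) and pulls out the names selected by the mask before the final `isin` filter; the variable `strong_names` is only an illustrative name.

```python
import pandas as pd

# Rebuild the dataframe from the question.
rows = [
    ("messi", "attack", 160, 80), ("messi", "fitness", 10, 5), ("messi", "pass", 30, 15),
    ("neymar", "attack", 48, 60), ("neymar", "fitness", 20, 25), ("neymar", "pass", 12, 15),
    ("ronaldo", "attack", 60, 60), ("ronaldo", "fitness", 30, 30), ("ronaldo", "pass", 10, 10),
    ("casilas", "attack", 10, 25), ("casilas", "fitness", 20, 50), ("casilas", "pass", 10, 25),
    ("owen", "attack", 20, 20), ("owen", "fitness", 70, 70), ("owen", "pass", 10, 10),
]
df = pd.DataFrame(rows, columns=["name", "skill", "score", "percentage"])

# Step 1: the mask is True only on attack rows that pass both thresholds.
mask = df["skill"].eq("attack") & df["score"].gt(50) & df["percentage"].gt(50)

# Step 2: collect the names that own such a row.
strong_names = df.loc[mask, "name"]

# Step 3: keep every row (all skills) belonging to those names.
out = df[df["name"].isin(strong_names)]
print(out)
```

The mask alone would return only the two matching attack rows; the `isin` step is what brings back the fitness and pass rows for the same players.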

One approach:

import io
import pandas as pd
str_data="""
name,skill,score,percentage
messi,attack,160,80
messi,fitness,10,5
messi,pass,30,15
neymar,attack,48,60
neymar,fitness,20,25
neymar,pass,12,15
ronaldo,attack,60,60
ronaldo,fitness,30,30
ronaldo,pass,10,10
casilas,attack,10,25
casilas,fitness,20,50
casilas,pass,10,25
"""

df = pd.read_csv(io.StringIO(str_data))


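def filt_player(player_df):
    # Index this player's rows by skill so the attack row can be looked up by label.
    player_df = player_df.set_index('skill')

    # Both conditions must hold on the player's attack row.
    filters = (
        player_df.loc['attack','score'] > 50,
        player_df.loc['attack','percentage'] > 50,
    )

    # groupby(...).filter keeps the whole group when this returns True.
    return all(filters)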


filt_df = df.groupby('name').filter(filt_player)

filt_df
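
One caveat with the label lookup above: `player_df.loc['attack', ...]` raises a `KeyError` if some player has no attack row. A minimal sketch of a variant that tolerates this is below. It uses a shortened copy of the data plus one hypothetical player (`player_x`, not in the original question) who has no attack row, and the helper name `has_strong_attack` is likewise only illustrative.

```python
import io
import pandas as pd

# Shortened sample; "player_x" is a made-up row used to show the missing-attack case.
csv_text = """name,skill,score,percentage
messi,attack,160,80
messi,fitness,10,5
neymar,attack,48,60
neymar,pass,12,15
ronaldo,attack,60,60
ronaldo,pass,10,10
player_x,fitness,40,100
"""
df = pd.read_csv(io.StringIO(csv_text))


def has_strong_attack(player_df):
    # Select the attack rows with a boolean comparison instead of a label lookup,
    # so a player without an attack row gives an empty frame rather than an error.
    attack = player_df[player_df["skill"] == "attack"]
    # .any() on an empty Series is False, so such players are dropped.
    return bool(((attack["score"] > 50) & (attack["percentage"] > 50)).any())


filt_df = df.groupby("name").filter(has_strong_attack)
print(filt_df)
```

For data where every player is guaranteed to have exactly one attack row, the original function behaves the same way.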
