Filter a dataframe by multiple conditions within each group
I have a dataframe as shown below:
name skill score percentage
messi attack 160 80
messi fitness 10 5
messi pass 30 15
neymar attack 48 60
neymar fitness 20 25
neymar pass 12 15
ronaldo attack 60 60
ronaldo fitness 30 30
ronaldo pass 10 10
casilas attack 10 25
casilas fitness 20 50
casilas pass 10 25
owen attack 20 20
owen fitness 70 70
owen pass 10 10
From the above dataframe, I want to keep every row for the names whose attack score is more than 50 and whose attack percentage is more than 50.
Expected output:
name skill score percentage
messi attack 160 80
messi fitness 10 5
messi pass 30 15
ronaldo attack 60 60
ronaldo fitness 30 30
ronaldo pass 10 10
You don't need groupby here; you can use a boolean mask:
mask = df['skill'].eq('attack') & df['score'].gt(50) & df['percentage'].gt(50)
out = df[df['name'].isin(df.loc[mask, 'name'])]
print(out)
name skill score percentage
0 messi attack 160 80
1 messi fitness 10 5
2 messi pass 30 15
6 ronaldo attack 60 60
7 ronaldo fitness 30 30
8 ronaldo pass 10 10
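If you would rather keep the per-name logic explicit, the same mask can be spread across each name with groupby and transform('any'). The following is a minimal sketch, assuming the question's table is rebuilt by hand into df; the names has_good_attack and keep are illustrative only and do not come from the original answer.

```python
import pandas as pd

# The question's data, rebuilt by hand (an assumption about how df was created)
df = pd.DataFrame({
    "name": ["messi"] * 3 + ["neymar"] * 3 + ["ronaldo"] * 3
            + ["casilas"] * 3 + ["owen"] * 3,
    "skill": ["attack", "fitness", "pass"] * 5,
    "score": [160, 10, 30, 48, 20, 12, 60, 30, 10, 10, 20, 10, 20, 70, 10],
    "percentage": [80, 5, 15, 60, 25, 15, 60, 30, 10, 25, 50, 25, 20, 70, 10],
})

# True only on attack rows that pass both thresholds
has_good_attack = (
    df["skill"].eq("attack") & df["score"].gt(50) & df["percentage"].gt(50)
)

# Broadcast the flag to every row that shares the same name
keep = has_good_attack.groupby(df["name"]).transform("any")

out = df[keep]
print(out)
```

This should select the same rows as the isin version; which one to use is mostly a matter of taste.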
One approach, using groupby with filter:
import io
import pandas as pd

str_data = """
name,skill,score,percentage
messi,attack,160,80
messi,fitness,10,5
messi,pass,30,15
neymar,attack,48,60
neymar,fitness,20,25
neymar,pass,12,15
ronaldo,attack,60,60
ronaldo,fitness,30,30
ronaldo,pass,10,10
casilas,attack,10,25
casilas,fitness,20,50
casilas,pass,10,25
"""
df = pd.read_csv(io.StringIO(str_data))

def filt_player(player_df):
    # Index by skill so the attack row can be looked up by label
    player_df = player_df.set_index('skill')
    filters = (
        player_df.loc['attack', 'score'] > 50,
        player_df.loc['attack', 'percentage'] > 50,
    )
    # Keep the player only if every condition holds
    return all(filters)

# filter() keeps all rows of each group for which filt_player returns True
filt_df = df.groupby('name').filter(filt_player)
print(filt_df)
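Note that looking up the 'attack' label with .loc assumes every player has exactly one attack row; a player without one would raise a KeyError. Below is a hedged variant that tolerates such players. The function name has_strong_attack and the player "keeper" (who has no attack row) are made up for illustration and are not part of the original data.

```python
import pandas as pd

# Two players from the question plus a hypothetical one with no attack row
df = pd.DataFrame({
    "name": ["messi", "messi", "neymar", "neymar", "keeper"],
    "skill": ["attack", "pass", "attack", "pass", "pass"],
    "score": [160, 30, 48, 12, 40],
    "percentage": [80, 15, 60, 15, 100],
})

def has_strong_attack(player_df):
    # Select the attack rows; this may be empty for some players
    attack = player_df[player_df["skill"] == "attack"]
    # .any() on an empty selection is False, so such players are dropped
    return bool(((attack["score"] > 50) & (attack["percentage"] > 50)).any())

filt_df = df.groupby("name").filter(has_strong_attack)
print(filt_df)
```

Here only the messi rows should remain: neymar fails the score threshold and the hypothetical keeper has no attack row at all.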