Filter a dataframe based on multiple groupby conditions
I have a dataframe as shown below:
name skill score percentage
messi attack 160 80
messi fitness 10 5
messi pass 30 15
neymar attack 48 60
neymar fitness 20 25
neymar pass 12 15
ronaldo attack 60 60
ronaldo fitness 30 30
ronaldo pass 10 10
casilas attack 10 25
casilas fitness 20 50
casilas pass 10 25
owen attack 20 20
owen fitness 70 70
owen pass 10 10
From the dataframe above, I want to keep every name whose attack score is more than 50 and whose attack percentage is more than 50.
Expected output:
name skill score percentage
messi attack 160 80
messi fitness 10 5
messi pass 30 15
ronaldo attack 60 60
ronaldo fitness 30 30
ronaldo pass 10 10
You don't need groupby; you can use a boolean mask:
mask = df['skill'].eq('attack') & df['score'].gt(50) & df['percentage'].gt(50)
out = df[df['name'].isin(df.loc[mask, 'name'])]
print(out)
name skill score percentage
0 messi attack 160 80
1 messi fitness 10 5
2 messi pass 30 15
6 ronaldo attack 60 60
7 ronaldo fitness 30 30
8 ronaldo pass 10 10
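For readers who want to run the boolean-mask approach above end to end, here is a minimal self-contained sketch. The helper name `keep_players_by_attack` and its `min_score` / `min_percentage` parameters are illustrative assumptions, not part of the original answer; the data is the sample table from the question.

```python
import pandas as pd

# Sample data copied from the question.
rows = [
    ("messi", "attack", 160, 80), ("messi", "fitness", 10, 5), ("messi", "pass", 30, 15),
    ("neymar", "attack", 48, 60), ("neymar", "fitness", 20, 25), ("neymar", "pass", 12, 15),
    ("ronaldo", "attack", 60, 60), ("ronaldo", "fitness", 30, 30), ("ronaldo", "pass", 10, 10),
    ("casilas", "attack", 10, 25), ("casilas", "fitness", 20, 50), ("casilas", "pass", 10, 25),
    ("owen", "attack", 20, 20), ("owen", "fitness", 70, 70), ("owen", "pass", 10, 10),
]
df = pd.DataFrame(rows, columns=["name", "skill", "score", "percentage"])


def keep_players_by_attack(frame, min_score=50, min_percentage=50):
    """Hypothetical helper: keep all rows of players whose attack row passes both thresholds."""
    # Rows that are attack rows AND exceed both thresholds.
    mask = (
        frame["skill"].eq("attack")
        & frame["score"].gt(min_score)
        & frame["percentage"].gt(min_percentage)
    )
    # Names of the players that own at least one such row.
    qualifying = frame.loc[mask, "name"]
    # Keep every row (all skills) belonging to those players.
    return frame[frame["name"].isin(qualifying)]


out = keep_players_by_attack(df)
print(out)
```

Because the mask is computed on whole columns, this avoids calling a Python function once per group, which is usually the slower path on large frames.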
One approach:
import io
import pandas as pd

str_data = """
name,skill,score,percentage
messi,attack,160,80
messi,fitness,10,5
messi,pass,30,15
neymar,attack,48,60
neymar,fitness,20,25
neymar,pass,12,15
ronaldo,attack,60,60
ronaldo,fitness,30,30
ronaldo,pass,10,10
casilas,attack,10,25
casilas,fitness,20,50
casilas,pass,10,25
"""
df = pd.read_csv(io.StringIO(str_data))

def filt_player(player_df):
    # Index by skill so the attack row can be looked up by label.
    player_df = player_df.set_index('skill')
    filters = (
        player_df.loc['attack', 'score'] > 50,
        player_df.loc['attack', 'percentage'] > 50,
    )
    # Keep the player only if every condition holds.
    return all(filters)

# filter() keeps all rows of each group for which filt_player returns True.
filt_df = df.groupby('name').filter(filt_player)
print(filt_df)
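One caveat with the groupby filter above: `player_df.loc['attack', 'score']` raises a `KeyError` for any player that has no `attack` row. The sketch below is one possible defensive variant, not the original answer's code. The function name `attack_passes` and the small `demo` frame (in which owen has no attack row) are hypothetical and exist only to illustrate the edge case.

```python
import pandas as pd


def attack_passes(player_df, min_score=50, min_percentage=50):
    """Hypothetical filter: True only if the player has attack rows and all pass both thresholds."""
    # Select the attack rows with a boolean mask, which may be empty.
    attack = player_df[player_df["skill"] == "attack"]
    if attack.empty:
        # Assumption: a player without an attack row is dropped.
        return False
    return bool(
        (attack["score"] > min_score).all()
        and (attack["percentage"] > min_percentage).all()
    )


# Hypothetical demo data: owen has no attack row at all.
demo = pd.DataFrame({
    "name": ["messi", "messi", "neymar", "owen"],
    "skill": ["attack", "pass", "attack", "pass"],
    "score": [160, 30, 48, 10],
    "percentage": [80, 15, 60, 10],
})

safe = demo.groupby("name").filter(attack_passes)
print(safe)
```

Whether a player without an attack row should be dropped or kept depends on the use case; returning `False` here is only one reasonable choice.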