
Filter a DataFrame by group based on multiple conditions

I have a DataFrame as shown below:

name        skill          score      percentage
messi       attack         160        80
messi       fitness        10         5
messi       pass           30         15
neymar      attack         48         60
neymar      fitness        20         25
neymar      pass           12         15
ronaldo     attack         60         60
ronaldo     fitness        30         30
ronaldo     pass           10         10
casilas     attack         10         25
casilas     fitness        20         50
casilas     pass           10         25
owen        attack         20         20
owen        fitness        70         70
owen        pass           10         10

From the DataFrame above, I want to keep only the names whose attack score is more than 50 and whose attack percentage is more than 50, along with all of their rows.

Expected output:

name        skill          score      percentage
messi       attack         160        80
messi       fitness        10         5
messi       pass           30         15
ronaldo     attack         60         60
ronaldo     fitness        30         30
ronaldo     pass           10         10

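You don't need groupby here; a boolean mask is enough. Build a mask that selects the attack rows meeting both thresholds, then keep every row whose name appears among them: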

mask = df['skill'].eq('attack') & df['score'].gt(50) & df['percentage'].gt(50)
out = df[df['name'].isin(df.loc[mask, 'name'])]

print(out)

      name    skill  score  percentage
0    messi   attack    160          80
1    messi  fitness     10           5
2    messi     pass     30          15
6  ronaldo   attack     60          60
7  ronaldo  fitness     30          30
8  ronaldo     pass     10          10
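The snippet above assumes `df` already exists. A minimal self-contained sketch of the same mask approach, with the DataFrame built from the question's data, might look like the following. The intermediate name `qualifying_names` is introduced here only for readability and is not part of the original answer.

```python
import pandas as pd

# Data copied from the question (three skills per player).
df = pd.DataFrame({
    "name": ["messi"] * 3 + ["neymar"] * 3 + ["ronaldo"] * 3
            + ["casilas"] * 3 + ["owen"] * 3,
    "skill": ["attack", "fitness", "pass"] * 5,
    "score": [160, 10, 30, 48, 20, 12, 60, 30, 10, 10, 20, 10, 20, 70, 10],
    "percentage": [80, 5, 15, 60, 25, 15, 60, 30, 10, 25, 50, 25, 20, 70, 10],
})

# True only on attack rows that satisfy both thresholds.
mask = df["skill"].eq("attack") & df["score"].gt(50) & df["percentage"].gt(50)

# Names that own at least one such row (hypothetical helper variable).
qualifying_names = df.loc[mask, "name"].unique()

# Keep every row belonging to those names, not just the attack rows.
out = df[df["name"].isin(qualifying_names)]
print(out)
```

Because the mask is computed column-wise, no per-group Python function is called, which tends to matter on larger frames.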

Another approach, using groupby with filter:

import io

import pandas as pd
str_data="""
name,skill,score,percentage
messi,attack,160,80
messi,fitness,10,5
messi,pass,30,15
neymar,attack,48,60
neymar,fitness,20,25
neymar,pass,12,15
ronaldo,attack,60,60
ronaldo,fitness,30,30
ronaldo,pass,10,10
casilas,attack,10,25
casilas,fitness,20,50
casilas,pass,10,25
"""

df = pd.read_csv(io.StringIO(str_data))


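def filt_player(player_df):
    # Index this player's rows by skill so the attack row can be looked up by label.
    player_df = player_df.set_index('skill')

    # Both conditions must hold for the player's attack row.
    filters = (
        player_df.loc['attack','score'] > 50,
        player_df.loc['attack','percentage'] > 50,
    )

    return all(filters)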


filt_df = df.groupby('name').filter(filt_player)

filt_df
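One caveat with the `filter` version: `player_df.loc['attack', ...]` raises a `KeyError` for any player that has no attack row. A possible variant, offered only as a sketch and not part of the original answers, combines the row mask with `groupby(...).transform("any")`. The sample below is a reduced subset of the question's data, and `player_x` is a made-up player added purely to illustrate the missing-row case.

```python
import pandas as pd

# Subset of the question's data plus a hypothetical player with no attack row.
df = pd.DataFrame({
    "name": ["messi", "messi", "neymar", "neymar", "player_x"],
    "skill": ["attack", "pass", "attack", "pass", "pass"],
    "score": [160, 30, 48, 12, 90],
    "percentage": [80, 15, 60, 15, 100],
})

# True only on attack rows that pass both thresholds.
row_ok = df["skill"].eq("attack") & df["score"].gt(50) & df["percentage"].gt(50)

# Broadcast "some row in this group is True" back onto every row of the group.
group_ok = row_ok.groupby(df["name"]).transform("any")

# Players without an attack row simply get False instead of raising.
filt_df = df[group_ok]
print(filt_df)
```

Here a player lacking an attack row is dropped rather than causing an error, which may or may not be the behavior you want.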

