Applying multiple conditions to a dataframe column

Question

I'm working with a pandas dataframe of team results:

  Team Home/Away  Home_Score  Away_Score
0  ABC      Home           2           3
1  ABC      Home           1           2
2  ABC      Away           1           3
3  ABC      Away           0           1

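I want to create a new column called "Result" that returns Win, Loss, or Draw based on the scores above and on whether the team in question played at home or away. I'm trying to use the where() function from numpy inside a function, but the numpy part is not being applied; only the first part, which checks whether the team is Home or Away, takes effect. Here are my function and lambda statement: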


def result(x):    
    for score in df['Home/Away']:
        
        #Home Wins
        if x == 'Home' and np.where(df['Home_Score'] > df['Away_Score']):
            return 'Win'
        
        #Home Losses
        elif x == 'Home' and np.where(df['Home_Score'] < df['Away_Score']):
            return 'Loss'

        #Away Wins
        elif x == 'Away' and np.where(df['Home_Score'] < df['Away_Score']):
            return 'Win'
        
        #Away Losses
        elif x == 'Away' and np.where(df['Home_Score'] > df['Away_Score']):
            return 'Loss'
        
        #Draws
        elif np.where(df['Home_Score'] == df['Away_Score']):
            return 'Draw'
        
df['Result'] = df.apply(lambda x: result(x['Home/Away']), axis=1)

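I'm not sure how to make it also read the Home_Score and Away_Score columns and apply the np.where function. I thought adding them to the if statements would be enough, but it doesn't work. For example, the code above returns Win, Win, Win, Win when my expected output is Loss, Loss, Win, Win. Any help would be appreciated.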

Answer 1

Personally, I would use np.select(), which gives you more control and readability:

condition_list = [
    (df['Home/Away'] == 'Home') & (df['Home_Score'] > df['Away_Score']),
    (df['Home/Away'] == 'Home') & (df['Home_Score'] < df['Away_Score']),
    (df['Home/Away'] == 'Away') & (df['Home_Score'] < df['Away_Score']),
    (df['Home/Away'] == 'Away') & (df['Home_Score'] > df['Away_Score']),
]

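# One label per condition, in the same order as condition_list
choice_list = [
    'Win',
    'Loss',
    'Win',
    'Loss'
]

# Rows matching none of the conditions (equal scores) fall back to 'Draw'
df['Result'] = np.select(condition_list, choice_list, 'Draw')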
df
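
As a side note on why the function in the question returns Win for every row: np.where() called with only a condition returns a tuple of index arrays, and a non-empty tuple is always truthy, so each `x == ... and np.where(...)` test depends only on the Home/Away check. The sketch below illustrates this on the question's data and shows one possible row-wise rewrite; the name row_result is hypothetical and not part of the original post.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Team": ["ABC"] * 4,
    "Home/Away": ["Home", "Home", "Away", "Away"],
    "Home_Score": [2, 1, 1, 0],
    "Away_Score": [3, 2, 3, 1],
})

# With a single argument, np.where returns a tuple of index arrays.
where_result = np.where(df["Home_Score"] > df["Away_Score"])
is_tuple = isinstance(where_result, tuple)
# The tuple is non-empty (it holds one array), so it is truthy
# even though no row here has Home_Score > Away_Score.
always_truthy = bool(where_result)


def row_result(row):
    """Hypothetical row-wise rewrite: compare the scores of this row only."""
    home, away = row["Home_Score"], row["Away_Score"]
    if home == away:
        return "Draw"
    home_won = home > away
    is_home = row["Home/Away"] == "Home"
    # The team wins when "home side won" and "team played at home" agree.
    return "Win" if home_won == is_home else "Loss"


# Pass the whole row to the function, not just the Home/Away value.
row_results = df.apply(row_result, axis=1).tolist()
print(is_tuple, always_truthy, row_results)
```

The row-wise version is mainly useful for seeing the logic; on large frames the vectorized np.select() approach is likely to be faster.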

Answer 2

A more direct approach might be:

Compute the sign of the score difference
Multiply by -1 if the team played Away
Map -1 -> Loss, 0 -> Draw, 1 -> Win

df['Result'] = (
 np.sign(df['Home_Score'].sub(df['Away_Score']))
   .mul(df['Home/Away'].map({'Home': 1, 'Away': -1}))
   .map({1: 'Win', 0: 'Draw', -1: 'Loss'})
 )

Alternatively, if you want to use numpy.select, you can reduce the logic to 2 conditions:

If the scores are equal -> Draw
If the boolean Home_Score > Away_Score equals the boolean Home/Away == Home -> Win
- Home and Home_Score > Away_Score -> Win
- not Home and not Home_Score > Away_Score -> also Win
Otherwise -> Loss

c1 = df['Home_Score'].eq(df['Away_Score'])
c2 = df['Home/Away'].eq('Home')
c3 = df['Home_Score'].gt(df['Away_Score'])
df['Result'] = np.select([c1, c2==c3], ['Draw', 'Win'], 'Loss')

Output:

  Team Home/Away  Home_Score  Away_Score Result
0  ABC      Home           2           3   Loss
1  ABC      Home           1           2   Loss
2  ABC      Away           1           3    Win
3  ABC      Away           0           1    Win

Another example showing all the possibilities:

  Team Home/Away  Home_Score  Away_Score Result
0  ABC      Home           2           3   Loss
1  ABC      Home           5           2    Win
2  ABC      Away           1           3    Win
3  ABC      Away           2           1   Loss
4  ABC      Home           2           2   Draw
5  ABC      Away           1           1   Draw
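
For anyone who wants to reproduce the six-row table above, here is a self-contained sketch that builds that frame and runs both approaches from this answer. The variable names sign_result and select_result are only illustrative.

```python
import numpy as np
import pandas as pd

# The six-row example covering every Home/Away and Win/Loss/Draw combination
df = pd.DataFrame({
    "Team": ["ABC"] * 6,
    "Home/Away": ["Home", "Home", "Away", "Away", "Home", "Away"],
    "Home_Score": [2, 5, 1, 2, 2, 1],
    "Away_Score": [3, 2, 3, 1, 2, 1],
})

# Approach 1: sign of the score difference, flipped for away games
sign_result = (
    np.sign(df["Home_Score"].sub(df["Away_Score"]))
    .mul(df["Home/Away"].map({"Home": 1, "Away": -1}))
    .map({1: "Win", 0: "Draw", -1: "Loss"})
)

# Approach 2: np.select with two conditions, checked in order
c1 = df["Home_Score"].eq(df["Away_Score"])   # equal scores -> Draw
c2 = df["Home/Away"].eq("Home")              # team played at home
c3 = df["Home_Score"].gt(df["Away_Score"])   # home side scored more
select_result = np.select([c1, c2 == c3], ["Draw", "Win"], "Loss")

df["Result"] = sign_result
print(df)
```

Both approaches should agree on every row. Note that np.select() picks the first matching condition, so the Draw check has to come before the `c2 == c3` check.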
