Using conditional if/else logic on pandas DataFrame columns

Question

My dataframe, called pw2, looks something like this. It has two columns, pw1 and pw2, which are win probabilities. I'd like to apply some conditional logic to create another column called WINNER based on pw1 and pw2.

+-------------------------+-------------+-----------+-------------+
|          Name1          |     pw1     |   Name2   |     pw2     |
+-------------------------+-------------+-----------+-------------+
| Seaking                 | 0.517184213 | Lickitung | 0.189236181 |
| Ferrothorn              | 0.172510623 | Quagsire  | 0.260884258 |
| Thundurus Therian Forme | 0.772536272 | Hitmonlee | 0.694069408 |
| Flaaffy                 | 0.28681284  | NaN       | NaN         |
+-------------------------+-------------+-----------+-------------+

I want to do this conditionally in a function, but I'm having some trouble.

if pw1 > pw2, populate with Name1
if pw2 > pw1, populate with Name2
if pw1 is populated but pw2 isn't, populate with Name1
if pw2 is populated but pw1 isn't, populate with Name2

But my function isn't working. For some reason, checking whether a value is null fails.

def final_winner(df):
    # If PW1 is missing and PW2 is populated, Pokemon 1 wins
    if df['pw1'] = None and df['pw2'] != None:
        return df['Number1']
    # If it's the same thing but the other way around, Pokemon 2 wins
    elif df['pw2'] = None and df['pw1'] != None:
        return df['Number2']
    # If pw2 is greater than pw1, then Pokemon 2 wins
    elif df['pw2'] > df['pw1']:
        return df['Number2']
    else
        return df['Number1']

pw2['Winner'] = pw2.apply(final_winner, axis=1)

Answer 1

Do not use apply, which is very slow. Use np.where instead.

import numpy as np

pw2 = df.pw2.fillna(-np.inf)  # a missing pw2 can never win
df['winner'] = np.where(df.pw1 > pw2, df.Name1, df.Name2)

Since NaNs always lose, you can simply fillna() pw2 with -np.inf to get the same logic. A missing pw1 needs no filling: any comparison with NaN is False, so Name2 is picked.
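To make the np.where approach concrete, here is a minimal sketch run against the four rows from the question's table. The variable name pw2_filled is only illustrative, and the frame is called df as in this answer rather than pw2 as in the question.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the table in the question
df = pd.DataFrame({
    "Name1": ["Seaking", "Ferrothorn", "Thundurus Therian Forme", "Flaaffy"],
    "pw1": [0.517184213, 0.172510623, 0.772536272, 0.28681284],
    "Name2": ["Lickitung", "Quagsire", "Hitmonlee", np.nan],
    "pw2": [0.189236181, 0.260884258, 0.694069408, np.nan],
})

# A missing pw2 becomes -inf, so any populated pw1 beats it
pw2_filled = df["pw2"].fillna(-np.inf)

# Where the condition is True take Name1, otherwise take Name2
df["winner"] = np.where(df["pw1"] > pw2_filled, df["Name1"], df["Name2"])

print(df[["Name1", "Name2", "winner"]])
```

Flaaffy wins the last row because its pw1 is compared against -inf instead of NaN.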

Looking at your code, we can point out several problems. First, you wrote df['pw1'] = None, which is invalid Python syntax for a comparison, because = is the assignment operator. You usually compare things with the == operator. For None, however, it is recommended to use is, as in if variable is None: (...). But you are in a pandas/numpy environment, where there are actually several representations of a null value (None, NaN, NaT, etc.).

So it is preferable to check for null values using pd.isnull() or df.isnull().
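As a small illustration of why equality checks are unreliable here, the sketch below compares pd.isnull() with == and is on the null representations mentioned above. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

missing_values = [None, np.nan, pd.NaT]

# pd.isnull recognizes every one of these representations of "missing"
detected = [pd.isnull(v) for v in missing_values]

# NaN is not equal to anything, including itself, so == cannot detect it
nan_equals_nan = np.nan == np.nan

# NaN is also a different object from None, so an "is None" test misses it
nan_is_none = np.nan is None

# On a Series, isnull() returns an element-wise boolean mask
mask = pd.Series([0.28681284, np.nan]).isnull()

print(detected, nan_equals_nan, nan_is_none, mask.tolist())
```

Because pandas typically stores a missing float as NaN, a test against None inside a row-wise function will generally not fire for such columns.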

Just to illustrate, this is how your code should look:

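def final_winner(df):
    # With axis=1, df here is a single row of the dataframe
    # Only pw2 is populated: Pokemon 2 wins
    if pd.isnull(df['pw1']) and not pd.isnull(df['pw2']):
        return df['Name2']
    # Only pw1 is populated: Pokemon 1 wins
    elif pd.isnull(df['pw2']) and not pd.isnull(df['pw1']):
        return df['Name1']
    # Both populated: the higher probability wins
    elif df['pw2'] > df['pw1']:
        return df['Name2']
    else:
        return df['Name1']

df['winner'] = df.apply(final_winner, axis=1)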

But again, definitely use np.where.
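As a sanity check, the following sketch runs a row-wise function and the np.where version side by side on a few hypothetical rows (a normal comparison, a missing pw2, and a missing pw1) and confirms they agree. The function is restated with a parameter named row so the block is self-contained, and the probabilities are made up for illustration.

```python
import numpy as np
import pandas as pd

def final_winner(row):
    # Only pw2 is populated: Name2 wins
    if pd.isnull(row["pw1"]) and not pd.isnull(row["pw2"]):
        return row["Name2"]
    # Only pw1 is populated: Name1 wins
    if pd.isnull(row["pw2"]) and not pd.isnull(row["pw1"]):
        return row["Name1"]
    # Both populated: the higher probability wins
    return row["Name2"] if row["pw2"] > row["pw1"] else row["Name1"]

# Hypothetical rows, not taken from the question's data
df = pd.DataFrame({
    "Name1": ["Seaking", "Flaaffy", "Ferrothorn"],
    "pw1": [0.5, 0.3, np.nan],
    "Name2": ["Lickitung", "Hitmonlee", "Quagsire"],
    "pw2": [0.2, np.nan, 0.26],
})

row_wise = df.apply(final_winner, axis=1)
vectorized = np.where(
    df["pw1"] > df["pw2"].fillna(-np.inf), df["Name1"], df["Name2"]
)

print((row_wise == vectorized).all())
```

The two versions do differ on cases the question's rules leave open: on an exact tie, or when both probabilities are missing, the row-wise function returns Name1 while the np.where expression returns Name2.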

Answer 1: 5, accepted, 2018-09-22 14:45:53
