Using conditional if/else logic with pandas dataframe columns

My dataframe, called pw2, looks something like this. It has two columns, pw1 and pw2, which are win probabilities. I'd like to perform some conditional logic to create another column called WINNER based on pw1 and pw2.

+-------------------------+-------------+-----------+-------------+
|          Name1          |     pw1     |   Name2   |     pw2     |
+-------------------------+-------------+-----------+-------------+
| Seaking                 | 0.517184213 | Lickitung | 0.189236181 |
| Ferrothorn              | 0.172510623 | Quagsire  | 0.260884258 |
| Thundurus Therian Forme | 0.772536272 | Hitmonlee | 0.694069408 |
| Flaaffy                 | 0.28681284  | NaN       | NaN         |
+-------------------------+-------------+-----------+-------------+

I want to do this conditionally in a function, but I'm having some trouble.

  • if pw1 > pw2, populate with Name1
  • if pw2 > pw1, populate with Name2
  • if pw1 is populated but pw2 isn't, populate with Name1
  • if pw2 is populated but pw1 isn't, populate with Name2

But my function isn't working: for some reason, checking whether a value is null doesn't work.

def final_winner(df):
    # If PW1 is missing and PW2 is populated, Pokemon 1 wins
    if df['pw1'] = None and df['pw2'] != None:
        return df['Number1']
    # If it's the same thing but the other way around, Pokemon 2 wins
    elif df['pw2'] = None and df['pw1'] != None:
        return df['Number2']
    # If pw2 is greater than pw1, then Pokemon 2 wins
    elif df['pw2'] > df['pw1']:
        return df['Number2']
    else
        return df['Number1']

pw2['Winner'] = pw2.apply(final_winner, axis=1)

Do not use apply here: with axis=1 it calls the Python function once per row, which is very slow. Use np.where instead:

import numpy as np

# Treat a missing pw2 as -inf so that any populated pw1 beats it
pw2 = df.pw2.fillna(-np.inf)
df['winner'] = np.where(df.pw1 > pw2, df.Name1, df.Name2)

Since NaNs always lose, you can simply fillna() pw2 with -np.inf to get the same logic. A missing pw1 needs no special handling: any comparison with NaN is False, so np.where falls through to Name2.
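As a minimal, self-contained sketch of this approach, the snippet below rebuilds the sample table from the question (values copied from it) and applies the fillna(-np.inf) and np.where steps. The name filled_pw2 is only an illustrative choice. One behaviour worth knowing, which the question does not specify: when the two probabilities are equal, or both are missing, this version falls through to Name2.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Name1": ["Seaking", "Ferrothorn", "Thundurus Therian Forme", "Flaaffy"],
    "pw1": [0.517184213, 0.172510623, 0.772536272, 0.28681284],
    "Name2": ["Lickitung", "Quagsire", "Hitmonlee", np.nan],
    "pw2": [0.189236181, 0.260884258, 0.694069408, np.nan],
})

# A missing pw2 becomes -inf, so any populated pw1 beats it.
filled_pw2 = df["pw2"].fillna(-np.inf)

# A missing pw1 makes the comparison False, so Name2 is chosen.
df["winner"] = np.where(df["pw1"] > filled_pw2, df["Name1"], df["Name2"])

print(df[["Name1", "Name2", "winner"]])
```

Flaaffy wins the last row because its missing opponent probability was replaced with -inf.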


Looking at your code, we can point out several problems. First, you wrote df['pw1'] = None, which is invalid Python syntax: = is assignment, and comparisons use the == operator. The final else is also missing its colon. For None specifically, the recommended test is the is operator, as in if variable is None: (...). However, you are in a pandas/NumPy environment, where there are actually several representations of a null value (None, NaN, NaT, etc.), and a missing float is stored as NaN, which is not None and does not compare equal to anything, itself included.

So it is preferable to check for null values using pd.isnull() or df.isnull().
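To illustrate why identity and equality checks fail on missing values, here is a small sketch comparing a few ways of testing for them. It uses only standard NumPy and pandas calls; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

value = np.nan  # how pandas represents a missing float

# NaN is not None, and NaN never compares equal to anything, even itself.
is_none = value is None          # False
equals_itself = value == value   # False

# pd.isnull recognises None, NaN and NaT alike.
scalar_checks = [pd.isnull(None), pd.isnull(np.nan), pd.isnull(pd.NaT)]

# On a Series, isnull returns an element-wise boolean mask.
mask = pd.Series([0.28681284, np.nan]).isnull()

print(is_none, equals_itself, scalar_checks, mask.tolist())
```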

Just to illustrate, this is what your code should look like:

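import pandas as pd

def final_winner(df):
    # With axis=1, df here is a single row (a Series)
    # Only pw2 is populated: Name2 wins
    if pd.isnull(df['pw1']) and not pd.isnull(df['pw2']):
        return df['Name2']
    # Only pw1 is populated: Name1 wins
    elif pd.isnull(df['pw2']) and not pd.isnull(df['pw1']):
        return df['Name1']
    # Otherwise Name2 wins only if pw2 is strictly greater
    elif df['pw2'] > df['pw1']:
        return df['Name2']
    else:
        return df['Name1']

df['winner'] = df.apply(final_winner, axis=1)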

But again, definitely use np.where.
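If you want the vectorized version to spell out the null rules explicitly rather than rely on a fill value, one possible sketch is shown below. It is a variant, not the code from either answer: it builds a boolean mask for "Name2 wins" and gives every other case, including ties, to Name1. The last row of the sample data is hypothetical, added only to exercise the case where pw1 is missing.

```python
import numpy as np
import pandas as pd

# Rows copied from the question, plus one hypothetical row
# (the last one) where only pw2 is populated.
df = pd.DataFrame({
    "Name1": ["Seaking", "Ferrothorn", "Flaaffy", "Hypothetical A"],
    "pw1": [0.517184213, 0.172510623, 0.28681284, np.nan],
    "Name2": ["Lickitung", "Quagsire", np.nan, "Hypothetical B"],
    "pw2": [0.189236181, 0.260884258, np.nan, 0.5],
})

# Name2 wins when only pw2 is populated, or when pw2 is strictly greater.
# Comparisons involving NaN evaluate to False, so the second test is
# only True when both probabilities are present.
only_pw2 = df["pw1"].isnull() & df["pw2"].notnull()
name2_wins = only_pw2 | (df["pw2"] > df["pw1"])

# Everything else (pw1 greater, a tie, or only pw1 populated) goes to Name1.
df["winner"] = np.where(name2_wins, df["Name2"], df["Name1"])

print(df[["Name1", "Name2", "winner"]])
```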
