Comparing 2 columns in data-frame Value Error

I have a data-frame df1 which looks like:

  ID  myColA  myColB myColC
0  A       1       5     13
1  B      -2       6     14
2  C       3      -7     15
3  D       4       8     16

I am trying to add a new column myColD that is filled with the result of the following:

myColD = 0 if ((myColA > 0 and myColB < 0) or (myColA < 0 and myColB > 0)) else myColA

That is, where the row value in myColA is above 0 and the value in myColB is below 0, or vice versa, return 0; otherwise return the myColA value.

So my desired output would be:

  ID  myColA  myColB myColC myColD
0  A       1       5     13      1
1  B      -2       6     14      0  
2  C       3      -7     15      0
3  D       4       8     16      4

Here is my code:

df1 = pd.DataFrame({'ID': ['A', 'B', 'C', 'D'],
    'myColA': [1, -2, 3, 4],
    'myColB': [5, 6, -7, 8],
    'myColC': [9, 10, 11, 12]},
     index=[0, 1, 2, 3])

df1['myColD'] = np.where(((df1.myColA > 0) & (df1.myColB < 0)) or ((df1.myColA < 0) & (df1.myColB > 0)), df1.myColA, 0)

However, I am getting a ValueError:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

You need bitwise operators for this. Python's or tries to reduce each Series to a single True/False value, which is what raises the error. So use the bitwise or, |, instead of or.
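
To make the difference concrete, here is a minimal sketch that uses only the two columns from the question. The names pos_neg, neg_pos, opposite_signs and or_raised are illustrative and do not appear in the original post.

```python
import pandas as pd

# Same two columns as in the question.
df1 = pd.DataFrame({'myColA': [1, -2, 3, 4], 'myColB': [5, 6, -7, 8]})

pos_neg = (df1.myColA > 0) & (df1.myColB < 0)
neg_pos = (df1.myColA < 0) & (df1.myColB > 0)

# `or` needs a single True/False from its left operand, so pandas raises.
try:
    pos_neg or neg_pos
    or_raised = False
except ValueError:
    or_raised = True

# `|` combines the two boolean Series element by element.
opposite_signs = pos_neg | neg_pos

print(or_raised)                # True
print(opposite_signs.tolist())  # [False, True, True, False]
```

The resulting boolean Series can then be passed as the condition to np.where or mask.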

Also note that you could simplify this: the two columns have opposite signs exactly where their product is negative, so you can check for a negative product and set the corresponding values to 0 with mask:

df1['myColD'] = df1.myColA.mask(df1.myColA.mul(df1.myColB).lt(0), 0)

print(df1)

   ID  myColA  myColB  myColC  myColD
0  A       1       5       9       1
1  B      -2       6      10       0
2  C       3      -7      11       0
3  D       4       8      12       4
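
As a quick sanity check, the sketch below compares the product test with the explicit sign conditions. The fifth row, containing a zero, is an added assumption that is not part of the question's data; it is there only to show that both forms treat zero the same way. The names explicit, by_product and masked are illustrative.

```python
import pandas as pd

# First four rows are from the question; the last row (0, -1) is an
# assumed extra case to show how a zero is handled.
df1 = pd.DataFrame({
    'myColA': [1, -2, 3, 4, 0],
    'myColB': [5, 6, -7, 8, -1],
})

# Explicit form: opposite signs in the two columns.
explicit = ((df1.myColA > 0) & (df1.myColB < 0)) | ((df1.myColA < 0) & (df1.myColB > 0))

# Product form: a negative product means the signs differ.
by_product = df1.myColA.mul(df1.myColB).lt(0)

# Both conditions agree on every row, including the zero row.
print((explicit == by_product).all())  # True

# mask replaces the values where the condition is True.
masked = df1.myColA.mask(by_product, 0)
print(masked.tolist())  # [1, 0, 0, 4, 0]
```

One design caveat: with very large integers the product could overflow, in which case the explicit comparison is the safer form.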

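Refer to the columns with bracket indexing and combine the two conditions with | instead of or. Note that np.where takes the value for matching rows first, so 0 has to come before df1['myColA'] to get the desired output:

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'ID': ['A', 'B', 'C', 'D'],
    'myColA': [1, -2, 3, 4],
    'myColB': [5, 6, -7, 8],
    'myColC': [9, 10, 11, 12]},
     index=[0, 1, 2, 3])

# np.where(condition, value_if_true, value_if_false):
# 0 where the two columns have opposite signs, myColA otherwise
df1['myColD'] = np.where(((df1['myColA'] > 0) & (df1['myColB'] < 0)) | ((df1['myColA'] < 0) & (df1['myColB'] > 0)), 0, df1['myColA'])
print(df1)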
