Pandas: create a boolean column based on whether 3 column values are all negative or all positive
I have the following data frame:
f1 f2 f3 f4 f5 f6 f7
0 -0.004446 59.763107 x0 0 60.029999 59.160000 -0.014493
1 -0.003414 113.212220 x1 0 113.599998 113.650002 0.000440
2 -0.013123 36.435513 x2 1 36.919998 36.889999 -0.000813
3 0.003558 68.854090 x3 0 68.420158 68.410179 -0.000146
4 -0.006840 23.021446 x4 0 23.180000 23.100000 -0.003451
... ... ... ... ... ... ... ...
145 0.000724 253.113110 x5 1 252.929993 247.210007 -0.022615
146 0.006567 128.236680 x6 0 127.400002 127.059998 -0.002669
147 -0.009016 610.079200 x7 1 615.630005 605.369995 -0.016666
148 -0.011290 165.173920 x8 0 167.059998 158.300003 -0.052436
149 0.021474 358.496370 x9 0 350.959991 343.329987 -0.021740
Basically, for column f4, treat 0 as negative (False) and 1 as positive (True).
If the values from columns f1, f4 and f7 are all negative or all positive, the test column should be True for that row; otherwise it should be False.
I want to create a new column called 'test' that says True or False based on these conditions. If any one of the three does not match the others (so they are neither all True nor all False), it should show False.
I can make the following code work with 2 columns,
df.loc[:,'test'] = df['f1'].ge(0).eq(df['f4'])
and it works fine.
However, if I try to chain it to add the f7 column like this,
df.loc[:,'test'] = df['f1'].ge(0).eq(df['f4']).eq(df['f7'].ge(0))
the results are wrong.
I want the test column to look like this,
f1 f2 f3 f4 f5 f6 f7 test
0 -0.004446 59.763107 x0 0 60.029999 59.160000 -0.014493 True
1 -0.003414 113.212220 x1 0 113.599998 113.650002 0.000440 False
2 -0.013123 36.435513 x2 1 36.919998 36.889999 -0.000813 False
3 0.003558 68.854090 x3 0 68.420158 68.410179 -0.000146 False
4 -0.006840 23.021446 x4 0 23.180000 23.100000 -0.003451 True
... ... ... ... ... ... ... ...
145 0.000724 253.113110 x5 1 252.929993 247.210007 -0.022615 False
146 0.006567 128.236680 x6 0 127.400002 127.059998 -0.002669 False
147 -0.009016 610.079200 x7 1 615.630005 605.369995 -0.016666 False
148 -0.011290 165.173920 x8 0 167.059998 158.300003 -0.052436 True
149 0.021474 358.496370 x9 0 350.959991 343.329987 -0.021740 False
How do I get the code to work the way I want it to?
Maybe because the chain is evaluated left to right: if f1 and f4 are both negative, the first comparison returns True, so comparing that True with the False from df['f7'].ge(0) returns False, even though all three values agree.
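Changing the code to
df['f1'].ge(0).eq(df['f4']) & df['f7'].ge(0).eq(df['f4'])
might work. The two comparisons against f4 are combined with & rather than a third .eq, because .eq would also return True when f1 and f7 agree with each other but both disagree with f4.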
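The difference between the chained comparison and a pairwise check can be seen on a tiny example. This is only a sketch: the Series a, b and c are hypothetical stand-ins for df['f1'].ge(0), df['f4'] (as booleans) and df['f7'].ge(0), and the values are made up to cover the interesting cases.

```python
import pandas as pd

# Hypothetical stand-ins: a = (f1 >= 0), b = f4 as bool, c = (f7 >= 0)
a = pd.Series([False, True, False, True])
b = pd.Series([False, True, False, False])
c = pd.Series([False, True, True, True])

# Chained .eq evaluates left to right: (a == b) == c
chained = a.eq(b).eq(c)

# Pairwise check: True only when a, b and c all agree
pairwise = a.eq(b) & b.eq(c)

print(chained.tolist())   # [False, True, True, False]
print(pairwise.tolist())  # [True, True, False, False]
```

The first row (all False) is wrongly rejected by the chained version, and the third row (False, False, True) is wrongly accepted.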
Also, I think this can be the general answer for an "all equal" test (a negated XOR) on n boolean values:
AND(all values) == OR(all values)
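In pandas, one way to express the AND == OR idea row by row is to compare DataFrame.all with DataFrame.any along axis=1. The sketch below assumes a small made-up frame: the first four rows reuse the f1, f4 and f7 values of rows 0 to 3 from the question, the fifth row is invented to show an all-positive case, and the name signs is hypothetical.

```python
import pandas as pd

# Small illustrative frame; the last row is invented for the all-positive case
df = pd.DataFrame({
    "f1": [-0.004446, -0.003414, -0.013123, 0.003558, 0.5],
    "f4": [0, 0, 1, 0, 1],
    "f7": [-0.014493, 0.000440, -0.000813, -0.000146, 0.2],
})

# One boolean column per input: True means "positive"
signs = pd.concat(
    [df["f1"].ge(0), df["f4"].astype(bool), df["f7"].ge(0)],
    axis=1,
)

# AND == OR holds exactly when every value in the row is the same
df["test"] = signs.all(axis=1) == signs.any(axis=1)

print(df["test"].tolist())  # [True, False, False, False, True]
```

This scales to any number of columns, since only the list passed to pd.concat changes.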
Hopefully this helped :)