Pandas - Compare positive/negative values
I have a dataframe "df":
x y
0 1 -1
1 -2 -3
2 3 4
3 4 5
4 9 6
I am trying to determine what percentage of x and y values agree in sign, i.e. are both positive or both negative. So if x is positive and y is positive, that would be a correct answer. If x and y are both negative, that would also be correct. If the signs of x and y differ, then it is wrong. Is there a fast way to do this? Ultimately I just want to know what percentage of all rows have a correct answer.
(PS: there are 1M+ rows in the actual dataframe)
Thank you
If we check whether the product x*y >= 0, this should give us the "good" rows:
In [19]: df['x'].mul(df['y']).ge(0)
Out[19]:
0 False
1 True
2 True
3 True
4 True
dtype: bool
In [20]: df.loc[df['x'].mul(df['y']).ge(0)]
Out[20]:
x y
1 -2 -3
2 3 4
3 4 5
4 9 6
In [21]: len(df.loc[df['x'].mul(df['y']).ge(0)])/len(df)
Out[21]: 0.8
Or, as proposed by @NickilMaveli, a faster and more "Pandaic" version:
In [23]: df['x'].mul(df['y']).ge(0).mean()
Out[23]: 0.80000000000000004
The same idea, but this time using the df.eval() method:
In [27]: df.eval('x * y >= 0').mean()
Out[27]: 0.80000000000000004
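One caveat with the product test: a row where either value is exactly 0 always passes x*y >= 0, whatever the other value is. If zeros can occur and should not count as agreement with a nonzero value, comparing signs directly is one alternative. The following is a minimal sketch, not part of the original answer; the helper names product_agreement and sign_agreement are made up for illustration, and it assumes a zero should only match another zero.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({"x": [1, -2, 3, 4, 9], "y": [-1, -3, 4, 5, 6]})


def product_agreement(frame):
    """Share of rows where x*y >= 0 (the approach shown above)."""
    return frame["x"].mul(frame["y"]).ge(0).mean()


def sign_agreement(frame):
    """Share of rows where x and y have the same sign.

    np.sign returns -1, 0 or 1, so under this (assumed) definition
    a zero only agrees with another zero.
    """
    return (np.sign(frame["x"]) == np.sign(frame["y"])).mean()


# A small hypothetical frame containing a zero, to show where the two differ.
with_zero = pd.DataFrame({"x": [0, 1], "y": [-5, 2]})

print(product_agreement(df), sign_agreement(df))
print(product_agreement(with_zero), sign_agreement(with_zero))
```

On the sample frame both helpers give 0.8. On the frame with a zero, the product test counts the row (0, -5) as agreeing while the sign comparison does not, so which one is right depends on how zeros should be treated in your data.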