Selecting common elements in dataframes Python
I have 3 DataFrames, and I want to find all cells with a common direction (i.e. always positive or always negative across the DataFrames):
import numpy as np
import pandas as pd

test = pd.DataFrame([[0, 1, 0, 3],
                     [-1, 3, 0, 2],
                     [2, 1.5, -3, 1]],
                    columns=['a', 'b', 'c', 'd'])
test2 = pd.DataFrame([[1, 1, 0, 2],
                      [1, -3, 0, 1],
                      [2, 1.5, -2, 1]],
                     columns=['a', 'b', 'c', 'd'])
test3 = pd.DataFrame([[1, 2, 0, 2],
                      [1, -2, 0, 1],
                      [2, 1.5, -2, 1]],
                     columns=['a', 'b', 'c', 'd'])
The outcome should be the 3 DataFrames, with NA in every cell whose sign is not consistent across them. For instance, for test it would be:
expected = pd.DataFrame([[np.nan, 1, np.nan, 3],
                         [np.nan, np.nan, np.nan, 2],
                         [2, 1.5, -3, 1]],
                        columns=['a', 'b', 'c', 'd'])
Note that 0 counts as neither positive nor negative (i.e. it leads to NA). I can do this cell by cell, but I'm wondering whether it is possible to do it on the entire DataFrames at once?
I tried ((test>0)&(test2>0)&(test3>0)), and this works for the positives, but I cannot merge it with the negatives.
Thanks so much in advance.
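One way to finish the boolean attempt from the question, as a minimal sketch: build the all-negative mask the same way as the all-positive one and combine the two with `|`. The names `all_pos`, `all_neg`, `consistent` and `result` are illustrative only and do not appear in the original post.

```python
import pandas as pd

cols = ["a", "b", "c", "d"]
test = pd.DataFrame([[0, 1, 0, 3], [-1, 3, 0, 2], [2, 1.5, -3, 1]], columns=cols)
test2 = pd.DataFrame([[1, 1, 0, 2], [1, -3, 0, 1], [2, 1.5, -2, 1]], columns=cols)
test3 = pd.DataFrame([[1, 2, 0, 2], [1, -2, 0, 1], [2, 1.5, -2, 1]], columns=cols)

# True where every frame is strictly positive, or strictly negative.
all_pos = (test > 0) & (test2 > 0) & (test3 > 0)
all_neg = (test < 0) & (test2 < 0) & (test3 < 0)

# A cell is consistent if it is all-positive OR all-negative.
# Zeros fail both strict comparisons, so they are excluded automatically.
consistent = all_pos | all_neg

# Cells where the mask is False become NaN.
result = test.where(consistent)
print(result)
```

The same `consistent` mask can be passed to `test2.where(...)` and `test3.where(...)` to get the other two frames.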
A slightly different approach: you could stack the underlying arrays together, use np.sign, then sum over the added dimension to generate a mask for df.where.
In [58]: frames = [test, test2, test3]
In [59]: signs = np.sign(np.dstack(frames))
In [60]: mask = np.abs(np.sum(signs, -1)) == len(frames)  # compare with the number of frames, not the number of rows
In [61]: test.where(mask)
Out[61]:
     a    b    c  d
0  NaN  1.0  NaN  3
1  NaN  NaN  NaN  2
2  2.0  1.5 -3.0  1
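The stacking idea can be wrapped in a small helper that works for any number of frames. This is a sketch only: `consistent_sign_mask` is a hypothetical name, and it assumes all frames have the same shape and the same row and column order, because the NumPy arrays are compared by position rather than by label.

```python
import numpy as np
import pandas as pd


def consistent_sign_mask(frames):
    """Return a boolean array that is True where all frames share one nonzero sign."""
    # Stack into shape (rows, cols, n_frames) and take the sign of every cell.
    signs = np.sign(np.dstack([df.to_numpy() for df in frames]))
    # |sum of signs| equals the number of frames only when every sign is +1
    # or every sign is -1; a single 0 or a mixed sign makes the sum smaller.
    return np.abs(signs.sum(axis=-1)) == len(frames)


cols = ["a", "b", "c", "d"]
test = pd.DataFrame([[0, 1, 0, 3], [-1, 3, 0, 2], [2, 1.5, -3, 1]], columns=cols)
test2 = pd.DataFrame([[1, 1, 0, 2], [1, -3, 0, 1], [2, 1.5, -2, 1]], columns=cols)
test3 = pd.DataFrame([[1, 2, 0, 2], [1, -2, 0, 1], [2, 1.5, -2, 1]], columns=cols)

mask3 = consistent_sign_mask([test, test2, test3])
# With only two frames the threshold becomes 2, so the helper still works.
mask2 = consistent_sign_mask([test, test2])

print(test.where(mask3))
```

Comparing against `len(frames)` matters: a threshold taken from the frame's row count only gives the right answer when the number of rows happens to equal the number of frames.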
You can use np.sign and addition with an equality test, then where, to do this:
test.where(np.sign(test).add(np.sign(test2)).add(np.sign(test3)).abs() == 3)
Output:
     a    b    c  d
0  NaN  1.0  NaN  3
1  NaN  NaN  NaN  2
2  2.0  1.5 -3.0  1
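Since the question asks for all three frames with inconsistent cells blanked out, the same mask can be reused for each of them. The following is a sketch under the assumption that the frames share index and column labels; `frames`, `sign_sum`, `mask` and `masked` are illustrative names.

```python
import numpy as np
import pandas as pd

cols = ["a", "b", "c", "d"]
test = pd.DataFrame([[0, 1, 0, 3], [-1, 3, 0, 2], [2, 1.5, -3, 1]], columns=cols)
test2 = pd.DataFrame([[1, 1, 0, 2], [1, -3, 0, 1], [2, 1.5, -2, 1]], columns=cols)
test3 = pd.DataFrame([[1, 2, 0, 2], [1, -2, 0, 1], [2, 1.5, -2, 1]], columns=cols)
frames = [test, test2, test3]

# Element-wise sum of signs; pandas aligns the frames on their labels.
sign_sum = sum(np.sign(df) for df in frames)

# Consistent cells are those where all signs agree and none is zero.
mask = sign_sum.abs() == len(frames)

# One masked copy per input frame, in the same order as `frames`.
masked = [df.where(mask) for df in frames]

for df in masked:
    print(df, end="\n\n")
```

Because this version stays in pandas, it aligns on labels rather than on position, which is the main difference from stacking the raw arrays.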