Finding the intersection of two sets of columns for each row of a Python pandas DataFrame, without looping

Question

I have the following pandas.DataFrame:

import pandas as pd
df = pd.DataFrame({'A1': ['a', 'a', 'd'], 'A2': ['b', 'c', 'c'],
                   'B1': ['d', 'a', 'c'], 'B2': ['e', 'd', 'e']})
  A1 A2 B1 B2
0  a  b  d  e
1  a  c  a  d
2  d  c  c  e

I would like to select the rows in which the values in A1 and A2 all differ from the values in B1 and B2, that is, the rows where the intersection of the values in ['A1', 'A2'] and ['B1', 'B2'] is empty. In the example above, only row 0 should be selected.

So far, the best I could do is loop over every row of my data frame with the following code:

for i in df.index.values:
    # Drop the row if any value in A1/A2 also appears in B1/B2
    if df.loc[i, ['A1', 'A2']].isin(df.loc[i, ['B1', 'B2']]).sum() > 0:
        df = df.drop(i, axis=0)

Is there a way to do this without looping?

Answer 1

You can test for that directly by comparing each column in the first group with each column in the second:

Code:

df[(df.A1 != df.B1) & (df.A2 != df.B2) & (df.A1 != df.B2) & (df.A2 != df.B1)]

Test Code:

import pandas as pd
df = pd.DataFrame({'A1': ['a', 'a', 'd'], 'A2': ['b', 'c', 'c'],
                   'B1': ['d', 'a', 'c'], 'B2': ['e', 'd', 'e']})

print(df)
print(df[(df.A1 != df.B1) & (df.A2 != df.B2) & 
         (df.A1 != df.B2) & (df.A2 != df.B1)])

Results:

  A1 A2 B1 B2
0  a  b  d  e
1  a  c  a  d
2  d  c  c  e

  A1 A2 B1 B2
0  a  b  d  e
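
With more than two columns per group, writing every comparison by hand gets tedious. The following is a minimal sketch of the same idea generalized with itertools.product. The names left_cols and right_cols are hypothetical and not part of the original answer.

```python
from functools import reduce
from itertools import product
import operator

import pandas as pd

df = pd.DataFrame({'A1': ['a', 'a', 'd'], 'A2': ['b', 'c', 'c'],
                   'B1': ['d', 'a', 'c'], 'B2': ['e', 'd', 'e']})

# Hypothetical names for the two groups of columns to compare.
left_cols = ['A1', 'A2']
right_cols = ['B1', 'B2']

# Build one boolean Series per (left, right) column pair and AND them together.
mask = reduce(operator.and_,
              (df[a] != df[b] for a, b in product(left_cols, right_cols)))

result = df[mask]
print(result)
```

Each comparison is still vectorized over rows; only the short list of column pairs is iterated in Python.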

Answer 2

By using set intersection:

df['Key1']=df[['A1','A2']].values.tolist() 
df['Key2']=df[['B1','B2']].values.tolist() 


df.apply(lambda x : len(set(x['Key1']).intersection(x['Key2']))==0,axis=1)
Out[517]: 
0     True
1    False
2    False
dtype: bool


df[df.apply(lambda x : len(set(x['Key1']).intersection(x['Key2']))==0,axis=1)].drop(columns=['Key1','Key2'])
Out[518]: 
  A1 A2 B1 B2
0  a  b  d  e
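
As a variation on this answer, the sketch below avoids adding the helper columns to the data frame and uses set.isdisjoint in place of checking the length of the intersection. It still iterates over rows at the Python level, much as apply with axis=1 does. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A1': ['a', 'a', 'd'], 'A2': ['b', 'c', 'c'],
                   'B1': ['d', 'a', 'c'], 'B2': ['e', 'd', 'e']})

# Per-row lists of values for each column group (illustrative names).
left = df[['A1', 'A2']].values.tolist()
right = df[['B1', 'B2']].values.tolist()

# True where the two groups share no value in that row.
disjoint = pd.Series([set(l).isdisjoint(r) for l, r in zip(left, right)],
                     index=df.index)

filtered = df[disjoint]
print(filtered)
```

Because no temporary columns are created, nothing needs to be dropped afterwards.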

Answer 3

In today's edition of
Way More Complicated Than It Needs To Be

Chapter 1
We bring you map, generators, and set logic

mask = list(map(lambda x: not bool(x),
         (set.intersection(*map(set, pair))
          for pair in df.values.reshape(-1, 2, 2).tolist())
        ))

df[mask]

  A1 A2 B1 B2
0  a  b  d  e

Chapter 2
NumPy broadcasting

v = df.values
df[(v[:, :2, None] != v[:, None, 2:]).all((1, 2))]

  A1 A2 B1 B2
0  a  b  d  e
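
To make the broadcasting step easier to follow, here is a sketch that selects the two column groups by name and spells out the intermediate shapes. It assumes both groups hold comparable values (strings here); the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A1': ['a', 'a', 'd'], 'A2': ['b', 'c', 'c'],
                   'B1': ['d', 'a', 'c'], 'B2': ['e', 'd', 'e']})

left = df[['A1', 'A2']].to_numpy()    # shape (3, 2)
right = df[['B1', 'B2']].to_numpy()   # shape (3, 2)

# left[:, :, None] has shape (3, 2, 1) and right[:, None, :] has shape (3, 1, 2).
# Broadcasting compares every left value with every right value in the same row,
# giving a (3, 2, 2) boolean array.
pairwise_differs = left[:, :, None] != right[:, None, :]

# Keep a row only if all of its pairwise comparisons differ.
mask = pairwise_differs.all(axis=(1, 2))

print(df[mask])
```

Selecting columns by name rather than by position means the mask does not depend on column order or on extra columns in the data frame.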
