How do I remove rows from a Pandas DataFrame if the same row exists in another DataFrame?

Question

I have two DataFrames:

 df1 = row1;row2;row3
 df2 = row4;row5;row6;row2

I want my output DataFrame to contain only the rows that are unique to df1, i.e.:

df_out = row1;row3

What is the most efficient way to get this?

This code does what I want, but it uses two for loops:

import pandas as pd

a = pd.DataFrame({0:[1,2,3],1:[10,20,30]})
b = pd.DataFrame({0:[0,1,2,3],1:[0,1,20,3]})

match_ident = []
for i in range(0,len(a)):
    found=False
    for j in range(0,len(b)):
        if a[0][i]==b[0][j]:
            if a[1][i]==b[1][j]:
                found=True
    match_ident.append(not(found))

a = a[match_ident]

Answer 1

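You can use merge with the indicator parameter and an outer join, filter with query, and then remove the helper column with drop.

When on is omitted, merge joins on the columns the two DataFrames have in common. Here that is every column, so the on parameter can be left out: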

print(pd.merge(a, b, indicator=True, how='outer')
        .query('_merge == "left_only"')
        .drop('_merge', axis=1))
   0   1
0  1  10
2  3  30
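
The same idea can be wrapped in a small helper. This is only a sketch: the name anti_join is made up for illustration, it swaps the outer join for a left join (the right-only rows get filtered out anyway), and it assumes both DataFrames have exactly the same columns. The rows that survive come from the merged frame, so their index labels are not a's original labels.

```python
import pandas as pd


def anti_join(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: rows of `left` with no identical row in `right`."""
    # Deduplicate `right` so a repeated matching row cannot multiply rows of `left`.
    merged = left.merge(right.drop_duplicates(), how="left", indicator=True)
    # "left_only" marks rows that found no partner in `right`.
    only_left = merged.loc[merged["_merge"] == "left_only"]
    # Drop the helper column added by indicator=True.
    return only_left.drop(columns="_merge")


a = pd.DataFrame({0: [1, 2, 3], 1: [10, 20, 30]})
b = pd.DataFrame({0: [0, 1, 2, 3], 1: [0, 1, 20, 3]})

result = anti_join(a, b)
print(result)
```

A left join never materializes the rows that exist only in b, which may matter when b is much larger than a.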

Answer 2

You can convert a and b to Index objects, then use the Index.isin method to determine which rows they share:

import pandas as pd
a = pd.DataFrame({0:[1,2,3],1:[10,20,30]})
b = pd.DataFrame({0:[0,1,2,3],1:[0,1,20,3]})

a_index = a.set_index([0,1]).index
b_index = b.set_index([0,1]).index
mask = ~a_index.isin(b_index)
result = a.loc[mask]
print(result)

This yields:

   0   1
0  1  10
2  3  30
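
If you would rather not hard-code the column list [0,1], one possible variant builds the keys with pd.MultiIndex.from_frame. Treat it as a sketch: the helper name rows_not_in is invented here, and it assumes every column of the first DataFrame also exists in the second.

```python
import pandas as pd


def rows_not_in(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: rows of `left` that do not appear in `right`."""
    cols = list(left.columns)
    # Turn each row into a tuple-like key, one MultiIndex entry per row.
    left_keys = pd.MultiIndex.from_frame(left[cols])
    right_keys = pd.MultiIndex.from_frame(right[cols])
    # True where the row of `left` has no identical row in `right`.
    mask = ~left_keys.isin(right_keys)
    return left[mask]


a = pd.DataFrame({0: [1, 2, 3], 1: [10, 20, 30]})
b = pd.DataFrame({0: [0, 1, 2, 3], 1: [0, 1, 20, 3]})

result = rows_not_in(a, b)
print(result)
```

Because the result is a boolean selection on a itself, the surviving rows keep a's original index labels.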

Answer 1: 9, accepted, 2017-06-22 18:26:33

Answer 2: 2, 2017-06-22 18:33:22
