How to remove rows in a Pandas dataframe if the same row exists in another dataframe?
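I have two dataframes:
df1 = row1;row2;row3
df2 = row4;row5;row6;row2
I want my output dataframe to contain only the rows that are unique to df1, i.e.:
df_out = row1;row3
How can I get this most efficiently?
This code does what I want, but it uses two for loops: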
import pandas as pd
a = pd.DataFrame({0:[1,2,3],1:[10,20,30]})
b = pd.DataFrame({0:[0,1,2,3],1:[0,1,20,3]})
match_ident = []
for i in range(0,len(a)):
    found=False
    # compare row i of a against every row of b, column by column
    for j in range(0,len(b)):
        if a[0][i]==b[0][j]:
            if a[1][i]==b[1][j]:
                found=True
    # keep row i only if no identical row was found in b
    match_ident.append(not(found))
a = a[match_ident]
You can convert a and b to Indexes, then use the Index.isin method to determine which rows are shared:
import pandas as pd
a = pd.DataFrame({0:[1,2,3],1:[10,20,30]})
b = pd.DataFrame({0:[0,1,2,3],1:[0,1,20,3]})
a_index = a.set_index([0,1]).index
b_index = b.set_index([0,1]).index
mask = ~a_index.isin(b_index)
result = a.loc[mask]
print(result)
yields
   0   1
0  1  10
2  3  30
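For comparison, here is a minimal sketch of a different technique that is not part of the answer above: a left merge with indicator=True, which labels each row of a as "left_only" or "both". It assumes a and b share the same column labels, and it drops duplicate rows from b first so the merge cannot multiply rows of a. The names merged and result are chosen only for illustration.

```python
import pandas as pd

a = pd.DataFrame({0: [1, 2, 3], 1: [10, 20, 30]})
b = pd.DataFrame({0: [0, 1, 2, 3], 1: [0, 1, 20, 3]})

# Left-join a against the distinct rows of b on every column of a.
# indicator=True adds a "_merge" column saying where each row was found.
merged = a.merge(
    b.drop_duplicates(),
    on=list(a.columns),
    how="left",
    indicator=True,
)

# The merge resets the index, so apply the mask positionally to keep
# a's original row labels.
mask = (merged["_merge"] == "left_only").to_numpy()
result = a[mask]
print(result)
```

On these two frames this should select the same rows as the Index.isin approach. Which of the two is faster probably depends on the data, so it is worth timing both on a realistic sample.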