Finding Merge (Outer - Inner) Pandas DF differences

I want to find the difference between the outer merge and the inner merge of two DataFrames, i.e. the rows that appear in the outer merge but not in the inner one. I don't want to do this by selecting every row that contains NaN, because I want to keep some rows that have NaN. Is there a way to do this using the difference method, or preferably without having to create both FrameA and FrameB?

import pandas as pd

DataA = pd.DataFrame([{"a": 1, "b": 4}, {"a": 6, "b": 2}, {"a": 2, "b": 5}, {"a": 3, "b": 6}, {"a": 7, "b": 2}])
DataB = pd.DataFrame([{"a": 2, "d": 7}, {"a": 7, "d": 8}, {"a": 3, "d": 8}])

DataA

    a   b
0   1   4
1   6   2
2   2   5
3   3   6
4   7   2

DataB

    a   d
0   2   7
1   7   8
2   3   8

...

FrameA = pd.merge(DataA, DataB, on = "a", how ='inner')
FrameB = pd.merge(DataA, DataB, on = "a", how ='outer')

FrameA

    a   b   d
0   2   5   7
1   3   6   8
2   7   2   8

FrameB

    a   b   d
0   1   4   NaN
1   6   2   NaN
2   2   5   7
3   3   6   8
4   7   2   8

Trying to find the DataFrame differences:

list(FrameB.index.difference(FrameA.index))

Maybe you have a better solution that gives this desired output:

    a   b   d
0   1   4   NaN
1   6   2   NaN

You are looking for the symmetric_difference:

a = DataA.set_index('a')
b = DataB.set_index('a')

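# select rows from the outer join whose key is in exactly one of the two frames
# (older pandas also accepted `a.index ^ b.index` for this; in pandas 2.x `^` on an
# Index is no longer a set operation, so call the method explicitly)
a.join(b, how='outer').loc[a.index.symmetric_difference(b.index)].reset_index()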
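
As an alternative that avoids building both FrameA and FrameB, here is a minimal sketch using the `indicator=True` option of `merge`. It is not part of the original answer; the variable names `merged` and `outer_minus_inner` are only illustrative. The selection is based on whether the key matched, not on the presence of NaN, so NaN values already in the source data do not affect which rows are kept.

```python
import pandas as pd

DataA = pd.DataFrame({"a": [1, 6, 2, 3, 7], "b": [4, 2, 5, 6, 2]})
DataB = pd.DataFrame({"a": [2, 7, 3], "d": [7, 8, 8]})

# indicator=True adds a "_merge" column whose values are
# "left_only", "right_only" or "both"
merged = DataA.merge(DataB, on="a", how="outer", indicator=True)

# outer minus inner: drop every row whose key matched on both sides
outer_minus_inner = (
    merged[merged["_merge"] != "both"]
    .drop(columns="_merge")
    .reset_index(drop=True)
)
print(outer_minus_inner)
```

Like the symmetric difference, this also keeps rows whose key exists only in DataB (there are none in this example). Filtering on `merged["_merge"] == "left_only"` instead would restrict the result to rows that exist only in DataA.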
