Python pandas: merging partially overlapping DataFrames

Question

I have a DataFrame df1, for example:

name | group | col1 | col2 | col3 | col4 | col5
id1  | G1    |
id2  | G1    |
id3  | G1    |
id4  | G2    |
id5  | G2    |
id6  | G2    |
...
id10

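The values in col1, col2, etc. are floats >= 0.
The name values are strings, and each name uniquely identifies one row.
The group values are strings. This column assigns each name to a group and is included for completeness.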

And another DataFrame df2, for example:

name | group | col2 | col4 | col5 | col7 |
id11 | G1    |
id12 | G1    |
id13 | G1    |
id14 | G2    |
id15 | G2    |
id16 | G2    |
...
id20

df1 and df2 have no name values in common.
df2.group also contains the values G1 or G2.
A column of df2 may also exist in df1 (e.g. col2, col4 and col5) or be unique to df2 (e.g. col7).
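For anyone who wants to reproduce the setup, here is a minimal sketch of two such frames. The numeric values and the reduced row count are made up for illustration; only the column layout follows the description above.

```python
import pandas as pd

# Hypothetical sample values; the post does not show the actual numbers.
df1 = pd.DataFrame({
    "name": ["id1", "id2", "id3"],
    "group": ["G1", "G1", "G2"],
    "col1": [0.5, 1.0, 0.0],
    "col2": [1.5, 0.0, 3.0],
    "col3": [2.0, 2.5, 1.0],
    "col4": [0.0, 4.0, 0.5],
    "col5": [3.5, 1.0, 2.0],
})
df2 = pd.DataFrame({
    "name": ["id11", "id12", "id13"],
    "group": ["G1", "G2", "G2"],
    "col2": [0.5, 2.0, 1.0],
    "col4": [1.0, 0.0, 3.0],
    "col5": [2.5, 1.5, 0.0],
    "col7": [4.0, 0.5, 1.0],
})

# Columns present in both frames, and columns unique to each one.
shared = df1.columns.intersection(df2.columns)
only_df1 = df1.columns.difference(df2.columns)  # col1 and col3
only_df2 = df2.columns.difference(df1.columns)  # col7
```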

I would like to merge the two DataFrames like this:

name | group | col1 | col2 | col3 | col4 | col5 | col7
id1  | G1    |                                  |  0
id2  | G1    |                                  |  0
id3  | G1    |                                  |  0
id4  | G2    |                                  |  0
...
id10 | G2    |                                  |  0
id11 | G1    |  0   |      |  0   |      |      |
id12 | G1    |  0   |      |  0   |      |      |
...
id20

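That is, append the rows of df2 to df1 and take the union of their columns.
If a row of one of the original DataFrames has no value under a column that is new to it, the value in the merged DataFrame should be zero. For example, df1 has no col7, so in the merged DataFrame every row that comes from df1 gets the value 0 under col7. The same applies to every row that comes from df2 for the columns col1 and col3, which are unique to df1.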

Answer 1

It turned out to be much easier than I expected:

df_union_all = pd.concat([df1, df2]).fillna(0)  # concat leaves NaN in the missing columns, so fill them with 0
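A self-contained sketch of this approach, using small made-up frames (the values and the reduced column set are assumptions for illustration). One detail worth noting: pd.concat stacks the rows and takes the union of the columns, but cells with no source value come back as NaN rather than 0, so a fill step is needed to get the zeros the question asks for. Here the fill is restricted to the value columns so that name and group are left alone.

```python
import pandas as pd

# Hypothetical sample data with partially overlapping columns.
df1 = pd.DataFrame({
    "name": ["id1", "id2"],
    "group": ["G1", "G2"],
    "col1": [1.0, 2.0],
    "col2": [3.0, 4.0],
    "col3": [5.0, 6.0],
})
df2 = pd.DataFrame({
    "name": ["id11", "id12"],
    "group": ["G1", "G2"],
    "col2": [7.0, 8.0],
    "col7": [9.0, 10.0],
})

# Stack the rows; the result has the union of the columns.
# ignore_index=True gives the result a fresh 0..n-1 row index.
stacked = pd.concat([df1, df2], ignore_index=True)

# Cells with no source value are NaN after concat; replace them with 0
# in the numeric value columns only.
value_cols = stacked.columns.difference(["name", "group"])
df_union_all = stacked.copy()
df_union_all[value_cols] = df_union_all[value_cols].fillna(0)

print(df_union_all)
```

If the original frames may already contain NaN values that should be preserved, filling after the concat would overwrite them too. In that case one option is to reindex each frame to the full column set with a fill value of 0 before concatenating.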

Solution 1
0 2019-06-26 05:51:06