Row-wise union of DataFrames in pandas

Question

Say that I have two DataFrames:

import pandas as pd

df1 = pd.DataFrame([('A', 0.3), ('B', 0.4)], columns = ('ID', 'Buy'))
df2 = pd.DataFrame([('B', 3), ('A', 4)], columns = ('ID', 'Sell'))

These yield:

    ID  Buy
0   A   0.3
1   B   0.4

and

    ID  Sell
0   B   3
1   A   4

respectively.

Now, I want to obtain a single DataFrame that combines the data, namely:

    ID  Buy  Sell
0   A   0.3  4
1   B   0.4  3

Note that the order of the rows in df1 and df2 may not be the same. Furthermore, there might be IDs that appear in only one frame and not in the other; in this case, I guess the missing value should be filled with NaN.

How can I do it?

I tried something like

pd.concat([df1, df2], join = 'outer', axis = 1)

but it doesn't return the desired result.

Answer 1

I think you want to merge on the ID column:

In [12]:

df1 = pd.DataFrame([('A', 0.3), ('B', 0.4)], columns = ('ID', 'Buy'))
df2 = pd.DataFrame([('B', 3), ('A', 4)], columns = ('ID', 'Sell'))
df1.merge(df2, on='ID', how='outer')
Out[12]:
  ID  Buy  Sell
0  A  0.3     4
1  B  0.4     3
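The question also asks about IDs that appear in only one frame, and about why the `pd.concat` attempt fails. The sketch below is one way to check both; the extra rows `'C'` and `'D'` are made-up data added for illustration and are not part of the original frames. It assumes that `ID` values are unique within each frame. `concat` with `axis=1` aligns rows on the index rather than on the `ID` column, so moving `ID` into the index first should give a comparable result.

```python
import pandas as pd

# Hypothetical extension of the original frames: 'C' exists only in df1,
# 'D' exists only in df2.
df1 = pd.DataFrame([('A', 0.3), ('B', 0.4), ('C', 0.5)], columns=['ID', 'Buy'])
df2 = pd.DataFrame([('B', 3), ('A', 4), ('D', 5)], columns=['ID', 'Sell'])

# Outer merge keeps every ID from both frames; values missing on one side
# become NaN. indicator=True adds a '_merge' column naming the source side.
merged = df1.merge(df2, on='ID', how='outer', indicator=True)
merged = merged.sort_values('ID').reset_index(drop=True)
print(merged)

# concat(axis=1) aligns on the index, so put ID into the index first.
# This assumes IDs are unique within each frame.
combined = pd.concat(
    [df1.set_index('ID'), df2.set_index('ID')], axis=1, join='outer'
)
combined = combined.sort_index().rename_axis('ID').reset_index()
print(combined)
```

Because NaN is a float, the `Sell` column is upcast from integer to float once any ID is missing on that side.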

Answer 1 (accepted), 2014-12-08 10:25:56