Pandas - Merge two df's on non-unique date (outer join)

Question

I have two df's that I would like to combine in a slightly unusual way.

The df's in question:

df1:
Index      colA 
2012-01-02  1
2012-01-05  2
2012-01-10  3
2012-01-10  4

and then df2:

Index      colB
2012-01-01  6
2012-01-05  7
2012-01-08  8
2012-01-10  9

Output:

Index      colA colB
2012-01-01  NaN   6
2012-01-02  1     NaN
2012-01-05  2     7
2012-01-08  NaN   8
2012-01-10  3     9
2012-01-10  4     NaN

Happy to have the NaN output if there is no matching date between the df's.
If there is a matching date I would like to return both columns.
There can be cases where a single date has e.g. 20 rows in df1 and 15 rows in df2. It should match off the first 15 (I don't care about ordering) and then return NaN in colB for the remaining 5 rows from df1.

When I try to do this myself with pd.merge() and others I can't, because the date is not unique in the index.

Any suggestions how to get the intended behavior?

Thanks

Answer 1

You may need to create a helper key with cumcount, which numbers the rows within each date so that repeated dates are matched one-to-one:

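# number repeated dates 0, 1, 2, ... within each frame (assumes Index is a column)
df1=df1.assign(key=df1.groupby('Index').cumcount())
df2=df2.assign(key=df2.groupby('Index').cumcount())
# merge on the shared columns (Index and key), then drop the helper column
fdf=df1.merge(df2,how='outer').drop(columns='key').sort_values('Index')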
fdf
Out[104]: 
        Index  colA  colB
4  2012-01-01   NaN   6.0
0  2012-01-02   1.0   NaN
1  2012-01-05   2.0   7.0
5  2012-01-08   NaN   8.0
2  2012-01-10   3.0   9.0
3  2012-01-10   4.0   NaN
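
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two frames from the question and applies the same cumcount idea. It assumes Index is an ordinary column rather than the DataFrame index. The function name merge_on_repeated_key and the helper column _n are made up for illustration.

```python
import pandas as pd


def merge_on_repeated_key(left, right, key="Index"):
    """Outer-merge two frames, pairing duplicate keys by order of appearance.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Number the rows within each key value: 0, 1, 2, ...
    left = left.assign(_n=left.groupby(key).cumcount())
    right = right.assign(_n=right.groupby(key).cumcount())
    # Merge on the key plus the occurrence number so duplicates pair one-to-one.
    merged = left.merge(right, on=[key, "_n"], how="outer")
    # Sort by key and occurrence number, then drop the helper column.
    return (
        merged.sort_values([key, "_n"])
        .drop(columns="_n")
        .reset_index(drop=True)
    )


df1 = pd.DataFrame({
    "Index": pd.to_datetime(["2012-01-02", "2012-01-05", "2012-01-10", "2012-01-10"]),
    "colA": [1, 2, 3, 4],
})
df2 = pd.DataFrame({
    "Index": pd.to_datetime(["2012-01-01", "2012-01-05", "2012-01-08", "2012-01-10"]),
    "colB": [6, 7, 8, 9],
})

result = merge_on_repeated_key(df1, df2)
print(result)
```

Naming the merge columns explicitly with on= keeps the merge from silently picking up any other column the two frames happen to share.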

Answer 2

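Using join() should work if the dates are the index of both frames. Be aware that for a repeated date join() pairs every df1 row with every df2 row, so 2012-01-10 comes back as (3, 9) and (4, 9) rather than (3, 9) and (4, NaN).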

df1.join(df2, how='outer', sort=True)
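
To see how join() treats the repeated date, and one possible way to get occurrence-by-occurrence pairing out of it, here is a small sketch. It assumes the dates are the index of both frames and that the index is named Index. The helper with_occurrence and the level name n are invented for this example.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"colA": [1, 2, 3, 4]},
    index=pd.DatetimeIndex(
        ["2012-01-02", "2012-01-05", "2012-01-10", "2012-01-10"], name="Index"
    ),
)
df2 = pd.DataFrame(
    {"colB": [6, 7, 8, 9]},
    index=pd.DatetimeIndex(
        ["2012-01-01", "2012-01-05", "2012-01-08", "2012-01-10"], name="Index"
    ),
)

# Plain index join: each df1 row for a date is paired with each df2 row for it.
plain = df1.join(df2, how="outer", sort=True)


def with_occurrence(df):
    """Append a per-date occurrence counter (0, 1, 2, ...) as a second index level.

    Hypothetical helper for illustration; not part of pandas.
    """
    n = df.groupby(level="Index").cumcount().rename("n")
    return df.set_index(n, append=True)


# Joining on (Index, n) pairs the first occurrence with the first, and so on.
paired = (
    with_occurrence(df1)
    .join(with_occurrence(df2), how="outer")
    .sort_index()
    .droplevel("n")
)

print(plain)
print(paired)
```

Comparing the two printed frames on 2012-01-10 shows the difference: the plain join repeats the single colB value on both rows, while the version with the extra index level leaves the second row's colB empty.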

solution1
3 ACCEPTED 2019-02-25 23:30:18

solution2
0 2019-02-26 00:00:05
