I tried to find the answer in the official Pandas documentation , but found it more confusing than helpful. Basically I have two dataframes with overlapping, but not identical column lists:
df1:
A B
0 22 34
1 78 42
df2:
B C
0 76 29
1 11 67
I want to merge/concatenate/append them so that the result is
df3:
A B C
0 22 34 nan
1 78 42 nan
2 nan 76 29
3 nan 11 67
Should be fairly simple, but I've tried several intuitive approaches and always got errors. Can anybody help me?
You need merge with parameter how = outer
df3 = df1.merge(df2, how = 'outer')
A B C
0 22.0 34 NaN
1 78.0 42 NaN
2 NaN 76 29.0
3 NaN 11 67.0
If you just want to concatenate the dataframes you can use.
pd.concat([df1,df2])
output:
A B C
0 22.0 34 NaN
1 78.0 42 NaN
0 NaN 76 11.0
1 NaN 11 67.0
Then you can reset_index to recreate a simple incrementing index.
pd.concat([df,df2]).reset_index(drop = True)
Output:
A B C
0 22.0 34 NaN
1 78.0 42 NaN
2 NaN 76 11.0
3 NaN 11 67.0
Both @vaishali and @scott-boston solution work. Prefer the merge function as it allow more flexibility on your result with the how parameter. Howerver concat can achieve better performance if few columns are involved
To optimize @scott-boston answer, you can also use the internal concat parameter igonore_index
that automatically resize the index without calling another function the code would be like :
pd.concat([df1,df2],ignore_index=True)
Output
A B C
0 22.0 34 NaN
1 78.0 42 NaN
2 NaN 76 11.0
3 NaN 11 67.0
Python (version 3.8.5) | pandas(version 1.1.3)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.