I would like to merge two Dataframes into 1 fast as possible.
DF1:
A B C
0 A0 B0 C0
1 A1 B1 C1
2 A2 B2 C2
3 A3 B3 C3
4 A4 B4 C4
5 A5 B5 C5
6 A6 B6 C6
7 A7 B7 C7
</pre>
DF2:
A Z 0 A3 Z4 1 A5 Z5 2 A6 Z6 3 A7 Z7
Current Results:
A B C Z 0 A0 B0 C0 NaN 1 A1 B1 C1 NaN 2 A2 B2 C2 NaN 3 A3 B3 C3 [Z4] 4 A4 B4 C4 [Z4] 5 A5 B5 C5 [Z5] 6 A6 B6 C6 [Z6] 7 A7 B7 C7 [Z7]
Required Results:
A B C Z 0 A0 B0 C0 NaN 1 A1 B1 C1 NaN 2 A2 B2 C2 NaN 3 A3 B3 C3 Z4 4 A4 B4 C4 NaN 5 A5 B5 C5 Z5 6 A6 B6 C6 Z6 7 A7 B7 C7 Z7
Three problems I have with my code. 1) I would not like the Z4 to repeat. if its null I would just like NaN.
2) I am not sure why but is coming as list when it should be string format.
3) very slow.NOTE: Everything that is in DF2 dataframe column A will always be in D2 column A
My current Code
```
import pandas as pd
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5', 'A6', 'A7'],
'B': ['B0', 'B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7'],
'C': ['C0', 'C1', 'C2', 'C3','C4', 'C5', 'C6', 'C7']},
index=[0, 1, 2, 3, 4, 5, 6, 7])
df2 = pd.DataFrame({'A': ['A3', 'A5', 'A6', 'A7'],
'Z': ['Z4', 'Z5', 'Z6', 'Z7'],
},
index=[0, 1, 2, 3])
def mergeDF(df1, df2):
import pandas as pd
v = df1.merge(df2[['A', 'Z']])\
.groupby(df1.columns.tolist())\
.Z\
.apply(pd.Series.tolist)
df = pd.DataFrame(v.tolist(), index=v.index)\
.rename(columns=lambda x: x + 1)\
.add_prefix('Z')\
.reset_index()
df = df1.merge(v, how='left', on='A').ffill()
print (df)
mergeDF(df1, df2)
print (df1)
```
You can do this in a much simpler way that what you've done now:
pd.merge(df1,df2,how='left',on='A')
You've got a list because you converted v to list in your function and added ffill
which will replace null values by the last non null value in the column
Quick one:
>>>df1.join(df2.set_index('A'), how='outer', on='A')`
A B C Z
0 A0 B0 C0 NaN
1 A1 B1 C1 NaN
2 A2 B2 C2 NaN
3 A3 B3 C3 Z4
4 A4 B4 C4 NaN
5 A5 B5 C5 Z5
6 A6 B6 C6 Z6
7 A7 B7 C7 Z7
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.