I have a dataframe df1 with 850 rows and column names ['Date', 'A']. I also have df2 with 900 rows and column names ['Date', 'B', 'C', 'D'].
The difference in their number of rows is because df1 has some missing 'Date' entries. However, all entries in df1['Date'] are in df2['Date'].
Question: I would like to merge df1['A'] into df2 on the basis of matching 'Date' rows. After merging, I would like the resulting df2['A'] to show NaN for all rows whose 'Date' is missing in df1.
I tried df2 = pd.merge(df2, df1, on="Date"), but the resulting df2 has 850 rows, which suggests that the dates that don't match between df1 and df2 are being dropped. Instead, I want the merged df2 to have 900 rows, with the unmatched date rows showing NaN in df2['A'].
How can I achieve this?
Use a left join instead of an inner join (the default behavior of pd.merge), i.e.,
new_df = pd.merge(df2, df1, on="Date", how='left')
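To see the difference between the two join types, here is a minimal sketch on made-up data. The miniature df1 and df2 below (3 and 5 rows rather than 850 and 900) and their values are assumptions for illustration only; the column names follow the question.

```python
import pandas as pd

# Hypothetical small stand-ins for the frames in the question.
df2 = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05"],
    "B": [1, 2, 3, 4, 5],
    "C": [6, 7, 8, 9, 10],
    "D": [11, 12, 13, 14, 15],
})
# df1 is missing two of the dates that df2 has.
df1 = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-03", "2021-01-05"],
    "A": [1.5, 2.5, 3.5],
})

# Default how="inner": only dates present in both frames survive.
inner = pd.merge(df2, df1, on="Date")

# how="left": every row of df2 is kept; 'A' is NaN where df1 has no match.
left = pd.merge(df2, df1, on="Date", how="left")

print(len(inner), len(left))
print(left)
```

Because df2 is the left frame, its row count and row order are preserved as long as 'Date' is unique in df1; duplicate dates in df1 would produce extra rows.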
To fill NA with zero (as asked by the OP in the comments),
new_df.fillna(0, inplace=True)
# new_df['column'] = new_df['column'].astype(np.float64) # to convert column to float
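Note that calling fillna(0) on the whole frame replaces missing values in every column, not just 'A'. If other columns may contain legitimate NaN values, one option is to fill only the merged column. The sketch below assumes a hypothetical merged frame in which 'B' also has a missing value that should be left alone.

```python
import numpy as np
import pandas as pd

# Hypothetical merge result: 'A' has NaN from unmatched dates,
# 'B' has an unrelated NaN that should stay untouched.
new_df = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-02", "2021-01-03"],
    "B": [1.0, np.nan, 3.0],
    "A": [10.0, np.nan, np.nan],
})

# Fill only column 'A'; assigning back avoids in-place modification.
new_df["A"] = new_df["A"].fillna(0)

print(new_df)
```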