Add column from df1 into df2 based on the Date column and fill na if df1 [Date] entries are missing

Question

I have a dataframe df1 with 850 rows and column names ['Date', 'A']. I also have df2 with 900 rows and column names ['Date', 'B', 'C', 'D'].

The row counts differ because df1 has some missing 'Date' entries. However, all entries in df1['Date'] are in df2['Date'].

Question: I would like to merge df1['A'] into df2 on matching ['Date'] rows. After merging, I would like the resulting df2['A'] to show 'na' for all rows whose ['Date'] values are missing in df1.

I tried df2=pd.merge(df2, df1, on="Date"), but the resulting df2 has 850 rows, which suggests that the dates that don't match between df1 and df2 are being dropped. Instead, I want the merged df2 to have 900 rows, with the unmatched date rows showing 'na' in df2['A'].

How can I achieve this?

Answer 1

Use a left join instead of an inner join (the default for pd.merge), i.e.:

new_df = pd.merge(df2, df1, on="Date", how='left')
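To see the effect on a small scale, here is a minimal sketch. The two frames are hypothetical three-row stand-ins for the OP's df1 and df2 (the values are made up for illustration), and it assumes 'Date' has the same dtype in both frames:

```python
import pandas as pd

# Hypothetical miniature stand-ins for the question's df1 and df2.
df1 = pd.DataFrame({
    "Date": pd.to_datetime(["2022-01-01", "2022-01-03"]),  # 2022-01-02 is missing
    "A": [10.0, 30.0],
})
df2 = pd.DataFrame({
    "Date": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03"]),
    "B": [1, 2, 3],
    "C": [4, 5, 6],
    "D": [7, 8, 9],
})

# how="left" keeps every row of the left frame (df2);
# dates that are absent from df1 get NaN in column 'A'.
new_df = pd.merge(df2, df1, on="Date", how="left")
print(new_df)
```

The result keeps all rows of df2, with NaN in 'A' for the date that df1 lacks. If df1 could contain duplicate dates, the merged frame would have more rows than df2; passing validate="many_to_one" to pd.merge is one way to have pandas raise an error in that case.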

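To fill the resulting NaN values with zero (as the OP asked in the comments):

new_df.fillna(0, inplace=True)  # fills NaN in every column of new_df, not only 'A'
# new_df['column'] = new_df['column'].astype(np.float64)  # optional: convert a column to float (needs `import numpy as np`)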
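If only the merged column should be filled, a possible variant is to call fillna on that column alone. This is a sketch under assumptions: the small new_df below is a hypothetical stand-in for the merged result, with a made-up gap in 'B' to show that other columns are left untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the merged frame.
new_df = pd.DataFrame({
    "Date": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-01-03"]),
    "B": [1.0, np.nan, 3.0],    # a pre-existing gap that should stay NaN
    "A": [10.0, np.nan, 30.0],  # gap introduced by the left merge
})

# Fill only column 'A'; NaN values in other columns are not changed.
new_df["A"] = new_df["A"].fillna(0)
print(new_df)
```

Assigning the result back to the column avoids inplace=True and makes it explicit which column is being changed.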

solution1
1 ACCEPTED 2022-05-19 06:10:11
