Add column from df1 into df2 based on the Date column and fill na if df1['Date'] entries are missing
I have a dataframe df1 with 850 rows and column names ['Date', 'A']. I also have df2 with 900 rows and column names ['Date', 'B', 'C', 'D'].
The difference in their number of rows is because df1 has some missing 'Date' entries. However, all entries in df1['Date'] are in df2['Date'].
Question: I would like to merge df1['A'] into df2 on matching ['Date'] rows. After merging, I would like the resulting df2['A'] to show 'na' for all rows whose ['Date'] values are missing in df1.
I tried df2=pd.merge(df2, df1, on="Date") but the resulting df2 has 850 rows, which suggests that the dates which don't match between df1 and df2 are being dropped. Instead, I want the merged df2 to have 900 rows, and the unmatched date rows should show 'na' in df2['A'].
How can I achieve this?
Use a left join instead of an inner join (the default behavior), i.e.,
new_df = pd.merge(df2, df1, on="Date", how='left')
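As a rough illustration of why the row count changes, here is a minimal sketch using small made-up frames in place of the real df1 and df2 (the dates and values are hypothetical, chosen only so that one date is missing from df1). It compares the default inner join with the left join:

```python
import pandas as pd

# Hypothetical miniature stand-ins for df1 and df2; the values are made up.
df1 = pd.DataFrame({"Date": ["2020-01-01", "2020-01-03"], "A": [1.0, 3.0]})
df2 = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02", "2020-01-03"],
    "B": [10, 20, 30],
})

# Default how="inner": only dates present in both frames survive.
inner = pd.merge(df2, df1, on="Date")

# how="left": every row of df2 is kept; dates absent from df1 get NaN in 'A'.
left = pd.merge(df2, df1, on="Date", how="left")

print(len(inner), len(left))
print(left)
```

With the left join, the result keeps one row per row of df2 as long as 'Date' is unique in df1, which is what the question asks for.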
To fill NA with zero (as asked by the OP in the comments),
new_df.fillna(0, inplace=True)
# new_df['column'] = new_df['column'].astype(np.float64) # to convert column to float
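Note that calling fillna on the whole frame replaces missing values in every column, not just 'A'. If the other columns may contain their own NaN values that should be preserved, one possible variation is to fill only the merged column. The frame below is a made-up example, assuming 'B' already holds a missing value unrelated to the merge:

```python
import numpy as np
import pandas as pd

# Hypothetical merged result; 'B' has a pre-existing NaN we want to keep.
merged = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-02", "2020-01-03"],
    "B": [10.0, np.nan, 30.0],
    "A": [1.0, np.nan, 3.0],
})

# Fill only column 'A'; the NaN in 'B' is left untouched.
merged["A"] = merged["A"].fillna(0)

print(merged)
```

Assigning the filled column back also avoids relying on inplace=True.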