Fill missing values from another Pandas dataframe with different shape
I have two tables of different sizes that I would like to merge in the following way in Python using Pandas:
UID Property Date
1 A 10/02/2016
2 B NaN
3 A 10/02/2016
4 C NaN
5 C NaN
6 A 10/02/2016
Table 1 contains information about Property transactions and a Date related to the Property. As some of the dates are NaN, I would like to fill them in from another table (Table 2) that contains information solely about properties, without replacing any dates already present in Table 1:
Property DateProxy
A 01/01/2016
B 03/04/2016
C 16/05/2016
In the end I would like to obtain the following:
UID Property Date
1 A 10/02/2016 (kept from T1)
2 B 03/04/2016 (imported from T2)
3 A 10/02/2016 (kept from T1)
4 C 16/05/2016 (imported from T2)
5 C 16/05/2016 (imported from T2)
6 A 10/02/2016 (kept from T1)
First, let's merge the two datasets, so that the original dates are not overwritten:
df_merge = pandas.merge(T1, T2, on='Property')
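To see what the merge produces, here is a minimal sketch that rebuilds the two tables from the question. The names T1 and T2 follow the snippet above; building the frames by hand and storing the dates as strings are assumptions made for illustration. Note that pandas.merge performs an inner join by default, so a row of T1 whose Property had no match in T2 would be dropped; passing how='left', as done here, keeps such rows.

```python
import numpy as np
import pandas

# Tables from the question, rebuilt by hand (assumption: dates held as strings).
T1 = pandas.DataFrame({
    'UID': [1, 2, 3, 4, 5, 6],
    'Property': ['A', 'B', 'A', 'C', 'C', 'A'],
    'Date': ['10/02/2016', np.nan, '10/02/2016', np.nan, np.nan, '10/02/2016'],
})
T2 = pandas.DataFrame({
    'Property': ['A', 'B', 'C'],
    'DateProxy': ['01/01/2016', '03/04/2016', '16/05/2016'],
})

# how='left' keeps every row of T1, in its original order, and adds
# the matching DateProxy as a new column next to the untouched Date.
df_merge = pandas.merge(T1, T2, on='Property', how='left')
print(df_merge)
```

The merged frame has both a Date and a DateProxy column, so the original dates are still available when deciding which value to keep.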
Then we replace the missing values by copying them from the 'DateProxy' field:
df_merge.Date = df_merge.apply(
    lambda x: x['Date'] + ' (kept from T1)' if x['Date'] == x['Date']
    else x['DateProxy'] + ' (imported from T2)',
    axis=1
)
(The x['Date'] == x['Date'] comparison checks that the value is not NaN, since NaN is not equal to itself.)
Finally, we can drop the proxy column:
df_final = df_merge.drop('DateProxy', axis=1)
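As an alternative sketch, not the approach used in the answer above, the same fill can be done without a row-wise apply by combining Series.map with Series.fillna. The frames are rebuilt from the question's tables (an assumption about how the data is held), and this version leaves the dates unannotated, without the "(kept from T1)" and "(imported from T2)" suffixes.

```python
import numpy as np
import pandas

# Tables from the question, rebuilt by hand (assumption: dates held as strings).
T1 = pandas.DataFrame({
    'UID': [1, 2, 3, 4, 5, 6],
    'Property': ['A', 'B', 'A', 'C', 'C', 'A'],
    'Date': ['10/02/2016', np.nan, '10/02/2016', np.nan, np.nan, '10/02/2016'],
})
T2 = pandas.DataFrame({
    'Property': ['A', 'B', 'C'],
    'DateProxy': ['01/01/2016', '03/04/2016', '16/05/2016'],
})

# Look up each row's proxy date by its Property (T2 has one row per Property).
proxy = T1['Property'].map(T2.set_index('Property')['DateProxy'])

# fillna only touches rows where Date is NaN; existing dates are kept.
df_final = T1.assign(Date=T1['Date'].fillna(proxy))
print(df_final)
```

Because no intermediate DateProxy column is added to the result, there is nothing to drop afterwards, and T1 itself is left unmodified.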