
Fill missing values from another Pandas dataframe with different shape

I have two tables of different sizes that I would like to merge in Python using Pandas in the following way:

UID Property    Date
1   A           10/02/2016
2   B           NaN
3   A           10/02/2016
4   C           NaN
5   C           NaN
6   A           10/02/2016

Table 1 contains information about Property transactions and a Date related to each Property. As some of the dates are NaN, I would like to fill them in from another table (Table 2), which contains information solely about properties, without replacing any dates already present in Table 1:

Property    DateProxy
A           01/01/2016
B           03/04/2016
C           16/05/2016

In the end I would like to obtain the following:

UID Property    Date
1   A           10/02/2016 (kept from T1)
2   B           03/04/2016 (imported from T2)
3   A           10/02/2016 (kept from T1)
4   C           16/05/2016 (imported from T2)
5   C           16/05/2016 (imported from T2)
6   A           10/02/2016 (kept from T1)

First, let's merge the two datasets. The merge adds the proxy dates as a separate column, so the original dates are not overwritten:

import pandas
df_merge = pandas.merge(T1, T2, on='Property')
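
The `how` argument of `merge` matters at this step: the default is an inner join, which drops any row of T1 whose Property has no match in T2, whereas `how='left'` keeps such rows with a NaN proxy. A minimal sketch of the difference, assuming a hypothetical property "D" that is missing from T2 (the sample values are made up for illustration):

```python
import numpy as np
import pandas as pd

T1 = pd.DataFrame({
    "UID": [1, 2, 3],
    "Property": ["A", "B", "D"],  # "D" is hypothetical and absent from T2
    "Date": ["10/02/2016", np.nan, np.nan],
})
T2 = pd.DataFrame({
    "Property": ["A", "B"],
    "DateProxy": ["01/01/2016", "03/04/2016"],
})

inner = pd.merge(T1, T2, on="Property")             # default how="inner"
left = pd.merge(T1, T2, on="Property", how="left")  # keeps every row of T1

print(len(inner), len(left))  # 2 3
```

With the tables in the question every property appears in Table 2, so both joins return the same six rows.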

Then we replace the missing values by copying them from the 'DateProxy' field:

df_merge.Date = df_merge.apply(
    lambda x: x['Date'] + ' (kept from T1)' if x['Date'] == x['Date']
                                            else x['DateProxy'] + ' (imported from T2)',
    axis=1
)

(The x['Date'] == x['Date'] comparison checks that the value isn't NaN, since NaN is not equal to itself.) Finally, we can drop the proxy column:

df_final = df_merge.drop('DateProxy', axis=1)
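
If the "(kept from T1)" / "(imported from T2)" annotations are not needed in the result, the same fill can probably be done without a row-wise apply. The following is a sketch rather than the answer's own method: it swaps in `Series.fillna` and a left merge, and it assumes the dates are stored as plain strings and that the tables are named T1 and T2 as above.

```python
import numpy as np
import pandas as pd

# Table 1: transactions, some with a missing Date
T1 = pd.DataFrame({
    "UID": [1, 2, 3, 4, 5, 6],
    "Property": ["A", "B", "A", "C", "C", "A"],
    "Date": ["10/02/2016", np.nan, "10/02/2016", np.nan, np.nan, "10/02/2016"],
})

# Table 2: one proxy date per property
T2 = pd.DataFrame({
    "Property": ["A", "B", "C"],
    "DateProxy": ["01/01/2016", "03/04/2016", "16/05/2016"],
})

# Left merge so that every row of T1 survives, in its original order
df_merge = pd.merge(T1, T2, on="Property", how="left")

# Existing dates are kept; only the NaN entries are taken from DateProxy
df_merge["Date"] = df_merge["Date"].fillna(df_merge["DateProxy"])

df_final = df_merge.drop(columns="DateProxy")
print(df_final)
```

Here `fillna` performs the NaN test itself, so the `x == x` comparison is not required.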
