Fill values in another dataframe based on data from first one

I have a dataframe like this:

ID  2018-01    2018-02   2018-03   2018-04
A1  8500        8500        8500      8500      
A2   NA         1900        1900      1900       
A3   NA          NA          NA       3000      
A4   NA          NA          NA          0       

I also have another dataframe that I want to use to fill the NA values:

ID   Date    Due  
A1   2018-01  8500
A2   2018-01  9000
A3   2018-02  4000
A4   2018-01  1000

Starting from the month given in this dataframe's Date column and continuing up to the next non-NA value in the first dataframe, I want to fill in the value from the Due column. The result should be:

ID  2018-01    2018-02   2018-03   2018-04
A1  8500        8500        8500      8500      
A2  9000        1900        1900      1900       
A3   NA         4000        4000      3000      
A4  1000        1000        1000         0   

How could I do that?

EDIT: There is also a case where a row has no prepopulated values at all:

    ID  2018-01    2018-02   2018-03   2018-04
    A1  8500        8500        8500      8500      
    A2   NA         1900        1900      1900       
    A3   NA          NA          NA       3000      
    A4   NA          NA          NA          0   
    A5   NA          NA          NA         NA


ID   Date    Due  
A1   2018-01  8500
A2   2018-01  9000
A3   2018-02  4000
A4   2018-01  1000
A5   2018-03  1500

In such a case, is it possible to put the value only in the column that corresponds to the date, without filling the rest of the row?

So the result would be:

ID  2018-01    2018-02   2018-03   2018-04
A1  8500        8500        8500      8500
A2  9000        1900        1900      1900
A3   NA         4000        4000      3000
A4  1000        1000        1000         0
A5   NA          NA         1500       NA
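
For anyone who wants to reproduce this, the two frames could be built roughly as follows. This is only a sketch: the names df1 and df2 follow the answers below, and storing the month labels as plain strings (rather than datetimes or periods) is an assumption, since the question does not show the dtypes.

```python
import numpy as np
import pandas as pd

# Wide frame with one column per month; NaN marks the cells to fill.
# Month labels are kept as strings (an assumption, dtypes are not shown).
df1 = pd.DataFrame({
    "ID": ["A1", "A2", "A3", "A4", "A5"],
    "2018-01": [8500, np.nan, np.nan, np.nan, np.nan],
    "2018-02": [8500, 1900, np.nan, np.nan, np.nan],
    "2018-03": [8500, 1900, np.nan, np.nan, np.nan],
    "2018-04": [8500, 1900, 3000, 0, np.nan],
})

# Long frame: the Due amount and the month from which it applies.
df2 = pd.DataFrame({
    "ID": ["A1", "A2", "A3", "A4", "A5"],
    "Date": ["2018-01", "2018-01", "2018-02", "2018-01", "2018-03"],
    "Due": [8500, 9000, 4000, 1000, 1500],
})

print(df1)
print(df2)
```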

If ID is a column in df1, set it as the index, reshape df2 with DataFrame.pivot, forward fill the missing values along each row, and finally replace the missing values in df1 with DataFrame.fillna or DataFrame.combine_first:

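# pivot arguments are passed by keyword (positional use fails in pandas 2.0+)
df = df1.set_index('ID').fillna(df2.pivot(index='ID', columns='Date', values='Due').ffill(axis=1))
print(df)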
    2018-01  2018-02  2018-03  2018-04
ID                                    
A1   8500.0   8500.0   8500.0   8500.0
A2   9000.0   1900.0   1900.0   1900.0
A3      NaN   4000.0   4000.0   3000.0
A4   1000.0   1000.0   1000.0      0.0
A5      NaN      NaN   1500.0      NaN
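
The DataFrame.combine_first variant mentioned above is not shown, so here is a minimal self-contained sketch of it on the five-row data from the question's edit. The name filler and the string month labels are assumptions made for illustration. Note that A5 keeps NaN in 2018-04 only because df2 contains no 2018-04 date, so the pivoted frame has no such column to fill from.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "ID": ["A1", "A2", "A3", "A4", "A5"],
    "2018-01": [8500, np.nan, np.nan, np.nan, np.nan],
    "2018-02": [8500, 1900, np.nan, np.nan, np.nan],
    "2018-03": [8500, 1900, np.nan, np.nan, np.nan],
    "2018-04": [8500, 1900, 3000, 0, np.nan],
})
df2 = pd.DataFrame({
    "ID": ["A1", "A2", "A3", "A4", "A5"],
    "Date": ["2018-01", "2018-01", "2018-02", "2018-01", "2018-03"],
    "Due": [8500, 9000, 4000, 1000, 1500],
})

# Reshape df2 to the same wide layout, then carry each Due value
# rightwards along its row (only over the months present in df2).
filler = df2.pivot(index="ID", columns="Date", values="Due").ffill(axis=1)

# combine_first keeps df1's non-missing cells and takes the rest from filler.
result = df1.set_index("ID").combine_first(filler)
print(result)
```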

Using pd.crosstab and DataFrame.update:

Since you want to fill the NaN values in one dataframe from another, we can use DataFrame.update. First we set ID as the index, since this method aligns on index and column labels:

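df1 = df1.set_index('ID')
# Due values reshaped to one column per month, then written into df1
df1.update(pd.crosstab(df2['ID'], df2['Date'], values=df2['Due'], aggfunc='sum'))
# carry each value forward along its row
df1 = df1.ffill(axis=1)
print(df1)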

    2018-01  2018-02  2018-03  2018-04
ID                                    
A1   8500.0   8500.0   8500.0   8500.0
A2   9000.0   1900.0   1900.0   1900.0
A3      NaN   4000.0   4000.0   3000.0
A4   1000.0   1000.0   1000.0      0.0
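
A self-contained run of these steps on the five-row data from the question's edit, as a sketch only. Passing overwrite=False to update is an addition here and not part of the answer above: by default update also replaces existing non-NA values in df1, while overwrite=False restricts it to NA cells. Also note that the final row-wise ffill carries A5's 1500 into 2018-04, which differs from the result requested in the edit.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "ID": ["A1", "A2", "A3", "A4", "A5"],
    "2018-01": [8500, np.nan, np.nan, np.nan, np.nan],
    "2018-02": [8500, 1900, np.nan, np.nan, np.nan],
    "2018-03": [8500, 1900, np.nan, np.nan, np.nan],
    "2018-04": [8500, 1900, 3000, 0, np.nan],
})
df2 = pd.DataFrame({
    "ID": ["A1", "A2", "A3", "A4", "A5"],
    "Date": ["2018-01", "2018-01", "2018-02", "2018-01", "2018-03"],
    "Due": [8500, 9000, 4000, 1000, 1500],
})

df1 = df1.set_index("ID")

# One row per ID, one column per month that occurs in df2.
due = pd.crosstab(df2["ID"], df2["Date"], values=df2["Due"], aggfunc="sum")

# overwrite=False (an addition to the answer) only fills cells that are NA.
df1.update(due, overwrite=False)

# Forward fill along each row; this also fills A5's 2018-04 with 1500.
df1 = df1.ffill(axis=1)
print(df1)
```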
