
Fill values in another dataframe based on data from first one

I have a dataframe like this:

ID  2018-01    2018-02   2018-03   2018-04
A1  8500        8500        8500      8500      
A2   NA         1900        1900      1900       
A3   NA          NA          NA       3000      
A4   NA          NA          NA          0       

I have another dataframe whose values I want to use to fill the NA values:

ID   Date    Due  
A1   2018-01  8500
A2   2018-01  9000
A3   2018-02  4000
A4   2018-01  1000

For each ID, starting from the month given in the Date column and continuing up to the next non-NA value in the first dataframe, I want to fill in the value from the Due column. So the result is this:

ID  2018-01    2018-02   2018-03   2018-04
A1  8500        8500        8500      8500      
A2  9000        1900        1900      1900       
A3   NA         4000        4000      3000      
A4  1000        1000        1000         0   

How could I do that?

EDIT: There is also a case where a row has no prepopulated values at all:

    ID  2018-01    2018-02   2018-03   2018-04
    A1  8500        8500        8500      8500      
    A2   NA         1900        1900      1900       
    A3   NA          NA          NA       3000      
    A4   NA          NA          NA          0   
    A5   NA          NA          NA         NA


ID   Date    Due  
A1   2018-01  8500
A2   2018-01  9000
A3   2018-02  4000
A4   2018-01  1000
A5   2018-03  1500

In such a case, is it possible to put the corresponding value only in the one column that matches the date, without filling it all the way forward?

So the result:

ID  2018-01    2018-02   2018-03   2018-04
A1  8500        8500        8500      8500      
A2  9000        1900        1900      1900       
A3   NA         4000        4000      3000      
A4  1000        1000        1000         0  
A5   NA          NA          1500       NA

If ID is a column in df1, use DataFrame.pivot to reshape df2, forward fill the missing values along each row, and finally replace the missing values in df1 with DataFrame.fillna or DataFrame.combine_first:

# pivot() takes keyword-only arguments in pandas 2.0 and later
df = df1.set_index('ID').fillna(df2.pivot(index='ID', columns='Date', values='Due').ffill(axis=1))
print(df)
    2018-01  2018-02  2018-03  2018-04
ID                                    
A1   8500.0   8500.0   8500.0   8500.0
A2   9000.0   1900.0   1900.0   1900.0
A3      NaN   4000.0   4000.0   3000.0
A4   1000.0   1000.0   1000.0      0.0
A5      NaN      NaN   1500.0      NaN
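
For a reproducible check, here is a minimal sketch that rebuilds the two frames from the question and uses DataFrame.combine_first, the alternative named above, in place of fillna. The variable names `wide` and `filled` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question, including the all-NA row A5.
df1 = pd.DataFrame({
    "ID": ["A1", "A2", "A3", "A4", "A5"],
    "2018-01": [8500, np.nan, np.nan, np.nan, np.nan],
    "2018-02": [8500, 1900, np.nan, np.nan, np.nan],
    "2018-03": [8500, 1900, np.nan, np.nan, np.nan],
    "2018-04": [8500, 1900, 3000, 0, np.nan],
})
df2 = pd.DataFrame({
    "ID": ["A1", "A2", "A3", "A4", "A5"],
    "Date": ["2018-01", "2018-01", "2018-02", "2018-01", "2018-03"],
    "Due": [8500, 9000, 4000, 1000, 1500],
})

# One row per ID, one column per month that appears in df2,
# with each Due value carried forward along the row.
wide = df2.pivot(index="ID", columns="Date", values="Due").ffill(axis=1)

# Keep df1's own values and fall back to `wide` only where df1 is NA.
filled = df1.set_index("ID").combine_first(wide)
print(filled)
```

Note that the forward fill only spans the months that occur in df2's Date column. A5 stays NA in 2018-04 here because no row of df2 carries that date, so `wide` has no 2018-04 column to fill from.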

Using pd.crosstab and DataFrame.update:

Since you want to fill the NaN values of one dataframe from another, we can use DataFrame.update. First set ID as the index, because this method aligns on the index and the columns:

df1 = df1.set_index('ID')
# overwrite=False only fills cells that are NaN in df1
df1.update(pd.crosstab(df2['ID'], df2['Date'], values=df2['Due'], aggfunc='sum'), overwrite=False)
df1 = df1.ffill(axis=1)
print(df1)

    2018-01  2018-02  2018-03  2018-04
ID                                    
A1   8500.0   8500.0   8500.0   8500.0
A2   9000.0   1900.0   1900.0   1900.0
A3      NaN   4000.0   4000.0   3000.0
A4   1000.0   1000.0   1000.0      0.0
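
The output above covers the original data without the A5 row. With A5 included, a plain ffill over every row would carry 1500 forward into 2018-04, which the EDIT in the question asks to avoid. One possible way to handle that, sketched below under the assumption that "no prepopulated values" means the row is entirely NA before the update, is to record those rows first and forward fill only the others. The names `empty_rows` and `seed` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question, including the all-NA row A5.
df1 = pd.DataFrame({
    "ID": ["A1", "A2", "A3", "A4", "A5"],
    "2018-01": [8500, np.nan, np.nan, np.nan, np.nan],
    "2018-02": [8500, 1900, np.nan, np.nan, np.nan],
    "2018-03": [8500, 1900, np.nan, np.nan, np.nan],
    "2018-04": [8500, 1900, 3000, 0, np.nan],
}).set_index("ID")
df2 = pd.DataFrame({
    "ID": ["A1", "A2", "A3", "A4", "A5"],
    "Date": ["2018-01", "2018-01", "2018-02", "2018-01", "2018-03"],
    "Due": [8500, 9000, 4000, 1000, 1500],
})

# Rows with no prepopulated values at all (only A5 here).
empty_rows = df1.isna().all(axis=1)

# ID x Date table of Due values; combinations absent from df2 are NaN.
seed = pd.crosstab(df2["ID"], df2["Date"], values=df2["Due"], aggfunc="sum")

# Place each Due value in its month, touching only cells that are NaN in df1.
df1.update(seed, overwrite=False)

# Forward fill along the columns, but only for rows that had data to begin with.
df1.loc[~empty_rows] = df1.loc[~empty_rows].ffill(axis=1)
print(df1)
```

Rows A1 to A4 come out as in the output above, while A5 keeps 1500 in 2018-03 only.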
