
Fill NaN with values from another df based on a condition

I have a df that looks like this

df1:

    Quantity     Date      Open
0       NaN    2006-01-16   NaN
1     -20.0    2006-01-17   NaN
2     -20.0    2006-01-18   NaN
3       NaN    2006-01-19   NaN
4      20.0    2006-01-20   NaN
.        .         .         .
.        .         .         .
.        .         .         .

and another dataframe that looks like this

df2:
          Date       Open     Quantity
0    2006-01-16     4567.00     -20.0
1    2006-01-19     4506.00      20.0
2    2006-01-25     4495.05     -20.0
3    2006-01-27     4609.80      20.0
4    2006-02-01     4574.05     -20.0   

What I want to do is merge the ['Quantity', 'Open'] columns of df2 into df1, but only on rows where df1.Quantity is NaN. Therefore, df1 should look like this:

df1:

    Quantity     Date      Open
0     -20.0    2006-01-16   4567.00
1     -20.0    2006-01-17   NaN
2     -20.0    2006-01-18   NaN
3      20.0    2006-01-19   4506.00
4      20.0    2006-01-20   NaN

What I tried is this code: df1.Open = df1.loc[df1['Quantity'].isna(), 'Open'].fillna(df2.EntryPrice). I tried this because I'm sure that every date in df2 also appears in df1 and has a NaN value in df1.Quantity. However, when I ran it, this was the result:

      Quantity       Date    Open
0          -20 2006-01-16  4567.0
1        -20.0 2006-01-17     NaN
2        -20.0 2006-01-18     NaN
3           20 2006-01-19  4609.8
4         20.0 2006-01-20     NaN
...        ...        ...     ...
3317     -20.0 2017-05-23     NaN
3318       NaN 2017-05-23     NaN
3319      20.0 2017-05-24     NaN
3320      20.0 2017-05-25     NaN
3321      20.0 2017-05-26     NaN

As you can see, at row 3318 the NaN values in the Quantity and Open columns are still unfilled. Can someone help me?

Set Date as the index in both DataFrames so that fillna aligns rows on dates, then fill the missing values in Open only for the filtered rows, and fill all missing values in Quantity:

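# Index both frames by Date so that fillna aligns rows on dates
df1 = df1.set_index('Date')
df2 = df2.set_index('Date')
# Rows whose Quantity is missing in df1
mask = df1['Quantity'].isna()

# Fill Open only on the masked rows, then fill every missing Quantity
df1.Open = df1.loc[mask, 'Open'].fillna(df2.Open)
df1.Quantity = df1['Quantity'].fillna(df2.Quantity)
df1 = df1.reset_index()
print(df1)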
         Date  Quantity    Open
0  2006-01-16     -20.0  4567.0
1  2006-01-17     -20.0     NaN
2  2006-01-18     -20.0     NaN
3  2006-01-19      20.0  4506.0
4  2006-01-20      20.0     NaN
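
To see why the index matters, here is a minimal self-contained sketch built from the five sample rows in the question. It assumes the Date values are unique in both frames and uses df2's Open column (the question's attempt used EntryPrice, which does not appear in the sample df2). The names by_position, a, b, m and result are only illustrative. With the default integer index, fillna pairs row label 3 of df1 with row label 3 of df2, which is likely why the attempt put 4609.8 on 2006-01-19. With Date as the index, the same call pairs rows by date.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (first five rows of each frame)
df1 = pd.DataFrame({
    "Quantity": [np.nan, -20.0, -20.0, np.nan, 20.0],
    "Date": pd.to_datetime(["2006-01-16", "2006-01-17", "2006-01-18",
                            "2006-01-19", "2006-01-20"]),
    "Open": [np.nan] * 5,
})
df2 = pd.DataFrame({
    "Date": pd.to_datetime(["2006-01-16", "2006-01-19", "2006-01-25",
                            "2006-01-27", "2006-02-01"]),
    "Open": [4567.00, 4506.00, 4495.05, 4609.80, 4574.05],
    "Quantity": [-20.0, 20.0, -20.0, 20.0, -20.0],
})

mask = df1["Quantity"].isna()

# Default integer index: fillna aligns on row labels, not on dates,
# so df1 row 3 (2006-01-19) receives df2 row 3 (2006-01-27).
by_position = df1.loc[mask, "Open"].fillna(df2["Open"])

# Date index: fillna now aligns rows that share the same date.
a = df1.set_index("Date")
b = df2.set_index("Date")
m = a["Quantity"].isna()

# Assign through .loc so Open on rows outside the mask is left untouched
a.loc[m, "Open"] = a.loc[m, "Open"].fillna(b["Open"])
a["Quantity"] = a["Quantity"].fillna(b["Quantity"])
result = a.reset_index()
```

Assigning through .loc[m, 'Open'] is a small variation on the code above: it only writes to the masked rows, so any Open values already present elsewhere in df1 would be kept.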
