Fill NaN with values from another df based on a condition
I have a df that looks like this:
df1:
Quantity Date Open
0 NaN 2006-01-16 NaN
1 -20.0 2006-01-17 NaN
2 -20.0 2006-01-18 NaN
3 NaN 2006-01-19 NaN
4 20.0 2006-01-20 NaN
. . . .
. . . .
. . . .
and another dataframe that looks like this:
df2:
Date Open Quantity
0 2006-01-16 4567.00 -20.0
1 2006-01-19 4506.00 20.0
2 2006-01-25 4495.05 -20.0
3 2006-01-27 4609.80 20.0
4 2006-02-01 4574.05 -20.0
What I want to do is merge the ['Quantity','Open'] columns of df2 into df1, but only for the rows where df1.Quantity is NaN. So df1 should end up looking like this:
df1:
Quantity Date Open
0 -20.0 2006-01-16 4567.00
1 -20.0 2006-01-17 NaN
2 -20.0 2006-01-18 NaN
3 20.0 2006-01-19 4506.00
4 20.0 2006-01-20 NaN
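What I tried is this code:
df1.Open = df1.loc[df1['Quantity'].isna(), 'Open'].fillna(df2.EntryPrice)
I tried this because I am sure that the dates in df2 are contained in the dates in df1, and that those dates have NaN values in df1.Quantity. But when I run it, this is the result: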
Quantity Date Open
0 -20 2006-01-16 4567.0
1 -20.0 2006-01-17 NaN
2 -20.0 2006-01-18 NaN
3 20 2006-01-19 4609.8
4 20.0 2006-01-20 NaN
... ... ... ...
3317 -20.0 2017-05-23 NaN
3318 NaN 2017-05-23 NaN
3319 20.0 2017-05-24 NaN
3320 20.0 2017-05-25 NaN
3321 20.0 2017-05-26 NaN
As you can see, at row 3318 the NaN values in the Quantity and Open columns are still not filled. Can anyone help me?
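A likely explanation for the wrong value at row 3 (4609.8 instead of 4506.0): `fillna` with a Series aligns on index labels, not on `Date`. Both frames carry the default integer index, so row 3 of df1 picks up row 3 of df2. The sketch below reproduces this with the five sample rows shown above; it uses `df2['Open']` because the sample df2 has no `EntryPrice` column. Row 3318 presumably stays NaN because df2 has no row with that label, though that cannot be confirmed from the excerpt.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question.
df1 = pd.DataFrame({
    "Quantity": [np.nan, -20.0, -20.0, np.nan, 20.0],
    "Date": pd.to_datetime(["2006-01-16", "2006-01-17", "2006-01-18",
                            "2006-01-19", "2006-01-20"]),
    "Open": [np.nan] * 5,
})
df2 = pd.DataFrame({
    "Date": pd.to_datetime(["2006-01-16", "2006-01-19", "2006-01-25",
                            "2006-01-27", "2006-02-01"]),
    "Open": [4567.00, 4506.00, 4495.05, 4609.80, 4574.05],
    "Quantity": [-20.0, 20.0, -20.0, 20.0, -20.0],
})

mask = df1["Quantity"].isna()

# fillna matches on the integer labels 0 and 3, ignoring the Date column,
# so label 3 receives df2's fourth row (2006-01-27), not the 2006-01-19 row.
by_label = df1.loc[mask, "Open"].fillna(df2["Open"])
print(by_label)
```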
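Create a DatetimeIndex in both DataFrames, then replace the missing values in Open for the filtered rows only, and then the missing values in Quantity for all rows where it is missing: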
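# Index both frames by Date so that fillna aligns rows by date
df1 = df1.set_index('Date')
df2 = df2.set_index('Date')
# Rows whose Quantity is missing in df1
mask = df1['Quantity'].isna()
# Fill Open for the masked rows only; rows outside the mask come back as NaN, which is harmless here because Open is all NaN in the sample
df1.Open = df1.loc[mask, 'Open'].fillna(df2.Open)
# Fill every missing Quantity from df2
df1.Quantity = df1['Quantity'].fillna(df2.Quantity)
df1 = df1.reset_index()
print(df1)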
Date Quantity Open
0 2006-01-16 -20.0 4567.0
1 2006-01-17 -20.0 NaN
2 2006-01-18 -20.0 NaN
3 2006-01-19 20.0 4506.0
4 2006-01-20 20.0 NaN
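For anyone who wants to run this end to end, here is a minimal self-contained sketch built from the five sample rows in the question. It is a small variation rather than the exact code above: it writes Open back through `.loc[mask, ...]`, so any Open values already present outside the mask would be kept. The names `left`, `right` and `result` are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question.
df1 = pd.DataFrame({
    "Quantity": [np.nan, -20.0, -20.0, np.nan, 20.0],
    "Date": pd.to_datetime(["2006-01-16", "2006-01-17", "2006-01-18",
                            "2006-01-19", "2006-01-20"]),
    "Open": [np.nan] * 5,
})
df2 = pd.DataFrame({
    "Date": pd.to_datetime(["2006-01-16", "2006-01-19", "2006-01-25",
                            "2006-01-27", "2006-02-01"]),
    "Open": [4567.00, 4506.00, 4495.05, 4609.80, 4574.05],
    "Quantity": [-20.0, 20.0, -20.0, 20.0, -20.0],
})

# Index both frames by Date so that fillna aligns on dates.
left = df1.set_index("Date")
right = df2.set_index("Date")

# Rows whose Quantity is missing in df1.
mask = left["Quantity"].isna()

# Fill Open only where Quantity was missing, leaving other rows untouched.
left.loc[mask, "Open"] = left.loc[mask, "Open"].fillna(right["Open"])

# Fill every missing Quantity from the matching date in df2.
left["Quantity"] = left["Quantity"].fillna(right["Quantity"])

result = left.reset_index()
print(result)
```

Because alignment happens on the index, this assumes the dates in df2 are unique; duplicated dates in df2 would make the lookup ambiguous.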