Fill missing values in pandas based on values of other columns
I have the following dataframe:
A B C D
0 NaN 2.0 NaN 0
1 3.0 4.0 NaN 1
2 NaN NaN NaN 5
3 NaN 3.0 NaN 4
Now I want to fill the null values of A with the values in B or D, i.e. if the value in B is also null, then use D. The resulting dataframe should look like this:
A B C D
0 2.0 2.0 NaN 0
1 3.0 4.0 NaN 1
2 5.0 NaN NaN 5
3 3.0 3.0 NaN 4
I can do this using the following code:
df['A'] = df['A'].fillna(df['B'])
df['A'] = df['A'].fillna(df['D'])
But I want to do this in one line. How can I do that?
You could simply chain both .fillna() calls:
df['A'] = df.A.fillna(df.B).fillna(df.D)
A B C D
0 2.0 2.0 NaN 0
1 3.0 4.0 NaN 1
2 5.0 NaN NaN 5
3 3.0 3.0 NaN 4
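For anyone who wants to try the chained version end to end, here is a minimal self-contained sketch. The dataframe is rebuilt by hand from the table in the question, so the exact dtypes (float for A, B, C and integer for D) are an assumption.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (dtypes are assumed).
df = pd.DataFrame({
    "A": [np.nan, 3.0, np.nan, np.nan],
    "B": [2.0, 4.0, np.nan, 3.0],
    "C": [np.nan] * 4,
    "D": [0, 1, 5, 4],
})

# Fill A from B first, then fill whatever is still missing from D.
# fillna with a Series aligns on the index, so each row takes its own value.
df["A"] = df["A"].fillna(df["B"]).fillna(df["D"])
print(df)
```

The order of the chain sets the priority: D is only consulted for rows where both A and B are missing.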
Or use fillna with combine_first:
df['A'] = df.A.fillna(df.B.combine_first(df.D))
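To see what combine_first contributes here, the sketch below computes the intermediate fallback Series on its own. The standalone Series `a`, `b`, `d` and the name `fallback` are illustrative stand-ins for the columns in the question, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Stand-ins for columns A, B and D from the question.
a = pd.Series([np.nan, 3.0, np.nan, np.nan])
b = pd.Series([2.0, 4.0, np.nan, 3.0])
d = pd.Series([0, 1, 5, 4])

# combine_first keeps b where it is not null and takes d otherwise.
fallback = b.combine_first(d)

# A single fillna then applies the merged fallback to the missing rows of a.
filled = a.fillna(fallback)
print(fallback.tolist())
print(filled.tolist())
```

Since combine_first already means "use my value, else the other one", a.combine_first(b).combine_first(d) should give the same result as the fillna chain.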
If you don't want to chain because there are many columns, it is better to back-fill the missing values along each row and then select the first column by position:
df['A'] = df['A'].fillna(df[['B','D']].bfill(axis=1).iloc[:, 0])
print(df)
A B C D
0 2.0 2.0 NaN 0
1 3.0 4.0 NaN 1
2 5.0 NaN NaN 5
3 3.0 3.0 NaN 4
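The back-fill approach extends to any number of fallback columns. The sketch below wraps it in a small helper. The function name `fill_from_columns` and its parameters are hypothetical, introduced only for illustration, and the dataframe is rebuilt by hand from the question.

```python
import numpy as np
import pandas as pd


def fill_from_columns(df, target, fallbacks):
    """Return df[target] with NaNs filled from the first non-null fallback column.

    Hypothetical helper: `fallbacks` is a list of column names in priority order.
    """
    # Back-filling along axis=1 pulls the first non-null value in each row
    # into the leftmost column, which is then selected by position.
    first_valid = df[fallbacks].bfill(axis=1).iloc[:, 0]
    return df[target].fillna(first_valid)


df = pd.DataFrame({
    "A": [np.nan, 3.0, np.nan, np.nan],
    "B": [2.0, 4.0, np.nan, 3.0],
    "C": [np.nan] * 4,
    "D": [0, 1, 5, 4],
})

df["A"] = fill_from_columns(df, "A", ["B", "D"])
print(df)
```

The order of the list decides which column wins, so ["B", "D"] reproduces the B-then-D priority from the question.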