Fill missing values in pandas based on values of other columns

I have the following dataframe:

    A    B   C   D
0  NaN  2.0 NaN  0
1  3.0  4.0 NaN  1
2  NaN  NaN NaN  5
3  NaN  3.0 NaN  4

Now I want to fill the null values of A with the values in B or D, i.e. if the value in B is also null, then check D. The resulting dataframe should look like this:

     A    B   C  D
0  2.0  2.0 NaN  0
1  3.0  4.0 NaN  1
2  5.0  NaN NaN  5
3  3.0  3.0 NaN  4

I can do this using the following code:

df['A'] = df['A'].fillna(df['B'])
df['A'] = df['A'].fillna(df['D'])

But I want to do this in one line. How can I do that?

You could simply chain both .fillna() calls:

df['A'] = df.A.fillna(df.B).fillna(df.D)

    A    B   C   D
0  2.0  2.0 NaN  0
1  3.0  4.0 NaN  1
2  5.0  NaN NaN  5
3  3.0  3.0 NaN  4
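
For anyone who wants to reproduce the chained version, here is a minimal sketch. The way the frame is constructed is an assumption reconstructed from the printed table in the question (A, B and C as floats, D as integers); only the chained fillna line comes from the answer itself.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's printed output (an assumption)
df = pd.DataFrame({
    "A": [np.nan, 3.0, np.nan, np.nan],
    "B": [2.0, 4.0, np.nan, 3.0],
    "C": [np.nan] * 4,
    "D": [0, 1, 5, 4],
})

# Fill A from B first; rows where B is also missing fall back to D
df["A"] = df["A"].fillna(df["B"]).fillna(df["D"])

print(df["A"].tolist())  # [2.0, 3.0, 5.0, 3.0]
```

The order of the calls sets the priority: B is consulted before D, so row 0 takes 2.0 from B rather than 0 from D.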

Or use fillna with combine_first:

df['A'] = df.A.fillna(df.B.combine_first(df.D))
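
To see what combine_first contributes here, the sketch below builds the fallback Series separately before using it. The sample frame and the variable name fallback are assumptions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's printed output (an assumption)
df = pd.DataFrame({
    "A": [np.nan, 3.0, np.nan, np.nan],
    "B": [2.0, 4.0, np.nan, 3.0],
    "C": [np.nan] * 4,
    "D": [0, 1, 5, 4],
})

# combine_first keeps B where it is present and takes D where B is missing
fallback = df["B"].combine_first(df["D"])
print(fallback.tolist())  # [2.0, 4.0, 5.0, 3.0]

# Only the missing entries of A are replaced by the fallback values
df["A"] = df["A"].fillna(fallback)
print(df["A"].tolist())  # [2.0, 3.0, 5.0, 3.0]
```

Since combine_first already means "use my values, else the other's", df.A.combine_first(df.B).combine_first(df.D) should give the same column.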

If you would rather not chain calls because there are many fallback columns, it is better to back-fill the missing values across those columns and then select the first column by position:

df['A'] = df['A'].fillna(df[['B','D']].bfill(axis=1).iloc[:, 0])
print (df)
     A    B   C  D
0  2.0  2.0 NaN  0
1  3.0  4.0 NaN  1
2  5.0  NaN NaN  5
3  3.0  3.0 NaN  4
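
The back-fill approach generalizes to any number of fallback columns. Below is a hedged sketch of a small helper; the function name fill_from_columns and its parameters are hypothetical, and the sample frame is again rebuilt from the question.

```python
import numpy as np
import pandas as pd

def fill_from_columns(df, target, sources):
    """Fill NaNs in `target` with the first non-null value found in
    `sources`, checked left to right. Hypothetical helper, for illustration."""
    # Back-filling along the row pulls the first valid value into the
    # leftmost source column, which is then selected by position
    first_valid = df[sources].bfill(axis=1).iloc[:, 0]
    return df[target].fillna(first_valid)

# Sample frame rebuilt from the question's printed output (an assumption)
df = pd.DataFrame({
    "A": [np.nan, 3.0, np.nan, np.nan],
    "B": [2.0, 4.0, np.nan, 3.0],
    "C": [np.nan] * 4,
    "D": [0, 1, 5, 4],
})

# The order of the sources sets the priority: B, then C, then D
df["A"] = fill_from_columns(df, "A", ["B", "C", "D"])
print(df["A"].tolist())  # [2.0, 3.0, 5.0, 3.0]
```

Including the all-NaN column C in the sources changes nothing, because it is skipped in every row.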
