Fill NaN in both columns if either value is present
I have two columns in a df. Sometimes there is a NaN in one of the columns, sometimes in both. If either column has a value in a row, I want to fill the NaN in that row with that same value.
For example, input:
col1 col2
0 3.375000 4.075000
1 2.450000 1.567100
2 NaN NaN
3 3.248083 NaN
4 NaN 2.335725
5 2.150000 3.218750
Output:
col1 col2
0 3.375000 4.075000
1 2.450000 1.567100
2 NaN NaN
3 3.248083 3.248083
4 2.335725 2.335725
5 2.150000 3.218750
For this I tried:
print(df.T.fillna(method='bfill').fillna(method='ffill').T)
The above gives me the required result, but I think it adds unnecessary complexity to my code. Is there a better approach?
You don't have to transpose; you can specify an axis:
df.ffill(axis=1).bfill(axis=1)
col1 col2
0 3.375000 4.075000
1 2.450000 1.567100
2 NaN NaN
3 3.248083 3.248083
4 2.335725 2.335725
5 2.150000 3.218750
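To try this end to end, here is a minimal sketch that rebuilds the sample frame from the question. The variable name `filled` is only for illustration. The axis is passed by keyword because, as far as I know, pandas 2.x no longer accepts it positionally.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": [3.375, 2.45, np.nan, 3.248083, np.nan, 2.15],
    "col2": [4.075, 1.5671, np.nan, np.nan, 2.335725, 3.21875],
})

# Fill along each row: left to right first, then right to left.
# A row where both columns are NaN has nothing to copy from and stays NaN.
filled = df.ffill(axis=1).bfill(axis=1)
print(filled)
```

Note that with more than two columns the order matters: ffill runs first, so a gap takes the nearest value to its left, and bfill only covers gaps at the start of a row.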
If you have multiple columns but don't want to touch some of them, you can slice, fill, and assign back.
df
col1 col2 col3
0 3.375000 4.075000 NaN
1 2.450000 1.567100 2.0
2 NaN NaN 3.0
3 3.248083 NaN 5.0
4 NaN 2.335725 NaN
5 2.150000 3.218750 5.0
include = ['col1', 'col2']
# Or,
# exclude = ['col3']
# include = df.columns.difference(exclude)
df[include] = df[include].ffill(axis=1).bfill(axis=1)
df
col1 col2 col3
0 3.375000 4.075000 NaN
1 2.450000 1.567100 2.0
2 NaN NaN 3.0
3 3.248083 3.248083 5.0
4 2.335725 2.335725 NaN
5 2.150000 3.218750 5.0
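If you need this in more than one place, the slice, fill, and assign-back steps can be wrapped in a small helper. This is only a sketch: `fill_across` is a hypothetical name, not a pandas function, and it works on a copy so the caller's frame is left unchanged.

```python
import numpy as np
import pandas as pd

def fill_across(frame, columns):
    """Return a copy of `frame` where NaN in `columns` is filled row-wise
    from the other listed columns; all remaining columns are untouched."""
    out = frame.copy()
    out[columns] = out[columns].ffill(axis=1).bfill(axis=1)
    return out

# Small illustrative frame (made-up values in col3, as in the example above).
df = pd.DataFrame({
    "col1": [3.375, np.nan, np.nan],
    "col2": [np.nan, 2.335725, np.nan],
    "col3": [np.nan, 2.0, 3.0],
})

result = fill_across(df, ["col1", "col2"])
print(result)
```

Because col3 is not in the list, its NaN stays NaN and its values are never copied into col1 or col2.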
If there are only two columns, you can also use combine_first:
df.col1 = df.col1.combine_first(df.col2)
df.col2 = df.col2.combine_first(df.col1)
col1 col2
0 3.375000 4.075000
1 2.450000 1.567100
2 NaN NaN
3 3.248083 3.248083
4 2.335725 2.335725
5 2.150000 3.218750
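The same idea can be written without modifying df in place. The sketch below assumes the sample data from the question and uses `DataFrame.assign` to build a new frame; `filled` is just an illustrative name. Both new columns are computed from the original df, which is fine here because each column only borrows from the other where it is itself NaN.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": [3.375, 2.45, np.nan, 3.248083, np.nan, 2.15],
    "col2": [4.075, 1.5671, np.nan, np.nan, 2.335725, 3.21875],
})

# combine_first keeps the caller's value and falls back to the other
# Series only where the caller is NaN.
filled = df.assign(
    col1=df["col1"].combine_first(df["col2"]),
    col2=df["col2"].combine_first(df["col1"]),
)
print(filled)
```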