In Pandas, how to replace a zero value with the nearest non-NaN value?

I have a dataframe where the column looks like this:

NaN
859.0
NaN
NaN
0.0
NaN

I would like to replace the zero with the previous non-NaN value and leave the other NaN values unchanged, so that I get this:

NaN
859.0
NaN
NaN
859.0
NaN

I've tried replace with ffill, but I can't manage to get the right output.

Any help welcome!

s.ffill().shift() gives, for each row, the last non-null value that appears before it; you can then assign that value to any row where the value is 0:

In [42]: s.ffill().shift()
Out[42]:
0      NaN
1      NaN
2    859.0
3    859.0
4    859.0
5      0.0
dtype: float64

In [43]: s[s==0] = s.ffill().shift()

In [44]: s
Out[44]:
0      NaN
1    859.0
2      NaN
3      NaN
4    859.0
5      NaN
dtype: float64
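
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The Series s is rebuilt from the sample in the question. The second Series t is a made-up case that is not in the question; it is there to show a situation this approach does not seem to cover: with two zeros in a row, the second zero picks up the first zero as its "previous" value and stays 0.

```python
import numpy as np
import pandas as pd

# Sample column from the question
s = pd.Series([np.nan, 859.0, np.nan, np.nan, 0.0, np.nan])

# Previous non-NaN value for each row: forward-fill, then shift down one row
prev = s.ffill().shift()

# Overwrite only the zero rows; the assignment aligns on the index
s[s == 0] = prev
print(s.tolist())

# Hypothetical extra case (not from the question): consecutive zeros
t = pd.Series([859.0, 0.0, 0.0])
t[t == 0] = t.ffill().shift()
print(t.tolist())  # the second zero is left as 0.0
```

If consecutive zeros can occur in your data, turning the zeros into NaN before forward-filling avoids this.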

First replace the 0 values with missing values using Series.mask, then forward fill the missing values with ffill, and finally use Series.mask again to set the originally missing values back to NaN:

df['col'] = df['col'].mask(df['col'].eq(0)).ffill().mask(df['col'].isna())
print (df)
     col
0    NaN
1  859.0
2    NaN
3    NaN
4  859.0
5    NaN
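
The same chain can be wrapped in a small helper, which makes each step easier to follow. This is only a sketch: the function name fill_zeros_with_previous is my own, and the sample frame has one extra zero added (not in the question) to check that consecutive zeros are both filled.

```python
import numpy as np
import pandas as pd

def fill_zeros_with_previous(col: pd.Series) -> pd.Series:
    """Replace zeros with the last preceding non-NaN, non-zero value."""
    was_nan = col.isna()
    # Zeros become NaN so that ffill overwrites them
    filled = col.mask(col.eq(0)).ffill()
    # Restore NaN wherever the input was NaN to begin with
    return filled.mask(was_nan)

# Question's sample with one extra zero (hypothetical) at position 5
df = pd.DataFrame({'col': [np.nan, 859.0, np.nan, np.nan, 0.0, 0.0, np.nan]})
df['col'] = fill_zeros_with_previous(df['col'])
print(df['col'].tolist())
```

Note that a zero with no earlier non-NaN value has nothing to be filled from, so it ends up as NaN rather than staying 0.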

You could also do this with last_valid_index:

Say your column is in df['col']:

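for i, _ in df.iterrows():
    if df.loc[i, 'col'] == 0:
        # label of the last non-NaN row before row i (assumes a default integer index)
        prev = df.loc[:i-1, 'col'].last_valid_index()
        if prev is not None:  # nothing to copy if no earlier value exists
            df.at[i, 'col'] = df.loc[prev, 'col']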

Output:

     col
0    NaN
1  859.0
2    NaN
3    NaN
4  859.0
5    NaN
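
The loop above relies on integer row labels (i-1). If the index is something else, one alternative is a plain single pass that keeps track of the last non-NaN value itself instead of calling last_valid_index. This is a different technique from the one in the answer, shown only as a sketch; the string index and the names last_seen and out are made up for illustration.

```python
import numpy as np
import pandas as pd

# Question's sample, but with a non-integer index (hypothetical)
df = pd.DataFrame({'col': [np.nan, 859.0, np.nan, np.nan, 0.0, np.nan]},
                  index=list('abcdef'))

last_seen = np.nan  # most recent non-NaN value encountered so far
out = []
for v in df['col']:
    # Replace a zero only if an earlier non-NaN value exists
    if v == 0 and not np.isnan(last_seen):
        v = last_seen
    if not np.isnan(v):
        last_seen = v
    out.append(v)

df['col'] = out
print(df)
```

For large frames the vectorized approaches are likely to be faster than either loop.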
