Pandas DataFrame: how to fill `nan` with 0s but only nans existing between valid values?
What I'd like to do:
In [2]: b = pd.DataFrame({"a": [np.nan, 1, np.nan, 2, np.nan]})
Out[2]:
a
0 nan
1 1.000
2 nan
3 2.000
4 nan
Expected output:
a
0 nan
1 1.000
2 0
3 2.000
4 nan
As you can see here, only nans that are surrounded by valid values are replaced with 0.
How can I do this?
df.interpolate(limit_area='inside') looks good to me, but it doesn't have an argument to fill with 0s...
interpolate, isna, notna and loc:
You can use interpolate, then check which positions are NaN in your original data and which are filled in the interpolated result, and replace the values at those positions with 0:
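# df is the DataFrame from the question (named b there)
s = df['a'].interpolate(limit_area='inside')
m1 = df['a'].isna()   # NaN in the original data
m2 = s.notna()        # valid after the inside-only interpolation
df.loc[m1 & m2, 'a'] = 0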
a
0 NaN
1 1.0
2 0.0
3 2.0
4 NaN
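For reference, the steps above can be wrapped into a self-contained sketch. The helper name fill_inner_nans is hypothetical (it is not part of pandas), and a numeric column is assumed. It fills every NaN lying between the first and last valid values, including runs of consecutive NaNs.

```python
import numpy as np
import pandas as pd


def fill_inner_nans(series, value=0):
    """Hypothetical helper: replace NaNs between the first and last valid values."""
    # limit_area='inside' only fills NaNs that are surrounded by valid values
    interpolated = series.interpolate(limit_area='inside')
    # NaN in the original but filled by the interpolation -> an "inner" NaN
    inner = series.isna() & interpolated.notna()
    # mask() replaces the entries where the condition is True
    return series.mask(inner, value)


df = pd.DataFrame({"a": [np.nan, 1, np.nan, 2, np.nan]})
df["a"] = fill_inner_nans(df["a"])
print(df)
```

Returning a new Series rather than assigning through loc leaves the original column untouched until you decide to overwrite it.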
shift and loc:
An easier method would be to check whether the previous row and the next row are not NaN, and fill those positions with 0:
m1 = df['a'].shift().notna()
m2 = df['a'].shift(-1).notna()
m3 = df['a'].isna()
df.loc[m1&m2&m3, 'a'] = 0
a
0 NaN
1 1.0
2 0.0
3 2.0
4 NaN
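One difference worth knowing: the shift-based mask only matches a NaN whose immediate neighbours are both valid, so a run of two or more consecutive NaNs is left alone, whereas the interpolate approach fills the whole run. A small sketch comparing the two masks on a made-up Series (the data here is illustrative, not from the question):

```python
import numpy as np
import pandas as pd

# Illustrative data with two consecutive NaNs between valid values
s = pd.Series([np.nan, 1, np.nan, np.nan, 2, np.nan])

# shift-based mask: previous and next rows must both be valid
shift_mask = s.shift().notna() & s.shift(-1).notna() & s.isna()

# interpolate-based mask: any NaN between the first and last valid value
inside_mask = s.isna() & s.interpolate(limit_area='inside').notna()

print(shift_mask.tolist())   # [False, False, False, False, False, False]
print(inside_mask.tolist())  # [False, False, True, True, False, False]
```

Which behaviour is wanted depends on whether "surrounded by valid values" means adjacent neighbours or anywhere inside the valid range.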
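import numpy as np
import pandas as pd
b = pd.DataFrame({"a": [np.nan, 1, np.nan, 2, np.nan, 3, np.nan]})
print('Before:', b['a'])
####### Solution One ######
a = b[b['a'].isna()]  # rows that are NaN
# idx is the index label; with the default RangeIndex it equals the position
for idx, _ in a.iterrows():
    pre = idx - 1
    post = idx + 1
    if pre < 0 or post >= len(b):
        continue  # the first and last rows have only one neighbour
    if not np.isnan(b.iloc[pre, 0]) and not np.isnan(b.iloc[post, 0]):
        b.iloc[idx, 0] = 0
print('After:', b['a'])
###### Solution Two #######
def series_extract(index, series):
    # NaN flags for the previous, current and next rows
    return series.iloc[index - 1:index + 2].isna().to_numpy()
def fill_in_between_na(df, column):
    series = df[column]
    index = []
    for i in range(1, len(series) - 1):
        # keep positions matching the pattern valid, NaN, valid
        if np.array_equal(series_extract(i, series), [False, True, False]):
            index.append(i)
    # one .iloc assignment avoids chained indexing (df[column][index] = 0)
    df.iloc[index, df.columns.get_loc(column)] = 0
    return df
b = pd.DataFrame({"a": [np.nan, 1, np.nan, 2, np.nan, 3, np.nan]})  # fresh data
fill_in_between_na(b, 'a')
print('After:', b['a'])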
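The loops above can also be expressed without explicit iteration. As an alternative sketch (not taken from the answers above), a NaN lies between valid values exactly when both a forward fill and a backward fill reach it. Unlike the loops, this also fills runs of consecutive NaNs.

```python
import numpy as np
import pandas as pd

b = pd.DataFrame({"a": [np.nan, 1, np.nan, 2, np.nan, 3, np.nan]})

col = b["a"]
# ffill leaves leading NaNs untouched, bfill leaves trailing NaNs untouched,
# so only NaNs with a valid value on both sides pass both checks
between = col.isna() & col.ffill().notna() & col.bfill().notna()
b.loc[between, "a"] = 0
print(b["a"].tolist())  # [nan, 1.0, 0.0, 2.0, 0.0, 3.0, nan]
```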