Pandas: Fill NaNs with next non-NaN / # consecutive NaNs
I'm looking to take a pandas Series and fill each NaN with the average of the next numerical value, where: average = next numerical value / (# consecutive NaNs + 1)
Here's my code so far. I just can't figure out how to divide the filler column among the NaNs (and the next numerical value as well) in num:
import pandas as pd

dates = pd.date_range(start='1/1/2016', end='1/12/2016', freq='D')
nums = [10, 12, None, None, 39, 10, 11, None, None, None, None, 60]
df = pd.DataFrame({
    'date': dates,
    'num': nums
})
# Back-fill each NaN with the next non-null value
# (.bfill() replaces the deprecated fillna(method='bfill'))
df['filler'] = df['num'].bfill()
Current Output:
         date   num  filler
0  2016-01-01  10.0    10.0
1  2016-01-02  12.0    12.0
2  2016-01-03   NaN    39.0
3  2016-01-04   NaN    39.0
4  2016-01-05  39.0    39.0
5  2016-01-06  10.0    10.0
6  2016-01-07  11.0    11.0
7  2016-01-08   NaN    60.0
8  2016-01-09   NaN    60.0
9  2016-01-10   NaN    60.0
10 2016-01-11   NaN    60.0
11 2016-01-12  60.0    60.0
Desired Output:
         date   num
0  2016-01-01  10.0
1  2016-01-02  12.0
2  2016-01-03  13.0
3  2016-01-04  13.0
4  2016-01-05  13.0
5  2016-01-06  10.0
6  2016-01-07  11.0
7  2016-01-08  12.0
8  2016-01-09  12.0
9  2016-01-10  12.0
10 2016-01-11  12.0
11 2016-01-12  12.0
Use cumsum of notnull on the reversed series to define the groups, then groupby and transform with mean.
csum = df.num.notnull()[::-1].cumsum()
filler = df.num.fillna(0).groupby(csum).transform('mean')
df.assign(filler=filler)
         date   num  filler
0  2016-01-01  10.0    10.0
1  2016-01-02  12.0    12.0
2  2016-01-03   NaN    13.0
3  2016-01-04   NaN    13.0
4  2016-01-05  39.0    13.0
5  2016-01-06  10.0    10.0
6  2016-01-07  11.0    11.0
7  2016-01-08   NaN    12.0
8  2016-01-09   NaN    12.0
9  2016-01-10   NaN    12.0
10 2016-01-11   NaN    12.0
11 2016-01-12  60.0    12.0
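To make the grouping key visible, here is a minimal sketch that prints the group labels next to the result. The names s and labels, and the extra group column, are introduced only for this illustration and are not part of the answer's code.

```python
import pandas as pd

nums = [10, 12, None, None, 39, 10, 11, None, None, None, None, 60]
s = pd.Series(nums, name="num")

# Reverse the series, flag non-null values, and take a running count.
# Each non-null value starts a new group that also collects the NaNs
# that come just before it in the original order.
csum = s.notnull()[::-1].cumsum()

# Sort by index only to view the labels in the original row order.
labels = csum.sort_index()

# NaN -> 0, so each group's mean is (next value) / (NaN count + 1).
filler = s.fillna(0).groupby(csum).transform("mean")

print(pd.DataFrame({"num": s, "group": labels, "filler": filler}))
```

Rows 2 to 4 share one label and rows 7 to 11 share another, which is why 39 is spread over three rows and 60 over five.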
how it works

df.num.notnull().cumsum() is a standard technique to find groups of contiguous nulls. However, I wanted my groups to end with the next numeric value, so I reversed the series before I cumsum'd.
fillna(0) makes the nulls count toward the group size, so a normal mean over each group equals next numerical value / (# consecutive NaNs + 1).
transform to broadcast across the existing index.
assign the new column. Despite having reversed the series, the index will realign like magic. I could have used loc, but that overwrites the existing df. I'll let OP decide to overwrite if they want to.
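If overwriting is what you want, a minimal sketch of that variant follows. It uses plain column assignment rather than loc, which is an assumption on my part about the simplest way to do it, and it also checks the index realignment mentioned above.

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range(start="2016-01-01", end="2016-01-12", freq="D"),
    "num": [10, 12, None, None, 39, 10, 11, None, None, None, None, 60],
})

csum = df["num"].notnull()[::-1].cumsum()
filler = df["num"].fillna(0).groupby(csum).transform("mean")

# csum is stored in reverse order (index 11 down to 0), but groupby matches
# it to the data by index label, so filler keeps the original row order.
assert list(csum.index) == list(range(11, -1, -1))
assert list(filler.index) == list(range(12))

# Overwrite variant: replaces the original num column in df.
df["num"] = filler
print(df)
```

After this runs, df has only the date and num columns, with num holding the averaged values from the Desired Output.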