Forward fill a pandas column not with the last value, but with the mean over the non-null and null elements

Question

I run into this a lot when modeling time series. Sometimes you may have data reported at different frequencies, say one daily and one weekly. What I'd like is not to forward fill the weekly data point for every day of the week (since it is usually already a sum of all the values during the week), but to forward fill or replace the data with its mean. In essence, I'd like to spread out the data.

So if I have

import numpy as np
import pandas as pd
s = pd.Series(index=pd.date_range('2015/1/1', '2015/1/9'),
              data=[2, np.nan, 6, np.nan, np.nan, 2, np.nan, np.nan, np.nan])

then I'd like to return

2015-01-01     1
2015-01-02     1
2015-01-03     2
2015-01-04     2
2015-01-05     2
2015-01-06   0.5
2015-01-07   0.5
2015-01-08   0.5
2015-01-09   0.5
Freq: D, dtype: float64

Any thoughts on an easy way to do this? Is a for-loop inescapable?

Answer 1

Here is one way, using .cumsum on the non-null mask to separate the series into groups (each non-null value starts a new group) and then transform to divide each forward-filled value by the size of its group.

s.ffill().groupby(s.notnull().cumsum()).transform(lambda g: g / len(g))

2015-01-01    1.0
2015-01-02    1.0
2015-01-03    2.0
2015-01-04    2.0
2015-01-05    2.0
2015-01-06    0.5
2015-01-07    0.5
2015-01-08    0.5
2015-01-09    0.5
Freq: D, dtype: float64
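To see why the one-liner works, it may help to break it into its intermediate steps. The following is only a sketch of the same idea, not a different method; the names group_id, filled, group_size and spread are made up for illustration, and it uses transform("size") in place of the lambda.

```python
import numpy as np
import pandas as pd

s = pd.Series(
    [2, np.nan, 6, np.nan, np.nan, 2, np.nan, np.nan, np.nan],
    index=pd.date_range("2015-01-01", "2015-01-09"),
)

# Each non-null value starts a new group; the NaNs that follow share its label.
group_id = s.notnull().cumsum()

# Carry each reported value forward across its group.
filled = s.ffill()

# Number of rows in each group, broadcast back onto the original index.
group_size = filled.groupby(group_id).transform("size")

# Spread each reported value evenly over the rows it covers.
spread = filled / group_size
print(spread)
```

Because every reported value is divided by the number of rows it covers, the total is preserved: spread.sum() equals s.sum(). One caveat: if the series starts with NaN, those leading rows have nothing to forward fill from and stay NaN.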

Solution 1
3 (accepted) 2015-08-12 18:52:43