
Pandas: Rolling Mean and ignore NaN

How do you tell pandas to ignore NaN values when calculating a rolling mean? With min_periods set, a single NaN makes pandas return NaN for every window that contains it, so min_periods consecutive results are lost.

Example:

df = pd.DataFrame({'x': [np.nan, 0, 1, 2, 3, np.nan, 5, 6, 7, 8, 9]})
df.rolling(3, min_periods=3).mean()

Result:

0    NaN
1    NaN
2    NaN
3    1.0
4    2.0
5    NaN
6    NaN
7    NaN
8    6.0
9    7.0
10   8.0

Desired Result:

0    NaN
1    NaN
2    NaN
3    1.0
4    2.0
5    2.0
6    3.3
7    4.7
8    6.0
9    7.0
10   8.0

Drop the NaN values first, then take the rolling mean, so that each window only ever sees valid numbers. Afterwards, reindex with the original index and forward fill, so the rows that held NaN take the most recent computed mean.

df.x.dropna().rolling(3).mean().reindex(df.index, method='pad')

0          NaN
1          NaN
2          NaN
3     1.000000
4     2.000000
5     2.000000
6     3.333333
7     4.666667
8     6.000000
9     7.000000
10    8.000000
Name: x, dtype: float64
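
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The helper name rolling_mean_skip_nan is made up for illustration and is not part of pandas. It also assumes the index is monotonically increasing, since reindex with a fill method requires that.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [np.nan, 0, 1, 2, 3, np.nan, 5, 6, 7, 8, 9]})


def rolling_mean_skip_nan(s, window):
    """Hypothetical helper: rolling mean over the non-NaN values only."""
    # Drop NaN so each window is built from `window` valid observations.
    valid = s.dropna()
    rolled = valid.rolling(window).mean()
    # Restore the original labels; a dropped label takes the last computed mean.
    # Assumption: s.index is monotonically increasing.
    return rolled.reindex(s.index, method="ffill")


result = rolling_mean_skip_nan(df["x"], 3)
print(result)
```

Note that the forward fill only applies to labels that dropna removed. The leading NaN results, where fewer than 3 valid values have been seen, stay NaN.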
