
Possible bug in pandas rolling mean when window = 1

To keep the notation in my code generic, I want to express my original time series as a moving average over 1 period. Quite unexpectedly, when I use pandas' pd.rolling_mean function, the two are not exactly the same:

import pandas as pd
import numpy as np

np.random.seed(1)

ts = pd.Series(np.random.rand(1000))

mavg = pd.rolling_mean(ts, 1)

(ts - mavg).describe()
Out[120]: 
count    1.000000e+03
mean     6.284973e-16
std      3.877250e-16
min     -3.330669e-16
25%      3.330669e-16
50%      5.551115e-16
75%      8.881784e-16
max      1.554312e-15
dtype: float64

any((ts - mavg).dropna()>0)
Out[121]: True

Should this be considered a bug or am I missing something?

The differences are very small and well within the range of numerical "noise" caused by how floats work. Floats cannot represent all numbers exactly, so calculations with them often leave small "residuals" behind. Instead of testing for exact equality, check against a small epsilon:

>>> any((ts - mavg).dropna().abs() > 1e-14)
False
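The same tolerance check can be written with NumPy's `np.allclose`. This is a minimal sketch, assuming a newer pandas where `pd.rolling_mean` is no longer available and `Series.rolling(...).mean()` is used instead. The `atol=1e-14` value simply mirrors the epsilon above and is not a recommended universal threshold.

```python
import numpy as np
import pandas as pd

np.random.seed(1)
ts = pd.Series(np.random.rand(1000))

# Series.rolling(...).mean() is the newer spelling of pd.rolling_mean(ts, 1)
mavg = ts.rolling(window=1).mean()

# Compare with a tolerance instead of exact equality
print(np.allclose(ts, mavg))                      # NumPy's default tolerances
print(np.allclose(ts, mavg, rtol=0, atol=1e-14))  # explicit absolute epsilon
```

Depending on the pandas version, the residuals may be exactly zero or a few multiples of machine epsilon. The tolerance comparison gives the same answer in both cases.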

The difference comes from floating-point arithmetic. Because of how floats are represented internally, the result of a calculation is not always exactly the value you would expect mathematically. Within these "rounding errors" your numbers are identical.
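One way such residuals can arise even for a window of 1 is a rolling mean that keeps a running sum, adding each new value and subtracting the one that leaves the window. The sketch below is a hypothetical pure-Python illustration of that technique. `running_sum_mean` is a made-up helper, and whether pandas computes its result this way is an assumption here, not something confirmed in this thread.

```python
def running_sum_mean(values, window):
    """Rolling mean via a running sum: add the new value, subtract the expired one."""
    out = []
    total = 0.0
    for i, x in enumerate(values):
        total += x                        # new value enters the window
        if i >= window:
            total -= values[i - window]   # old value leaves the window
        # Emit NaN until the first full window is available
        out.append(total / window if i >= window - 1 else float("nan"))
    return out

data = [0.1, 0.2, 0.3]
result = running_sum_mean(data, 1)

# Mathematically result == data, but (0.1 + 0.2) - 0.1 is not exactly 0.2 in floats
residuals = [r - x for r, x in zip(result, data)]
print(residuals)
```

The first element is reproduced exactly. Later elements can be off by an amount on the order of machine epsilon, because the intermediate sum has already been rounded before the old value is subtracted.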

