Pandas - unable to calculate moving average

I'm trying to follow this tutorial to calculate SMA: https://www.datacamp.com/community/tutorials/moving-averages-in-pandas

I would like to get the SMA for all values, but I'm only getting 5. I have 17 values in the frame that I want to get values for. If I increase the rolling window, I don't get any SMA values at all. Why is that?

Thanks for any help, as I'm new to Pandas.

    import pandas as pd

    def example(self):
        frame = {'date': ['2017-06-19', '2017-06-16', '2017-06-15', '2017-06-14', '2017-06-13', '2017-06-12', '2017-06-09', '2017-06-08', '2017-06-07', '2017-06-06', '2017-06-05', '2017-06-02', '2017-06-01', '2017-05-31'], 'indexes': ['146.3400', '142.2700', '144.2900', '145.1600', '146.5900', '145.4200', '148.9800', '154.9900', '155.3700', '154.4500', '153.9300', '155.4500', '153.1800', '152.7600']}

        df = pd.DataFrame(frame)
        # The values are stored as strings, so make the numeric dtype explicit
        df['indexes'] = pd.to_numeric(df['indexes'])
        df['SMA'] = df.iloc[:, 1].rolling(window=4).mean()
        print(df.head())

Output:

         date   indexes     SMA
0  2017-06-19  146.3400       NaN
1  2017-06-17  142.2700       NaN
2  2017-06-16  144.2900       NaN
3  2017-06-15  145.1600  144.5150
4  2017-06-14  146.5900  144.5775

A moving average over a window of n needs n samples before it can produce a value. Since you've set window=4, each average needs 4 samples. The NaN values in the first three rows simply show that, at those points, there is not yet enough data to compute a moving average with a window size of 4.
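To make this concrete, here is a minimal sketch using the closing values from the question, entered as floats rather than strings. It counts how many NaN values each window size produces and how many computed values fall within the first five rows, which is all that `df.head()` prints by default. The names `prices`, `indexes` and `summary` are illustrative and not part of the original code.

```python
import pandas as pd

# Closing values from the question, written as floats (the question stores them as strings).
prices = [146.34, 142.27, 144.29, 145.16, 146.59, 145.42, 148.98,
          154.99, 155.37, 154.45, 153.93, 155.45, 153.18, 152.76]
indexes = pd.Series(prices, name="indexes")

summary = {}
for window in (4, 6):
    sma = indexes.rolling(window=window).mean()
    summary[window] = {
        # The first (window - 1) rows have too few samples and stay NaN.
        "nan_rows": int(sma.isna().sum()),
        # head() returns only the first five rows by default.
        "values_in_head": int(sma.head().notna().sum()),
        "values_total": int(sma.notna().sum()),
    }
    print(window, summary[window])
```

With a window of 4, only two of the five rows shown by `head()` carry a value; with a window of 6, none do, even though later rows are populated. Printing the whole frame with `print(df)` shows the remaining values.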

Solution

If you must always calculate your rolling mean with a window of 4, then you need to drop the NaN results. However, if you just want a rolling mean even where there are not yet enough observations, you could use something like df[column_name].rolling(window=4, min_periods=1). But note that this is not a proper rolling mean, because the first values are averaged over fewer than 4 observations. Here's an example.

Example

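import pandas as pd

# Dummy data (dates is assumed to be six consecutive days, matching the output below)
dates = pd.date_range('2020-01-01', periods=6)
df = pd.DataFrame(dates, columns=['Date'])
df['Counts'] = [16,  6,  8,  5, 15,  7]

# Calculate rolling mean with min_periods=1
df['rolling_mean'] = df.Counts.rolling(window=4, min_periods=1).mean()
print(df)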

Output:

        Date  Counts  rolling_mean
0 2020-01-01      16         16.00
1 2020-01-02       6         11.00
2 2020-01-03       8         10.00
3 2020-01-04       5          8.75
4 2020-01-05      15          8.50
5 2020-01-06       7          8.75
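One way to see why the first rows are not a proper 4-period mean is to show how many observations went into each value. The sketch below does this with `rolling(...).count()` on the same `Counts` data; the `n_obs` column name is made up for illustration.

```python
import pandas as pd

counts = pd.Series([16, 6, 8, 5, 15, 7], name="Counts")
rolling = counts.rolling(window=4, min_periods=1)

result = pd.DataFrame({
    "Counts": counts,
    "rolling_mean": rolling.mean(),
    # Number of observations actually available in each window.
    "n_obs": rolling.count().astype(int),
})
print(result)
```

Rows where `n_obs` is below 4 are averages over a shorter window, so they are not directly comparable with the later rows.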

Dropping NA values

df.Counts.rolling(window=4).mean().dropna()

## Output
# 3    8.75
# 4    8.50
# 5    8.75
# Name: Counts, dtype: float64
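If the rolling mean is stored as a column and the incomplete rows should be removed from the whole frame rather than only from the resulting Series, one option is `DataFrame.dropna` with `subset`. This is a sketch under the assumption that the dates are six consecutive days starting 2020-01-01, as in the example output; the name `complete` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.date_range("2020-01-01", periods=6),  # assumed dates
    "Counts": [16, 6, 8, 5, 15, 7],
})
df["rolling_mean"] = df["Counts"].rolling(window=4).mean()

# Keep only the rows where a full 4-observation window was available.
complete = df.dropna(subset=["rolling_mean"])
print(complete)
```

The original index labels (3, 4, 5) are preserved, so the remaining rows still line up with their dates.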

Replacing NA values with some preferred value

Say you want to replace all NA values with 0. Here's what you need to do.

df.Counts.rolling(window=4).mean().fillna(0)

## Output
# 0    0.00
# 1    0.00
# 2    0.00
# 3    8.75
# 4    8.50
# 5    8.75
# Name: Counts, dtype: float64
