Pandas - Unable to calculate moving average

Question

I'm trying to follow this tutorial to calculate SMA: https://www.datacamp.com/community/tutorials/moving-averages-in-pandas

I would like to get the SMA for all values but I'm only getting 5. The frame I want values for has 17 values. If I increase the rolling window I don't get any SMA values at all. Why is that?

Thanks for any help, as I'm new to Pandas.

    import pandas as pd

    def example(self):
        frame = {'date': ['2017-06-19', '2017-06-16', '2017-06-15', '2017-06-14', '2017-06-13', '2017-06-12', '2017-06-09', '2017-06-08', '2017-06-07', '2017-06-06', '2017-06-05', '2017-06-02', '2017-06-01', '2017-05-31'], 'indexes': ['146.3400', '142.2700', '144.2900', '145.1600', '146.5900', '145.4200', '148.9800', '154.9900', '155.3700', '154.4500', '153.9300', '155.4500', '153.1800', '152.7600']}

        df = pd.DataFrame(frame)
        # The values are strings, so convert them to numbers before averaging
        # (recent pandas versions refuse to aggregate a non-numeric column)
        df['SMA'] = pd.to_numeric(df.iloc[:, 1]).rolling(window=4).mean()
        print(df.head())

Output:

         date   indexes       SMA
0  2017-06-19  146.3400       NaN
1  2017-06-16  142.2700       NaN
2  2017-06-15  144.2900       NaN
3  2017-06-14  145.1600  144.5150
4  2017-06-13  146.5900  144.5775

Answer 1

When calculating a moving average, you need n samples, where n is the size of your moving window. Since you've set window=4, 4 samples are needed before the first average can be computed. The NaN values simply show that, at those rows, there is not enough data to compute a moving average with a window size of 4.
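As a rough illustration of this point, the sketch below takes the first eight values from the question (written as floats, which is an assumption about what the asker intends) and counts the `NaN` results for two window sizes. It also checks that `head()` returns only the first five rows by default, which would explain seeing just five rows, and seeing nothing but `NaN` in them once the window is larger than five.

```python
import pandas as pd

# First eight values from the question, as floats (assumed conversion)
values = [146.34, 142.27, 144.29, 145.16, 146.59, 145.42, 148.98, 154.99]
df = pd.DataFrame({'indexes': values})

# A window of n leaves the first n - 1 rows without a full window
sma4 = df['indexes'].rolling(window=4).mean()
sma6 = df['indexes'].rolling(window=6).mean()
nan_count_4 = int(sma4.isna().sum())   # 3
nan_count_6 = int(sma6.isna().sum())   # 5

# head() returns only the first five rows by default,
# so with window=6 every SMA value it shows is NaN
head_all_nan = bool(sma6.head().isna().all())

print(nan_count_4, nan_count_6, head_all_nan)
print(len(df.head()), len(df))  # rows returned by head() vs rows in the frame
```

Printing `df` instead of `df.head()` shows every row, including the ones where the window is complete.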

Answer 2

Solution

If you must always calculate your rolling mean with a full window of 4, then you need to drop the results that are `NA`. However, if you want a value even when there are not enough observations, you could use something like `df[column_name].rolling(window=4, min_periods=1)`. But note that this is not a proper rolling mean. Here's an example.

Example

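import pandas as pd

# Dummy data (date range assumed from the output below)
dates = pd.date_range('2020-01-01', periods=6)
df = pd.DataFrame(dates, columns=['Date'])
df['Counts'] = [16,  6,  8,  5, 15,  7]

# Calculate rolling mean with min_periods=1
df['rolling_mean'] = df.Counts.rolling(window=4, min_periods=1).mean()
print(df)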

Output:

        Date  Counts  rolling_mean
0 2020-01-01      16         16.00
1 2020-01-02       6         11.00
2 2020-01-03       8         10.00
3 2020-01-04       5          8.75
4 2020-01-05      15          8.50
5 2020-01-06       7          8.75
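To see why this is not a proper rolling mean, one option is to count how many observations each window actually contains. The sketch below reuses the `Counts` values from the example; the name `n_obs` is only illustrative.

```python
import pandas as pd

counts = pd.Series([16, 6, 8, 5, 15, 7], name='Counts')

# Number of observations available in each window of size 4
n_obs = counts.rolling(window=4, min_periods=1).count()
rolling_mean = counts.rolling(window=4, min_periods=1).mean()

# The first three means are averages over 1, 2 and 3 values, not 4
print(pd.DataFrame({'n_obs': n_obs, 'rolling_mean': rolling_mean}))
```

Only the rows where `n_obs` equals the window size are averages over a full window.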

Dropping `NA` values

df.Counts.rolling(window=4).mean().dropna()

## Output
# 3    8.75
# 4    8.50
# 5    8.75
# Name: Counts, dtype: float64
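If the rolling mean is stored as a column instead, the incomplete rows can be dropped from the whole frame. This is a minimal sketch using the same `Counts` values; the name `complete` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Counts': [16, 6, 8, 5, 15, 7]})
df['rolling_mean'] = df['Counts'].rolling(window=4).mean()

# Keep only the rows whose window was complete
complete = df.dropna(subset=['rolling_mean'])
print(complete)
```

Passing `subset` limits the check to the `rolling_mean` column, so rows are not dropped because of missing values elsewhere in the frame.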

Replacing `NA` values with some preferred value

Say you want to replace all `NA` values with 0. Here's what you need to do.

df.Counts.rolling(window=4).mean().fillna(0)

## Output
# 0    0.00
# 1    0.00
# 2    0.00
# 3    8.75
# 4    8.50
# 5    8.75
# Name: Counts, dtype: float64
