Pandas - unable to calculate moving average
I'm trying to follow this tutorial to calculate SMA: https://www.datacamp.com/community/tutorials/moving-averages-in-pandas
I would like to get the SMA for all values but I'm only getting 5. I have 17 values in the frame that I want to get values for. If I increase the rolling window I am not getting any values at all for SMA, why is that?
Thanks for any help as I'm new to Pandas.
def example(self):
    frame = {'date': ['2017-06-19', '2017-06-16', '2017-06-15', '2017-06-14', '2017-06-13', '2017-06-12', '2017-06-09', '2017-06-08', '2017-06-07', '2017-06-06', '2017-06-05', '2017-06-02', '2017-06-01', '2017-05-31'], 'indexes': ['146.3400', '142.2700', '144.2900', '145.1600', '146.5900', '145.4200', '148.9800', '154.9900', '155.3700', '154.4500', '153.9300', '155.4500', '153.1800', '152.7600']}
    df = pd.DataFrame(frame)
    df['SMA'] = df.iloc[:, 1].rolling(window=4).mean()
    print(df.head())
Output:
         date   indexes       SMA
0  2017-06-19  146.3400       NaN
1  2017-06-17  142.2700       NaN
2  2017-06-16  144.2900       NaN
3  2017-06-15  145.1600  144.5150
4  2017-06-14  146.5900  144.5775
When calculating a moving average, you need n samples, where n is the size of your moving window. Since you've set window=4, four samples are needed before an average can be computed. The NaN values simply show that, at those points, there is not yet enough data to compute a moving average with a window size of 4.
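As a rough illustration of the point above, the sketch below reuses the data from the question. It assumes the string values are converted to numbers with `pd.to_numeric` first (the question stores them as strings), and the column names `SMA_4` and `SMA_6` are made up for this example. Note also that `df.head()` returns only the first five rows by default, so printing the whole frame is needed to see every SMA value; with a window of 6, the first five rows are all NaN, which would explain seeing no values at all in `head()`.

```python
import pandas as pd

# Data copied from the question; the prices are strings there.
frame = {
    'date': ['2017-06-19', '2017-06-16', '2017-06-15', '2017-06-14',
             '2017-06-13', '2017-06-12', '2017-06-09', '2017-06-08',
             '2017-06-07', '2017-06-06', '2017-06-05', '2017-06-02',
             '2017-06-01', '2017-05-31'],
    'indexes': ['146.3400', '142.2700', '144.2900', '145.1600',
                '146.5900', '145.4200', '148.9800', '154.9900',
                '155.3700', '154.4500', '153.9300', '155.4500',
                '153.1800', '152.7600'],
}
df = pd.DataFrame(frame)

# Assumption: convert the strings to floats before averaging.
df['indexes'] = pd.to_numeric(df['indexes'])

# Hypothetical column names, one per window size.
df['SMA_4'] = df['indexes'].rolling(window=4).mean()
df['SMA_6'] = df['indexes'].rolling(window=6).mean()

# Count the leading rows that lack enough samples: always window - 1.
nan_4 = int(df['SMA_4'].isna().sum())
nan_6 = int(df['SMA_6'].isna().sum())

# Print the whole frame rather than df.head(), which shows only 5 rows.
print(df)
print(nan_4, nan_6)
```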
If you must always calculate your rolling mean with a window of 4, then you need to drop the NA results. However, if you just want a rolling mean even where there are not enough observations, you could use something like df[column_name].rolling(window=4, min_periods=1). But note that this is not a proper rolling mean, because the first values are averaged over fewer than 4 observations. Here's an example.
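import pandas as pd

# Dummy data
dates = pd.date_range('2020-01-01', periods=6)  # assumed: the original snippet does not define dates
df = pd.DataFrame(dates, columns=['Date'])
df['Counts'] = [16, 6, 8, 5, 15, 7]
# Calculate rolling mean with min_periods=1
df['rolling_mean'] = df.Counts.rolling(window=4, min_periods=1).mean()
print(df)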
Output:
        Date  Counts  rolling_mean
0 2020-01-01      16         16.00
1 2020-01-02       6         11.00
2 2020-01-03       8         10.00
3 2020-01-04       5          8.75
4 2020-01-05      15          8.50
5 2020-01-06       7          8.75
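To make it concrete why the first rows are not a true 4-period mean, here is a small sketch that compares the `min_periods=1` result with the strict result on the same counts. The variable names `partial` and `strict` are just for illustration.

```python
import pandas as pd

counts = pd.Series([16, 6, 8, 5, 15, 7])

# Rolling mean that tolerates incomplete windows.
partial = counts.rolling(window=4, min_periods=1).mean()
# Rolling mean that requires a full window of 4.
strict = counts.rolling(window=4).mean()

# The early rows average only the observations seen so far.
first_row = partial.iloc[0]   # mean of [16]
second_row = partial.iloc[1]  # mean of [16, 6]
third_row = partial.iloc[2]   # mean of [16, 6, 8]

# From the fourth row on, both versions use exactly 4 observations.
same_from_full_window = (partial.iloc[3:] - strict.iloc[3:]).abs().max() < 1e-9

print(pd.DataFrame({'partial': partial, 'strict': strict}))
```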
Dropping NA values
df.Counts.rolling(window=4).mean().dropna()
## Output
# 3 8.75
# 4 8.50
# 5 8.75
# Name: Counts, dtype: float64
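If the rolling mean is stored as a column, one possible way to drop the incomplete rows from the whole frame is `DataFrame.dropna` with `subset`. This is only a sketch; the column name `SMA` is an assumption. Assigning the `dropna()` result straight back to the frame would not shorten it, because pandas realigns on the index and the missing rows become NaN again.

```python
import pandas as pd

df = pd.DataFrame({'Counts': [16, 6, 8, 5, 15, 7]})
df['SMA'] = df['Counts'].rolling(window=4).mean()

# Keep only the rows where a full window of 4 was available.
complete = df.dropna(subset=['SMA'])
print(complete)

# Assigning a dropna() result back realigns on the index,
# so the first three rows are NaN again.
df['SMA_again'] = df['SMA'].dropna()
nan_again = int(df['SMA_again'].isna().sum())
```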
Replacing NA values with some preferred value
Say you want to replace all NA values with 0. Here's what you need to do.
df.Counts.rolling(window=4).mean().fillna(0)
## Output
# 0 0.00
# 1 0.00
# 2 0.00
# 3 8.75
# 4 8.50
# 5 8.75
# Name: Counts, dtype: float64