Python pandas moving average lag

Question

Consider the following Python program:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = [["2017-05-25 22:00:00", 5],
["2017-05-25 22:05:00", 7],
["2017-05-25 22:10:00", 9],
["2017-05-25 22:15:00", 10],
["2017-05-25 22:20:00", 15],
["2017-05-25 22:25:00", 20],
["2017-05-25 22:30:00", 25],
["2017-05-25 22:35:00", 32]]

df = pd.DataFrame(data)
df.columns = ["date", "value"]
df["date2"] = pd.to_datetime(df["date"],format="%Y-%m-%d %H:%M:%S")

ts = pd.Series(df["value"].values, index=df["date2"])
mean_smoothed = ts.rolling(window=5).mean()
exp_smoothed = ts.ewm(alpha=0.5).mean()

h1 = ts.head(8)
h2 = mean_smoothed.head(8)
h3 = exp_smoothed.head(8)
k = pd.concat([h1, h2, h3], join='outer', axis=1)
k.columns = ["Actual", "Moving Average", "Exp Smoothing"]
print(k)

This prints:

                     Actual  Moving Average  Exp Smoothing
date2                                                     
2017-05-25 22:00:00       5             NaN       5.000000
2017-05-25 22:05:00       7             NaN       6.333333
2017-05-25 22:10:00       9             NaN       7.857143
2017-05-25 22:15:00      10             NaN       9.000000
2017-05-25 22:20:00      15             9.2      12.096774
2017-05-25 22:25:00      20            12.2      16.111111
2017-05-25 22:30:00      25            15.8      20.590551
2017-05-25 22:35:00      32            20.4      26.317647

Drawing a graph:

plt.figure(figsize=(16,5))
plt.plot(ts, label="Original")
plt.plot(mean_smoothed, label="Moving Average")
plt.plot(exp_smoothed, label="Exponentially Weighted Average")
plt.legend()
plt.show()

Both moving average (MA) and exponential smoothing (ES) introduce a lag. In the above example, MA needs 5 values to make a prediction of what the 6th value will be. If you look at the table, however, there are only 4 NaN values in the MA column, and the 5th value is already a non-NaN value (= the first prediction).

Question: How do I draw these values in a graph such that the lag is correctly preserved? Looking at ES, it is actually a bit more obvious: ES should start at t=2 but starts immediately.

Answer 1

You seem to have misunderstood moving averages. An MA(5) needs 5 data points to calculate. Once you receive the 5th point, an average can be calculated for the 5th point using points 1-5. Therefore you should only have 4 NaNs.
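The point about 4 NaNs can be checked directly. The following is a minimal sketch that rebuilds the question's eight values with pd.date_range (a shortcut assumed here instead of the original list of date strings) and compares the 5th rolling value with a hand-computed mean of points 1-5:

```python
import pandas as pd

# Same eight values as in the question, on a 5-minute grid.
idx = pd.date_range("2017-05-25 22:00:00", periods=8, freq="5min")
ts = pd.Series([5, 7, 9, 10, 15, 20, 25, 32], index=idx)

ma = ts.rolling(window=5).mean()

# The window is trailing: the value at position 4 (the 5th point)
# is the mean of positions 0..4, so only the first 4 entries are NaN.
nan_count = int(ma.isna().sum())
fifth_by_hand = ts.iloc[:5].mean()

print(nan_count)
print(ma.iloc[4], fifth_by_hand)
```

Both printed averages should agree, which shows that the rolling mean labels each window with its last timestamp rather than with the next one.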

If you want to shift your data, you can try:

df.shift(n) # n is an integer

Either shift Actual by -1, or shift the smoothed series by 1.

See the pandas documentation for shift for details.
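As a rough illustration of the second option, the sketch below shifts only the two smoothed series by one row, so each smoothed value lines up with the timestamp it would be predicting. The variable names ma and es are chosen here for brevity and are not from the question:

```python
import pandas as pd

idx = pd.date_range("2017-05-25 22:00:00", periods=8, freq="5min")
ts = pd.Series([5, 7, 9, 10, 15, 20, 25, 32], index=idx)

ma = ts.rolling(window=5).mean()
es = ts.ewm(alpha=0.5).mean()

# shift(1) moves each value one row down, so the average built from
# points 1-5 lines up with the 6th timestamp (read as a one-step forecast).
k = pd.concat([ts, ma.shift(1), es.shift(1)], axis=1)
k.columns = ["Actual", "Moving Average", "Exp Smoothing"]
print(k)
```

With this layout the MA column has 5 leading NaNs and the ES column starts at the second timestamp, which matches the lag described in the question. Note that the last smoothed value falls off the end, because there is no 9th timestamp to attach it to. Calling k.plot() on the result should draw the shifted curves.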

Answer 2

Interpolation should fix the issue.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = [["2017-05-25 22:00:00", 5],
["2017-05-25 22:05:00", 7],
["2017-05-25 22:10:00", 9],
["2017-05-25 22:15:00", 10],
["2017-05-25 22:20:00", 15],
["2017-05-25 22:25:00", 20],
["2017-05-25 22:30:00", 25],
["2017-05-25 22:35:00", 32]]

df = pd.DataFrame(data)
df.columns = ["date", "value"]
df["date2"] = pd.to_datetime(df["date"],format="%Y-%m-%d %H:%M:%S")

ts = pd.Series(df["value"].values, index=df["date2"])
mean_smoothed = ts.rolling(window=5).mean()
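###### NEW #########
# Use .iloc for positional access: recent pandas versions deprecate
# integer [0] as a positional lookup on a datetime-indexed Series.
mean_smoothed.iloc[0] = ts.iloc[0]
# Fill the remaining leading NaNs by linear interpolation.
mean_smoothed.interpolate(inplace=True)
####################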
exp_smoothed = ts.ewm(alpha=0.5).mean()

h1 = ts.head(8)
h2 = mean_smoothed.head(8)
h3 = exp_smoothed.head(8)
k = pd.concat([h1, h2, h3], join='outer', axis=1)
k.columns = ["Actual", "Moving Average", "Exp Smoothing"]
print(k)


plt.figure(figsize=(16,5))
plt.plot(ts, label="Original")
plt.plot(mean_smoothed, label="Moving Average")
plt.plot(exp_smoothed, label="Exponentially Weighted Average")
plt.legend()
plt.show()
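To make clear what the interpolation step puts into the first four slots, here is a small sketch under the same assumptions as the program above (the name filled is introduced here for illustration only):

```python
import pandas as pd

idx = pd.date_range("2017-05-25 22:00:00", periods=8, freq="5min")
ts = pd.Series([5, 7, 9, 10, 15, 20, 25, 32], index=idx)

ma = ts.rolling(window=5).mean()
ma.iloc[0] = ts.iloc[0]    # anchor the start at the first observation
filled = ma.interpolate()  # default method is linear

print(filled.head(5))
```

The filled values lie on a straight line between the first observation (5) and the first real average (9.2). They are not moving averages themselves, so this removes the gap at the start of the plotted curve but does not shift the curve to show the lag.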

Answer 1
0 2017-08-26 08:54:21

Answer 2
0 2017-08-27 10:01:54
