Adding simple moving average as an additional column to python DataFrame
I have the following sales data in sales_training.csv:
time_period sales
1 127
2 253
3 123
4 253
5 157
6 105
7 244
8 157
9 130
10 221
11 132
12 265
I want to add a third column containing the moving average. My code:
import pandas as pd

df = pd.read_csv("./Sales_training.csv", index_col="time_period")
periods = df.index.tolist()
period = int(input("Enter a period for the moving average :"))
sum1 = 0
for i in periods:
    if i < period:
        df['forecast'][i] = i
    else:
        for j in range(period):
            sum1 += df['sales'][i-j]
        df['forecast'][i] = sum1/period
        sum1 = 0
print(df)
df.to_csv("./forecast_mannual.csv")
This raises KeyError: 'forecast' at the line df['forecast'][i] = i. What is the problem?
A simple solution: just add df['forecast'] = df['sales'] before the loop, so that the column exists when the loop assigns to it.
import pandas as pd

df = pd.read_csv("./Sales_training.csv", index_col="time_period")
periods = df.index.tolist()
period = int(input("Enter a period for the moving average :"))
sum1 = 0
df['forecast'] = df['sales'] # add one line
for i in periods:
    if i < period:
        df['forecast'][i] = i
    else:
        for j in range(period):
            sum1 += df['sales'][i-j]
        df['forecast'][i] = sum1/period
        sum1 = 0
print(df)
df.to_csv("./forecast_mannual.csv")
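As an alternative to the explicit loop, pandas also has a built-in rolling window. The following is only a minimal sketch: it assumes the sales figures from the question, built inline instead of being read from the CSV, and a fixed period of 3 instead of a value read from input().

```python
import pandas as pd

# Sales figures copied from the question; built inline so the sketch runs without the CSV.
sales = [127, 253, 123, 253, 157, 105, 244, 157, 130, 221, 132, 265]
df = pd.DataFrame({"sales": sales}, index=pd.RangeIndex(1, 13, name="time_period"))

period = 3  # assumed window length; the question reads it from input()

# rolling(window=period).mean() averages each row with the (period - 1) rows before it.
# Rows that do not yet have a full window are left as NaN.
df["forecast"] = df["sales"].rolling(window=period).mean()

print(df)
```

With this approach the first period - 1 rows hold NaN rather than a placeholder value, which may or may not be what you want for a forecast column.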
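Your code raises a KeyError because of the way it references values in the 'forecast' column. The 'forecast' column has not been created yet when the code first runs, so when the first iteration tries to look up df['forecast'], it raises the KeyError.
Here, the task is to update values in a new column called 'forecast' that is created on the fly. So you can write df.at[i, 'forecast'] instead of df['forecast'][i].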
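The difference between the two forms can be seen in a small sketch. The three-row frame below is an assumption for illustration (the first three sales values from the question), not the full data set.

```python
import pandas as pd

# First three sales values from the question, used only for illustration.
df = pd.DataFrame({"sales": [127, 253, 123]}, index=pd.RangeIndex(1, 4, name="time_period"))

# df['forecast'][1] = ... must first read df['forecast'];
# the column does not exist yet, so that read raises KeyError.
try:
    df["forecast"][1] = 0
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)

# df.at[row_label, column_label] assigns in one step and creates the column if it is missing.
# Rows that were not assigned are filled with NaN.
df.at[1, "forecast"] = 127.0
print(df)
```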
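There is another problem in the code. When i is less than period, you assign i to the forecast, which I think is incorrect; in that case it should not show anything.
Here is my corrected version of the code: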
import pandas as pd

df = pd.read_csv("./sales.csv", index_col="time_period")
periods = df.index.tolist()
period = int(input("Enter a period for the moving average :"))
sum1 = 0
for i in periods:
    print(i)
    if i < period:
        # Not enough earlier rows for a full window: leave the cell blank
        df.at[i, 'forecast'] = ''
    else:
        # Sum the current row and the (period - 1) rows before it
        for j in range(period):
            sum1 += df['sales'][i-j]
        df.at[i, 'forecast'] = sum1/period
        sum1 = 0
print(df)
df.to_csv("./forecast_mannual.csv")
Output when I enter period = 2 to compute the moving average:
Hope this helps.
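One point to weigh when leaving the first rows empty: an empty string turns the column into a non-numeric one, while NaN keeps it numeric. A small sketch, assuming the two-period averages of the first four sales values from the question (127, 253, 123, 253):

```python
import numpy as np
import pandas as pd

# Two-period averages of 127, 253, 123, 253: (127+253)/2, (253+123)/2, (123+253)/2.
averages = [190.0, 188.0, 188.0]

blank_first = pd.Series([""] + averages)    # empty string for the row without a full window
nan_first = pd.Series([np.nan] + averages)  # NaN for the same row

print(blank_first.dtype)  # object: the column is no longer numeric
print(nan_first.dtype)    # float64
print(nan_first.mean())   # NaN is skipped by default
```

When the frame is written with to_csv, NaN is written as an empty field by default, so the CSV output looks the same either way.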