
Adding simple moving average as an additional column to python DataFrame

I have sales data in sales_training.csv that looks like this:

time_period sales
1   127
2   253
3   123
4   253
5   157
6   105
7   244
8   157
9   130
10  221
11  132
12  265

I want to add a third column that contains the moving average. My code:

import pandas as pd
df = pd.read_csv("./Sales_training.csv", index_col="time_period")
periods = df.index.tolist()
period = int(input("Enter a period for the moving average :"))
sum1 = 0
for i in periods:
    if i < period:
        df['forecast'][i] = i
    else:
        for j in range(period):
            sum1 += df['sales'][i-j]
        df['forecast'][i] = sum1/period
        sum1 = 0
print(df)
df.to_csv("./forecast_mannual.csv")

This gives KeyError: 'forecast' at the line df['forecast'][i] = i. What is the issue?

One simple fix is to create the column before the loop, for example with df['forecast'] = df['sales']:

import pandas as pd
df = pd.read_csv("./Sales_training.csv", index_col="time_period")
periods = df.index.tolist()
period = int(input("Enter a period for the moving average :"))
sum1 = 0
df['forecast'] = df['sales']  # added line: create the column before the loop
for i in periods:
    if i < period:
        df['forecast'][i] = i
    else:
        for j in range(period):
            sum1 += df['sales'][i-j]
        df['forecast'][i] = sum1/period
        sum1 = 0
print(df)
df.to_csv("./forecast_mannual.csv")

Your code raises a KeyError because of the way it references the 'forecast' column. On the first iteration the 'forecast' column has not been created yet, so the lookup df['forecast'] fails with a KeyError before any value can be assigned.
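To make the failure concrete, here is a minimal sketch. It builds a small DataFrame in memory from the first four rows of the question's data instead of reading the CSV (an assumption made so the snippet is self-contained), and shows that reading a missing column raises KeyError while assigning the whole column creates it.

```python
import pandas as pd

# First four rows of the question's data, built in memory for illustration.
df = pd.DataFrame(
    {"sales": [127, 253, 123, 253]},
    index=pd.Index([1, 2, 3, 4], name="time_period"),
)

# df['forecast'][i] = ... first evaluates df['forecast'], which is a read.
# Reading a column that does not exist raises KeyError.
try:
    df["forecast"]
    raised = False
except KeyError:
    raised = True

# Assigning to the whole column creates it, after which lookups succeed.
df["forecast"] = float("nan")
column_exists = "forecast" in df.columns
```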

Here, our task is to update values in a dynamically created new column called 'forecast'. Therefore, instead of df['forecast'][i] you can write df.at[i, 'forecast'].
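A small sketch of the df.at form, again using an in-memory copy of the first four rows rather than the CSV file. Creating the column with NaN up front is my own choice for the illustration, not something the original code does.

```python
import pandas as pd

df = pd.DataFrame(
    {"sales": [127, 253, 123, 253]},
    index=pd.Index([1, 2, 3, 4], name="time_period"),
)

# Assumption for this sketch: create the column first so every row starts as missing.
df["forecast"] = float("nan")

# df.at[row_label, column_label] sets a single cell on the DataFrame itself,
# avoiding chained indexing such as df['forecast'][i] = ...
df.at[2, "forecast"] = (df.at[1, "sales"] + df.at[2, "sales"]) / 2
```

With a period of 2, the cell for time_period 2 holds the mean of the first two sales values, and time_period 1 stays missing.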

There is another issue in the code. When the value of i is less than period, you assign i itself to forecast, which to my understanding is not correct. Nothing should be displayed in that case, because there are not yet enough rows to fill a full window.
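As an alternative to the manual loop, pandas has a built-in rolling window that behaves this way by default: rows without a full window come out as NaN. This is a different technique from the loop in this answer, shown only as a sketch. The data is the question's table typed in by hand, and period = 3 is an arbitrary value chosen for the example.

```python
import pandas as pd

# Sales values from the question, indexed by time_period 1..12.
sales = [127, 253, 123, 253, 157, 105, 244, 157, 130, 221, 132, 265]
df = pd.DataFrame(
    {"sales": sales},
    index=pd.RangeIndex(1, 13, name="time_period"),
)

period = 3  # example window length; the original code reads this from input()

# Mean of the current row and the (period - 1) rows before it.
# The first (period - 1) rows have no full window, so they are NaN.
df["forecast"] = df["sales"].rolling(window=period).mean()
```

Because the result is computed as a whole column and assigned in one step, the 'forecast' column never has to exist beforehand, so the KeyError cannot occur.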

Here is my corrected version of the code:

import pandas as pd
df = pd.read_csv("./sales.csv", index_col="time_period")
periods = df.index.tolist()
period = int(input("Enter a period for the moving average :"))
sum1 = 0
for i in periods:
    print(i)
    if i < period:
        # not enough earlier rows for a full window: leave the cell blank
        df.at[i, 'forecast'] = ''
    else:
        # sum the current row and the (period - 1) rows before it
        for j in range(period):
            sum1 += df['sales'][i-j]
        df.at[i, 'forecast'] = sum1/period
        sum1 = 0
print(df)
df.to_csv("./forecast_mannual.csv")

Output when I entered period=2 to calculate the moving average: (screenshot not available)

Hope this helps.
