
Adding simple moving average as an additional column to python DataFrame

I have sales data in sales_training.csv that looks like this -

time_period sales
1   127
2   253
3   123
4   253
5   157
6   105
7   244
8   157
9   130
10  221
11  132
12  265

I want to add a third column that contains the moving average. My code -

import pandas as pd
df = pd.read_csv("./Sales_training.csv", index_col="time_period")
periods = df.index.tolist()
period = int(input("Enter a period for the moving average :"))
sum1 = 0
for i in periods:
    if i < period:
        df['forecast'][i] = i
    else:
        for j in range(period):
            sum1 += df['sales'][i-j]
        df['forecast'][i] = sum1/period
        sum1 = 0
print(df)
df.to_csv("./forecast_mannual.csv")

This raises KeyError: 'forecast' at the line df['forecast'][i] = i. What is the issue?

One simple fix is to create the 'forecast' column before the loop, for example by copying 'sales' into it:

import pandas as pd
df = pd.read_csv("./Sales_training.csv", index_col="time_period")
periods = df.index.tolist()
period = int(input("Enter a period for the moving average :"))
sum1 = 0
# Added line: create the column before the loop (as float, since averages are fractional)
df['forecast'] = df['sales'].astype(float)
for i in periods:
    if i < period:
        # .loc writes into df itself instead of going through chained indexing
        df.loc[i, 'forecast'] = i
    else:
        for j in range(period):
            sum1 += df['sales'][i-j]
        df.loc[i, 'forecast'] = sum1/period
        sum1 = 0
print(df)
df.to_csv("./forecast_mannual.csv")

Your code raises a KeyError because of the way it references the 'forecast' column. The statement df['forecast'][i] = i first reads df['forecast'] and only then assigns into the result. On the first iteration the 'forecast' column has not been created yet, so the read fails with a KeyError.
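To make the failure concrete, here is a minimal sketch. It uses a small hand-built DataFrame (the first four rows of the question's table) as a stand-in for the CSV, so the variable names `raised` and `created` are only illustrative:

```python
import pandas as pd

# Stand-in for the CSV: first four rows of the question's data.
df = pd.DataFrame(
    {"sales": [127, 253, 123, 253]},
    index=pd.Index([1, 2, 3, 4], name="time_period"),
)

# df["forecast"] is evaluated first; the column does not exist, so this raises.
try:
    df["forecast"][1] = 1
    raised = False
except KeyError:
    raised = True

# Assigning to the whole column, by contrast, creates it.
df["forecast"] = float("nan")
created = "forecast" in df.columns

print(raised, created)
```

The read on the left-hand side is what fails, not the assignment itself.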

Here, the task is to set values in a new column, 'forecast', that is created on the fly. So instead of df['forecast'][i] you can write df.at[i,'forecast'], which creates the column on the first assignment.
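As a rough illustration of that behaviour, the sketch below assigns a single cell of a column that does not exist yet. The three-row DataFrame is a made-up stand-in for the CSV, and the value 190.0 is simply the mean of the first two sales figures:

```python
import pandas as pd

# Stand-in for the CSV: first three rows of the question's data.
df = pd.DataFrame(
    {"sales": [127, 253, 123]},
    index=pd.Index([1, 2, 3], name="time_period"),
)

# Setting one cell of a missing column creates the column;
# rows that were not assigned are left as NaN.
df.at[2, "forecast"] = (127 + 253) / 2

print(df)
```

df.loc[i, 'forecast'] = value should behave the same way here; df.at is just the scalar-only accessor.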

There is another issue in the code. When i is less than period, you assign i itself to 'forecast', which to my understanding is not correct. There are not yet enough rows for a full window, so nothing should be displayed in that case.

Here is my version of corrected code:

import pandas as pd
df = pd.read_csv("./sales.csv", index_col="time_period")
periods = df.index.tolist()
period = int(input("Enter a period for the moving average :"))
sum1 = 0
for i in periods:
    print(i)
    if i < period:
        # not enough rows for a full window: leave the cell blank
        df.at[i,'forecast'] = ''
    else:
        # sum the current row and the (period - 1) rows before it
        for j in range(period):
            sum1 += df['sales'][i-j]
        df.at[i,'forecast'] = sum1/period
        sum1 = 0
print(df)
df.to_csv("./forecast_mannual.csv")

Output when I entered period=2 to calculate moving average:

[image of the resulting DataFrame; not preserved]
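For comparison, pandas also has a built-in rolling window that gives the same trailing average without an explicit loop. This is an alternative to the loop above, not what the original code does. The sketch builds the DataFrame from the question's twelve values instead of reading the CSV, and hard-codes period = 3 as an assumption in place of input():

```python
import pandas as pd

# The twelve sales values from the question, indexed by time_period 1..12.
sales = [127, 253, 123, 253, 157, 105, 244, 157, 130, 221, 132, 265]
df = pd.DataFrame(
    {"sales": sales},
    index=pd.Index(range(1, 13), name="time_period"),
)

period = 3  # assumed window length; the original code reads this from input()

# Mean of the current row and the (period - 1) rows before it.
# Rows without a full window are left as NaN.
df["forecast"] = df["sales"].rolling(window=period).mean()

print(df)
```

The first period - 1 rows come out as NaN, which matches the point above that nothing should be shown until a full window is available, and to_csv writes NaN as an empty field by default.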

Hope this helps.
