I have sales data till Jul-2020 and want to predict the next 3 months using a recovery rate. This is the dataframe:
test = pd.DataFrame({'Country':['USA','USA','USA','USA','USA'],
'Month':[6,7,8,9,10],
'Sales':[100,200,0,0,0],
'Recovery':[0,1,1.5,2.5,3]
})
Now I want to add a "Predicted" column. Its first value, 300 at row 3, is (200 * 1.5/1). That becomes the base for the next row, so the next value, 500, is (300 * 2.5/1.5), and so on. How do I iterate over every row, starting from row 3? I tried using shift() but couldn't iterate over the rows.
You could do it like this:
import pandas as pd
test = pd.DataFrame({'Country':['USA','USA','USA','USA','USA'],
'Month':[6,7,8,9,10],
'Sales':[100,200,0,0,0],
'Recovery':[0,1,1.5,2.5,3]
})
# start from the actual sales; cast to float so fractional predictions fit the column
test['Prediction'] = test['Sales'].astype(float)
for i in range(1, len(test)):
    # prevent division by zero
    if test.loc[i-1, 'Recovery'] != 0:
        test.loc[i, 'Prediction'] = test.loc[i-1, 'Prediction'] * test.loc[i, 'Recovery'] / test.loc[i-1, 'Recovery']
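Note that the loop addresses rows with .loc[i], which only works while the dataframe keeps its default 0..n-1 index. A minimal sketch of the same recurrence done positionally on NumPy arrays is shown here; the helper name add_prediction and the letter index are made up for illustration and are not part of the original answer.

```python
import pandas as pd


def add_prediction(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: same recurrence as the loop, applied by position."""
    recovery = df["Recovery"].to_numpy(dtype=float)
    prediction = df["Sales"].to_numpy(dtype=float).copy()
    for i in range(1, len(df)):
        # Skip rows whose previous recovery rate is zero (avoids division by zero).
        if recovery[i - 1] != 0:
            prediction[i] = prediction[i - 1] * recovery[i] / recovery[i - 1]
    # assign() returns a new dataframe and leaves the input untouched.
    return df.assign(Prediction=prediction)


test = pd.DataFrame(
    {
        "Country": ["USA"] * 5,
        "Month": [6, 7, 8, 9, 10],
        "Sales": [100, 200, 0, 0, 0],
        "Recovery": [0, 1, 1.5, 2.5, 3],
    },
    index=list("abcde"),  # non-default index, chosen only to show it still works
)

result = add_prediction(test)
print(result)
```

Because the arithmetic runs on plain arrays, the index labels never enter the calculation, so the helper also works after filtering or sorting the dataframe.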
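Your sequence is simply Recovery multiplied by the base level (the last actual Sales value, 200). The ratios in the recurrence cancel from row to row, and because Recovery is 1 in the base row, each prediction equals 200 * Recovery.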
You can compute that sequence like this:
valid_sales = test.Sales > 0
prediction = (test.Recovery * test.Sales[valid_sales].iloc[-1]).rename("Predicted")
Then attach it to the dataframe, either by inserting it as a column or with concat (both align on the index):
pd.concat([test, prediction], axis=1)
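The shortcut works here because Recovery happens to be 1 in the base row, and it overwrites the observed months as well (row 0 becomes 0 rather than 100). A hedged sketch of a slightly more general variant follows: it divides by the base row's Recovery and keeps the actual Sales wherever they exist. It assumes the base row's Recovery is non-zero and that every zero in Sales means "not yet observed"; the variable names are illustrative only.

```python
import pandas as pd

test = pd.DataFrame(
    {
        "Country": ["USA", "USA", "USA", "USA", "USA"],
        "Month": [6, 7, 8, 9, 10],
        "Sales": [100, 200, 0, 0, 0],
        "Recovery": [0, 1, 1.5, 2.5, 3],
    }
)

# Assumption: rows with Sales > 0 are observed, the rest need a prediction.
observed = test["Sales"] > 0
base_label = observed[observed].index[-1]  # last observed row
base_sales = test.loc[base_label, "Sales"]
base_recovery = test.loc[base_label, "Recovery"]  # assumed non-zero

# Closed form of the recurrence: base_sales * Recovery / base_recovery.
scaled = test["Recovery"] * base_sales / base_recovery

# Keep actual sales where observed, use the scaled value elsewhere.
test["Predicted"] = test["Sales"].where(observed, scaled)
print(test)
```

With several countries in one dataframe, the same steps would have to be applied per country (for example inside a groupby), since each country has its own base row.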