Python: How to iterate over rows and calculate value based on previous row

I have sales data till Jul-2020 and want to predict the next 3 months using a recovery rate. This is the dataframe:

import pandas as pd

test = pd.DataFrame({'Country': ['USA', 'USA', 'USA', 'USA', 'USA'],
                     'Month': [6, 7, 8, 9, 10],
                     'Sales': [100, 200, 0, 0, 0],
                     'Recovery': [0, 1, 1.5, 2.5, 3]
                     })

This is how it looks:

[Image: the dataframe above]

Now I want to add a "Predicted" column, resulting in this dataframe:

[Image: the dataframe with the added "Predicted" column]

The first predicted value, 300 in row 3, is 200 * 1.5 / 1. That becomes the base for the next row, so the next value, 500, is 300 * 2.5 / 1.5, and so on. How do I iterate over the rows, starting from row 3? I tried using shift() but couldn't iterate over the rows with it.

You could do it like this:

import pandas as pd

test = pd.DataFrame({'Country': ['USA', 'USA', 'USA', 'USA', 'USA'],
                     'Month': [6, 7, 8, 9, 10],
                     'Sales': [100, 200, 0, 0, 0],
                     'Recovery': [0, 1, 1.5, 2.5, 3]
                     })

# start from the actual sales; use float so fractional predictions fit the column
test['Prediction'] = test['Sales'].astype(float)
for i in range(1, len(test)):
    # prevent division by zero
    if test.loc[i-1, 'Recovery'] != 0:
        # scale the previous prediction by the ratio of the recovery rates
        test.loc[i, 'Prediction'] = test.loc[i-1, 'Prediction'] * test.loc[i, 'Recovery'] / test.loc[i-1, 'Recovery']
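
A possible variant of the loop above, as a rough sketch: run the recurrence on plain Python lists and assign the column once, which avoids repeated .loc writes. The helper name predict_from_previous is made up for illustration, and the rule that Sales == 0 marks a month to be forecast is an assumption read from the sample data, not something the question states.

```python
import pandas as pd

test = pd.DataFrame({'Country': ['USA'] * 5,
                     'Month': [6, 7, 8, 9, 10],
                     'Sales': [100, 200, 0, 0, 0],
                     'Recovery': [0, 1, 1.5, 2.5, 3]})


def predict_from_previous(sales, recovery):
    """Hypothetical helper: carry the previous prediction forward
    by the ratio of consecutive recovery rates."""
    predicted = [float(s) for s in sales]
    for i in range(1, len(predicted)):
        # Assumption: Sales == 0 means "no actual figure yet, forecast it".
        # Rows whose previous recovery rate is 0 are skipped to avoid
        # dividing by zero.
        if sales[i] == 0 and recovery[i - 1] != 0:
            predicted[i] = predicted[i - 1] * recovery[i] / recovery[i - 1]
    return predicted


test['Prediction'] = predict_from_previous(test['Sales'].tolist(),
                                           test['Recovery'].tolist())
print(test)
```

For the sample data this keeps 100 and 200 for months 6 and 7 and fills months 8 to 10 with 300, 500 and 600.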

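The sequence you have is simply Recovery multiplied by the base level (the last actual Sales value, 200). The ratios in your recurrence cancel out, and Recovery is 1 at the base row, so each prediction reduces to 200 * Recovery.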

You can compute that sequence like this:

valid_sales = test.Sales > 0
prediction = (test.Recovery * test.Sales[valid_sales].iloc[-1]).rename("Predicted")

Then combine it with the dataframe by index, either by inserting it as a column or with concat:

pd.concat([test, prediction], axis=1)
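
The base-level idea also works when Recovery at the base row is not exactly 1, because the recurrence telescopes to base Sales * Recovery / base Recovery. A minimal sketch under two assumptions that are not in the original post: the base row is the last row with Sales > 0, and actual sales should be kept wherever they exist. The names has_sales, base_idx and scale are illustrative only.

```python
import pandas as pd

test = pd.DataFrame({'Country': ['USA'] * 5,
                     'Month': [6, 7, 8, 9, 10],
                     'Sales': [100, 200, 0, 0, 0],
                     'Recovery': [0, 1, 1.5, 2.5, 3]})

# Rows that already have an actual sales figure.
has_sales = test['Sales'] > 0

# Assumption: the last row with actual sales is the base for the forecast.
base_idx = test.loc[has_sales].index[-1]

# Sales per unit of recovery at the base row
# (assumes the base Recovery value is nonzero).
scale = test.at[base_idx, 'Sales'] / test.at[base_idx, 'Recovery']

# Forecast every row, then put the actual sales back where they exist.
test['Predicted'] = (test['Recovery'] * scale).where(~has_sales, test['Sales'])
print(test)
```

Unlike the concat version above, this keeps 100 for month 6 instead of 0, since that row has an actual Sales value.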
