
What is the most efficient way to calculate a table in Python in which some columns depend on other columns and on previous values?

I am developing a model that takes a pandas DataFrame as input; each row represents a period for a given Id. I need to calculate some columns, which are the output of the model. The problem is that the columns depend on each other: D = F(A, fixed inputs), A is B at t-1 (the value of B in the previous period), and B is A - D. Because each column depends on the others and on previous values, the only way I have found to solve this is to iterate over the rows with itertuples(), but that is too slow. Is there a more efficient way to do this, perhaps without iteration?

This is the simplified initial table (the real one has more columns and operations):

         Id  Period  MoneyInitial MoneyBoP Money_EoP Money_Paid
    0  0001    1       1000         0         0         0
    1  0001    2       1000         0         0         0
    2  0001    3       1000         0         0         0
    3  0001    4       1000         0         0         0
    4  0001    5       1000         0         0         0
    5  0001    6       1000         0         0         0
    6  0001    7       1000         0         0         0
    7  0001    8       1000         0         0         0
   

The desired output would be:

  • For period 1 of each contract, MoneyBoP equals MoneyInitial; for every later period it equals Money_EoP of the previous period.
  • Money_Paid is a function of MoneyBoP and other inputs (these are already calculated in the initial table).
  • Money_EoP is MoneyBoP + Money_Paid.
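
The three rules above can be sketched as a loop over one contract's rows. This is only an illustration: `fill_contract` and `example_paid` are hypothetical names, and the 10% payment rule is an assumption standing in for the real Money_Paid function.

```python
import pandas as pd

def fill_contract(group, paid_fn):
    """Fill the three money columns for one contract (hypothetical helper)."""
    group = group.sort_values("Period").copy()
    bop_col, paid_col, eop_col = [], [], []
    eop = None
    for initial in group["MoneyInitial"].tolist():
        # Period 1 starts from MoneyInitial; later periods start from the previous Money_EoP
        bop = initial if eop is None else eop
        paid = paid_fn(bop)
        eop = bop + paid
        bop_col.append(bop)
        paid_col.append(paid)
        eop_col.append(eop)
    group["MoneyBoP"] = bop_col
    group["Money_Paid"] = paid_col
    group["Money_EoP"] = eop_col
    return group

def example_paid(bop):
    # Placeholder rule (an assumption): pay 10% of the opening balance
    return -0.1 * bop

demo = pd.DataFrame({
    "Id": ["0001"] * 3,
    "Period": [1, 2, 3],
    "MoneyInitial": [1000, 1000, 1000],
})
result = fill_contract(demo, example_paid)
```

Looping over plain Python lists and assigning whole columns at the end is usually much cheaper than writing to the DataFrame row by row. With several contracts, one option is `pd.concat([fill_contract(g, example_paid) for _, g in df.groupby("Id")])`.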

So the desired output table would be:

         Id  Period  MoneyInitial MoneyBoP Money_EoP Money_Paid
    0  0001    1       1000         1000       900      -100
    1  0001    2       1000         900        850      -50
    2  0001    3       1000         850        700      -150
    3  0001    4       1000         700        600      -100
    4  0001    5       1000         600        450      -150
    5  0001    6       1000         450        300      -150
    6  0001    7       1000         150        50       -100
    7  0001    8       1000         50         0        -50
   

It looks like all the values can be calculated from the number of periods and MoneyInitial:

import pandas as pd

# Example function to calculate Money_Paid from the beginning-of-period balance
def MoneyPaid(BoP):
    return round(-BoP * .1, 2)

# Build the (BoP, EoP, Paid) rows for n periods, carrying the balance forward
def Calculate_Data(start, n):
    d = []
    for i in range(n):
        bop = start
        mp = MoneyPaid(bop)
        start = start + mp
        d.append((bop, start, mp))
    return pd.DataFrame(d, columns=['MoneyBoP', 'Money_EoP', 'Money_Paid'])

# Assumes df holds a single Id with the default 0..n-1 index
df[['MoneyBoP','Money_EoP','Money_Paid']] = Calculate_Data(df.iloc[0]['MoneyInitial'], len(df))

The result is:

   Id  Period  MoneyInitial  MoneyBoP  Money_EoP  Money_Paid
0   1       1          1000   1000.00     900.00     -100.00
1   1       2          1000    900.00     810.00      -90.00
2   1       3          1000    810.00     729.00      -81.00
3   1       4          1000    729.00     656.10      -72.90
4   1       5          1000    656.10     590.49      -65.61
5   1       6          1000    590.49     531.44      -59.05
6   1       7          1000    531.44     478.30      -53.14
7   1       8          1000    478.30     430.47      -47.83
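
If Money_Paid really is a fixed fraction of MoneyBoP, as in the example MoneyPaid function above, the recurrence has a closed form, MoneyBoP at period k equals MoneyInitial * (1 - rate)^(k - 1), and the loop can be dropped entirely. The following is a minimal sketch under that assumption; `RATE` and `fill_vectorized` are hypothetical names, and it does not round each payment, so values can differ slightly from the table above (for example 531.441 instead of 531.44).

```python
import pandas as pd

RATE = 0.1  # assumed fixed fraction paid each period (hypothetical)

def fill_vectorized(df, rate=RATE):
    """Closed-form fill; valid only when Money_Paid = -rate * MoneyBoP."""
    out = df.sort_values(["Id", "Period"]).copy()
    # 0-based position of each row within its contract
    k = out.groupby("Id").cumcount()
    out["MoneyBoP"] = out["MoneyInitial"] * (1 - rate) ** k
    out["Money_Paid"] = -rate * out["MoneyBoP"]
    out["Money_EoP"] = out["MoneyBoP"] + out["Money_Paid"]
    return out

demo = pd.DataFrame({
    "Id": ["A", "A", "A", "B", "B", "B"],
    "Period": [1, 2, 3, 1, 2, 3],
    "MoneyInitial": [1000, 1000, 1000, 500, 500, 500],
})
result = fill_vectorized(demo)
```

This handles any number of Ids in one pass. When Money_Paid depends on other per-row inputs in a way that has no closed form, a per-contract loop like Calculate_Data is still needed.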
