
How to vectorize an operation which uses previous values in Pandas

Based on this answer, I have not been able to solve the following problem.

Is there a way to vectorize the Value End Of Period (VEoP) column?

import pandas as pd

terms = pd.date_range(start = '2022-01-01', periods=12, freq='YS', normalize=True)
r = pd.DataFrame({
    'Return':   [1.063, 1.053, 1.008, 0.98, 1.04, 1.057, 1.073, 1.027, 1.025, 1.068, 1.001, 0.983],
    'Cashflow': [6, 0, 0, 8, -1, -1, -1, -1, -1, -1, -1, -1]
    },index=terms.strftime('%Y'))
r.index.name = 'Date'

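r['VEoP'] = 0.0  # float column, since the results are not integers
veop = r.columns.get_loc('VEoP')
for y in range(r.index.size):
    prev = 0 if y == 0 else r.iloc[y - 1, veop]
    # a single .iloc call avoids chained assignment, which may not write back to r
    r.iloc[y, veop] = (prev + r['Cashflow'].iloc[y]) * r['Return'].iloc[y]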

r

    Return  Cashflow    VEoP
Date                          
2022  1.0630         6  6.3780
2023  1.0530         0  6.7160
2024  1.0080         0  6.7698
2025  0.9800         8 14.4744
2026  1.0400        -1 14.0133
2027  1.0570        -1 13.7551
2028  1.0730        -1 13.6862
2029  1.0270        -1 13.0288
2030  1.0250        -1 12.3295
2031  1.0680        -1 12.0999
2032  1.0010        -1 11.1110
2033  0.9830        -1  9.9391

Vectorization is limited when each value depends on the one before it, because the computation is inherently sequential and cannot be parallelized.
Your for loop may therefore perform just as well as this "vectorization":

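import numpy as np

r['VEoP'] = np.frompyfunc(
    lambda prev, x: (prev + x.Cashflow) * x.Return,
    2, 1,  # nin, nout
).accumulate(
    [0, *r.to_records()],  # leading 0 is the starting value before the first period
    dtype=object,  # temporary conversion
).astype(float)[1:]  # drop the starting value again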

You can read more about np.frompyfunc and np.ufunc.accumulate in the NumPy documentation.
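
For this particular recurrence there is also a closed form that can be written with cumprod and cumsum alone. This is a sketch rather than part of the answer above, and it rests on two assumptions: no Return is zero, and the series is short enough that the cumulative product neither overflows nor loses too much precision. Unrolling VEoP[t] = (VEoP[t-1] + Cashflow[t]) * Return[t] with P[t] = Return[1] * ... * Return[t] gives VEoP[t] = P[t] * sum over k <= t of Cashflow[k] * Return[k] / P[k]. The name growth is only illustrative.

```python
import numpy as np
import pandas as pd

r = pd.DataFrame({
    'Return':   [1.063, 1.053, 1.008, 0.98, 1.04, 1.057, 1.073, 1.027, 1.025, 1.068, 1.001, 0.983],
    'Cashflow': [6, 0, 0, 8, -1, -1, -1, -1, -1, -1, -1, -1],
}, index=pd.Index([str(y) for y in range(2022, 2034)], name='Date'))

# Cumulative growth factor P[t] = Return[1] * ... * Return[t]
# (assumption: no Return is zero, otherwise the division below breaks down)
growth = r['Return'].cumprod()

# Closed form of the recurrence: P[t] * cumsum(Cashflow[k] * Return[k] / P[k])
r['VEoP'] = growth * (r['Cashflow'] * r['Return'] / growth).cumsum()

# Sanity check against the sequential recurrence, written as a plain loop
expected, prev = [], 0.0
for cashflow, ret in zip(r['Cashflow'], r['Return']):
    prev = (prev + cashflow) * ret
    expected.append(prev)

assert np.allclose(r['VEoP'].to_numpy(), expected)
```

Whether this is worth using depends on the data: for long series or returns far from 1, the cumulative product can drift numerically, in which case the sequential loop is the safer choice.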
