简体   繁体   中英

Pandas - Calculate rolling cumulative product with variable window size

I have some financial time series data that I would like to calculate the rolling cumulative product with a variable window size.

What I am trying to accomplish is using the following formula but instead of having window fixed at 12, I would like to use the value stored in the last column of the dataframe labeled 'labels_y' which will change over time.

df= (1 + df).rolling(window=12).apply(np.prod, raw=True) - 1

A sample of the data:

Out[102]: 
            div_yield  earn_variab    growth  ...     value  volatility  labels_y
date                                          ...                                
2004-02-23  -0.001847     0.003252 -0.001264  ...  0.004368   -0.004490       2.0
2004-02-24  -0.001668     0.007404  0.002108  ... -0.006122    0.008183       2.0
2004-02-25  -0.003272     0.004596  0.001283  ... -0.002057    0.005912       3.0
2004-02-26   0.001818    -0.003397 -0.003190  ...  0.001327   -0.003908       3.0
2004-02-27  -0.002838     0.009879  0.000808  ...  0.000350    0.010557       3.0
[5 rows x 11 columns]

and the final result should look like:

Out[104]: 
            div_yield  earn_variab    growth  ...     value  volatility  labels_y
date                                          ...                                
2004-02-23        NaN          NaN       NaN  ...       NaN         NaN       NaN
2004-02-24  -0.003512     0.010680  0.000841  ... -0.001781    0.003656       8.0
2004-02-25  -0.006773     0.015325  0.002125  ... -0.003834    0.009589      35.0
2004-02-26  -0.003126     0.008596  0.000193  ... -0.006851    0.010180      47.0
2004-02-27  -0.004294     0.011075 -0.001104  ... -0.000383    0.012559      63.0
[5 rows x 11 columns]

Rows 1 and 2 are calculated with a 2 day rolling window and rows 3, 4 and 5 use a 3 day window

I have tried using

def get_window(row):
    return (1 + row).rolling(window=int(row['labels_y'])).apply(np.prod, raw=True) - 1

df = df.apply(get_window, axis=1)

I realize that calculates the cumupative product in the wrong direction. I am struggling on how to get this to work.

Any help would be hugely appreciated.

Thanks


def get_window(row, df):
    return (1 + df).rolling(window=int(row['labels_y'])).apply(np.prod, raw=True).loc[row.name]-1


result = df1.apply(get_window, axis=1, df=df1)

Does this do the trick? Highly inefficient, but I don't see another way except for tedious for-loops.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM