
How to reduce the runtime of a slow pandas rolling().apply() on multiple columns - pandas

I am working with time series data and am trying to compute a rolling percentage change on it.

Here is a snapshot of the data:

Time                     EX   SC    WH    YE   Lt   Ub     Yl_2   Wm    Wm_2  value
2016-02-15 11:54:00 UTC  4.4  0.14  8.38  755  232  0.009  0.11   1428  1020  FALSE
2016-02-15 11:55:00 UTC  4.4  0.14  8.38  755  232  0.009  0.111  1436  1018  FALSE
2016-02-15 11:56:00 UTC  4.4  0.14  8.38  755  232  0.014  0.113  1471  1019  FALSE
2016-02-15 11:57:00 UTC  4.4  0.14  8.37  755  232  0.015  0.111  1457  1015  FALSE
2016-02-15 11:58:00 UTC  4.4  0.14  8.38  755  232  0.013  0.111  1476  1019  FALSE
2016-02-15 11:59:00 UTC  4.4  0.14  8.36  755  232  0.013  0.114  1416  1015  FALSE

The shape of the data is (122334, 10).

Here is my function:

import numpy as np

def percent_change(series):
    # Work on a plain array so positional indexing is valid for Series and ndarray input
    values = np.asarray(series)
    # Collect all *but* the last value of this window, then the final value
    previous_values = values[:-1]
    last_value = values[-1]

    # Calculate the % difference between the last value and the mean of earlier values
    previous_mean = np.mean(previous_values)
    return (last_value - previous_mean) / previous_mean

Applying the function here:

df2 = df.rolling(10).apply(percent_change)

This takes forever. What am I doing wrong, or how should I do it instead?

Thanks

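rolling().apply() calls the Python function once per window for every column, which is what makes it slow on a frame this size. Here is an approach that uses shift() and the built-in rolling().mean() instead, so the mean is computed in vectorized code. The example uses a window of the 3 preceding values: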

import pandas as pd

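def rolling_pct_change(df, field, window=3):
    # Work on a copy so the caller's frame is not modified
    t = df.copy()
    # Mean of the `window` values before each row; shift(1) excludes the current row
    t['mean'] = t[field].shift(1).rolling(window).mean()
    # Relative difference between the current value and that mean
    t['pct_change'] = (t[field] - t['mean']) / t['mean']
    return t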

df = pd.DataFrame({'x': [*range(10)]})
df2 = rolling_pct_change(df, 'x')
print(df2)

   x  mean  pct_change
0  0   NaN         NaN
1  1   NaN         NaN
2  2   NaN         NaN
3  3   1.0    2.000000
4  4   2.0    1.000000
5  5   3.0    0.666667
6  6   4.0    0.500000
7  7   5.0    0.400000
8  8   6.0    0.333333
9  9   7.0    0.285714
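
The same idea can be extended to every numeric column at once, with the window from the question (10 rows, meaning the last value is compared with the mean of the 9 values before it). The following is only a sketch under some assumptions: the helper name rolling_pct_change_all and the synthetic demo frame are made up for illustration, and non-numeric columns such as the timestamp and the boolean value column are simply skipped. It also checks the result against the original rolling().apply() formulation on a small frame.

```python
import numpy as np
import pandas as pd


def rolling_pct_change_all(df, window=10):
    """Percent change of each value relative to the mean of the
    (window - 1) values before it, for every numeric column.

    Hypothetical helper, not part of pandas.
    """
    # Keep only numeric columns; timestamps and booleans are left out.
    numeric = df.select_dtypes(include="number")
    # shift(1) drops the current row from the window, so the rolling mean
    # covers only the preceding (window - 1) rows.
    prev_mean = numeric.shift(1).rolling(window - 1).mean()
    return (numeric - prev_mean) / prev_mean


# Synthetic stand-in data; only the column names are borrowed from the question.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.uniform(1.0, 2.0, size=(50, 3)), columns=["EX", "SC", "WH"])

fast = rolling_pct_change_all(demo, window=10)

# Reference: the per-window formulation from the question, applied to raw NumPy windows.
slow = demo.rolling(10).apply(
    lambda w: (w[-1] - w[:-1].mean()) / w[:-1].mean(), raw=True
)

# Both versions should agree, including where the leading NaN rows fall.
print(np.allclose(fast.to_numpy(), slow.to_numpy(), equal_nan=True))
```

If a custom function is unavoidable, passing raw=True to apply(), as in the reference line above, hands each window over as a NumPy array rather than a Series, which usually reduces the per-window overhead, although it still runs Python code for every window.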
