Calculated column in Pandas Dataframe - calculating max of future values

I'm trying to add a new column to a pandas DataFrame that holds the maximum price over all following records in the dataset, i.e. the maximum from the row after the current one through to the end of the dataset.

The dataset looks like this:

datetime             price  max_future_price
2021-02-25 10:00:00  10.00
2021-02-25 10:00:01  10.01
2021-02-25 10:00:02  10.00
2021-02-25 10:00:03   9.99

I am currently using a for loop that slices the rest of the column on every row (bad, I know), and it takes forever on larger datasets. Is there a better, more scalable solution? I have spent a fair few hours searching and trying to trial-and-error my way through it with no luck. Thanks!

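# Slow: every iteration rescans the rest of the column, so this is O(n^2)
for row in range(len(df)):
    max_future_price = df.price.iloc[row+1:].max()  # NaN for the last row
    max_future_return = round((max_future_price - df.price.iloc[row]) / df.price.iloc[row], 4)
    df.loc[df.index[row], 'max_future_price'] = max_future_price
    df.loc[df.index[row], 'max_future_return'] = max_future_return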

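You can reverse your price column and use cummax to get a running maximum from the end of the dataset. Reversing the result back and shifting it up by one row excludes the current row, which gives max_future_price (NaN for the last row, which has no future rows).

# running max from the bottom up, restored to the original row order,
# then shifted up one row so each row only sees the rows after it
df['max_future_price'] = df['price'].iloc[::-1].cummax().iloc[::-1].shift(-1)
df['max_future_return'] = df.max_future_price.subtract(df.price).divide(df.price).round(4)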
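
To check the reversed-cummax idea against the question's loop, here is a minimal sketch on the four sample rows from the question. It assumes "future" means strictly after the current row, which is what the loop's `iloc[row+1:]` does, so the result is shifted up by one and the last row ends up as NaN. The intermediate name `max_from_here` is only for illustration.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2021-02-25 10:00:00",
        "2021-02-25 10:00:01",
        "2021-02-25 10:00:02",
        "2021-02-25 10:00:03",
    ]),
    "price": [10.00, 10.01, 10.00, 9.99],
})

# Running max computed from the last row upwards, then put back in the
# original order. Each row now holds the max of itself and all later rows.
max_from_here = df["price"].iloc[::-1].cummax().iloc[::-1]

# Assumption: the current row must be excluded, so shift up by one row.
# The last row has nothing after it and becomes NaN.
df["max_future_price"] = max_from_here.shift(-1)

# Relative return from the current price to the best future price.
df["max_future_return"] = (
    (df["max_future_price"] - df["price"]) / df["price"]
).round(4)

# Brute-force reference, equivalent to the loop in the question.
brute_force = [df["price"].iloc[i + 1:].max() for i in range(len(df))]

print(df)
# max_future_price column: 10.01, 10.00, 9.99, NaN
```

This does a single pass over the column instead of rescanning the tail for every row, so it should scale roughly linearly with the number of rows.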
