Calculated column in Pandas Dataframe - calculating max of future values

Question

I'm trying to add a new column to a Pandas DataFrame that holds the maximum value of all of the following records in the dataset, i.e. the maximum from the current row + 1 to the end of the dataset.

The dataset looks like this:

datetime	price	max_future_price
2021-02-25 10:00:00	10.00
2021-02-25 10:00:01	10.01
2021-02-25 10:00:02	10.00
2021-02-25 10:00:03	09.99

I am using a for loop and the shift function (bad, I know), but it takes forever with larger datasets. Is there a better, more scalable solution? I have spent a fair few hours searching and working through it by trial and error with no luck. Thanks!

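for row in range(len(df)):
    # Max over every row after the current one (NaN for the last row)
    max_future_price = df.price.iloc[row+1:].max()
    # Return relative to the current price, rounded to 4 decimal places
    max_future_return = round(((max_future_price - df.price.iloc[row])/df.price.iloc[row]),4)
    # Positional write that avoids chained assignment
    df.iloc[row, df.columns.get_loc('max_future_price')] = max_future_return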

Answer 1

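You can reverse your price column and use cummax to get a running maximum from the end of the dataset. Reverse the result back to the original order, then shift it up by one row so that each row only sees the rows after it. The last row has no future values, so it gets NaN.

df['max_future_price'] = df['price'].iloc[::-1].cummax().iloc[::-1].shift(-1)
df['max_future_return'] = df.max_future_price.subtract(df.price).divide(df.price).round(4)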
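
For illustration, here is a minimal, self-contained sketch of the reverse-cummax idea using the sample data from the question. It assumes "future" means strictly later rows, so the reversed running maximum is flipped back to the original order and shifted up by one row. The helper name add_max_future is hypothetical and not part of pandas.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            [
                "2021-02-25 10:00:00",
                "2021-02-25 10:00:01",
                "2021-02-25 10:00:02",
                "2021-02-25 10:00:03",
            ]
        ),
        "price": [10.00, 10.01, 10.00, 9.99],
    }
)


def add_max_future(frame, col="price"):
    """Hypothetical helper: add max-of-later-rows and the return against it."""
    out = frame.copy()
    # Running max from the end, flipped back into the original row order.
    # At this point each row still includes its own price.
    suffix_max = out[col].iloc[::-1].cummax().iloc[::-1]
    # Shift up one row so each row only sees strictly later rows;
    # the last row has nothing after it and becomes NaN.
    out["max_future_price"] = suffix_max.shift(-1)
    # Relative return, rounded to 4 places as in the question's loop
    out["max_future_return"] = (
        (out["max_future_price"] - out[col]) / out[col]
    ).round(4)
    return out


result = add_max_future(df)
print(result)
```

This makes a single pass over the column instead of recomputing a slice maximum for every row, which is where the loop version loses time on large datasets.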
