I have a CSV file with two columns, Date and Price. I want to create a third column with the maximum value of "Price" over the last 5 days. Not the last 5 rows or index positions, but 5 days.
Content of "example.csv"
Date Price
2018-07-23 124.44
2018-07-24 125.49
2018-07-25 123.26
2018-07-31 124.08
2018-08-01 125.10
2018-08-04 121.41
2018-08-05 119.17
2018-08-06 118.58
It should look like this:
Date Price High5
2018-07-23 124.44 124.44
2018-07-24 125.49 125.49
2018-07-25 123.26 125.49
2018-07-31 124.08 124.08
2018-08-01 125.10 125.10
2018-08-04 121.41 125.10
2018-08-05 119.17 125.10
2018-08-06 118.58 121.41
With this code I get the max of the whole "Price" column for every row.
import pandas as pd
df = pd.read_csv('example.csv', parse_dates=True, index_col=0)
df['High5'] = df['Price'].max()
print(df)
With this code I get the max of the last 5 days ending with 2018-08-06 for all rows.
import pandas as pd
df = pd.read_csv('example.csv', parse_dates=True, index_col=0)
rng = pd.date_range(end='2018-08-06', periods=5, freq='D')
df['High5'] = df['Price'].loc[rng].max()
print(df['High5'])
I don't want the same value for all rows, and I know it's wrong to work with a fixed end date. But with my beginner's knowledge I don't know how to solve it.
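You are looking for rolling with a time-based window on a datetime index:
# assumes 'Date' is still a regular column (i.e. the file was read without index_col=0)
df = df.set_index('Date')
df.index = pd.to_datetime(df.index)
df.rolling('5D').max()
# df = df.rolling('5D').max().reset_index()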
Out[62]:
Price
Date
2018-07-23 124.44
2018-07-24 125.49
2018-07-25 125.49
2018-07-31 124.08
2018-08-01 125.10
2018-08-04 125.10
2018-08-05 125.10
2018-08-06 121.41
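To get this as a separate High5 column next to Price rather than overwriting it, something like the following should work. It is only a minimal sketch: the sample data is inlined through io.StringIO and assumed to be comma-separated (the real example.csv may use a different delimiter), and the High5Rows column is a hypothetical name added purely to show how a 5-row window differs from a 5-day window.

```python
import io

import pandas as pd

# Inline copy of the sample data. The comma-separated layout is an assumption.
csv_text = """Date,Price
2018-07-23,124.44
2018-07-24,125.49
2018-07-25,123.26
2018-07-31,124.08
2018-08-01,125.10
2018-08-04,121.41
2018-08-05,119.17
2018-08-06,118.58
"""

# Parse the Date column and use it as a DatetimeIndex (needed for offset windows).
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

# Time-based window: each row sees the rows dated within the 5 days up to and
# including its own date (the window start is excluded by default).
df["High5"] = df["Price"].rolling("5D").max()

# Row-based window, for comparison only: always the last 5 rows, ignoring date gaps.
df["High5Rows"] = df["Price"].rolling(5, min_periods=1).max()

print(df)
```

On 2018-07-31 the two columns differ: the 5-day window contains only that day's price, while the 5-row window still reaches back to 2018-07-24.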