I have this line of code, which takes the last value of the previous day and repeats it on every row of the next day in a new column. It works fine.
df = df.join(df.resample('B', on='Date')['x'].last().rename('xnew'), on=pd.to_datetime((df['Date'] - pd.tseries.offsets.BusinessDay()).dt.date))
Now I need something similar, but I can't get it working.
I now need the first value of the day in 'Open', copied into every row of a new column 'opening', for each day.
I tried this, but it doesn't work:
df = df.join(df.resample('B', on='Date')['Open'].last().rename('opening'), on=pd.to_datetime((df['Date'])))
error:
ValueError: columns overlap but no suffix specified: Index(['opening'], dtype='object')
How could I accomplish this?
With:
opening = df.resample('B', on='Date')['Open'].first()
Date
2019-06-20 2927.25
2019-06-21 2932.75
2019-06-24 2942.00
2019-06-25 2925.00
2019-06-26 2902.75
...
2020-06-17 3116.50
2020-06-18 3091.50
2020-06-19 3101.75
2020-06-22 3072.75
2020-06-23 3111.25
...I get the first values, and my desired output is:
Date Open opening
1 2020-06-24 07:00:00 3091.50 3111.25
2 2020-06-24 07:05:00 3092.50 3111.25
3 2020-06-24 07:10:00 3090.25 3111.25
4 2020-06-24 07:15:00 3089.75 3111.25
Here's some sample data. For this example, each day runs only from 07:00 to 07:15:
Time Open
Date
2019-06-20 07:00:00 70000 2927.25
2019-06-20 07:05:00 70500 2927.00
2019-06-20 07:10:00 71000 2927.00
2019-06-20 07:15:00 71500 2926.75
2019-06-21 07:00:00 70000 2932.75
2019-06-21 07:05:00 70500 2932.25
2019-06-21 07:10:00 71000 2933.00
2019-06-21 07:15:00 71500 2930.75
2019-06-24 07:00:00 70000 2942.00
2019-06-24 07:05:00 70500 2941.50
2019-06-24 07:10:00 71000 2942.00
2019-06-24 07:15:00 71500 2941.50
2019-06-25 07:00:00 70000 2925.00
2019-06-25 07:05:00 70500 2925.75
2019-06-25 07:10:00 71000 2926.50
2019-06-25 07:15:00 71500 2926.00
2019-06-26 07:00:00 70000 2902.75
2019-06-26 07:05:00 70500 2903.00
2019-06-26 07:10:00 71000 2904.00
2019-06-26 07:15:00 71500 2904.25
I started with an approach similar to yours, using resample. What I added is shifting all the values so that each one has the next day as its index. Then I can feed these values to Series.map, applied to the date.
Here is the code:
df['opening'] = df.Date.dt.date.map(df.resample('B', on='Date').Open.first().shift())
Date Open opening
0 2019-06-20 07:00:00 2927.25
1 2019-06-20 07:05:00 2927.0
2 2019-06-20 07:10:00 2927.0
3 2019-06-20 07:15:00 2926.75
4 2019-06-21 07:00:00 2932.75 2927.25
5 2019-06-21 07:05:00 2932.25 2927.25
6 2019-06-21 07:10:00 2933.0 2927.25
7 2019-06-21 07:15:00 2930.75 2927.25
8 2019-06-24 07:00:00 2942.0 2932.75
9 2019-06-24 07:05:00 2941.5 2932.75
10 2019-06-24 07:10:00 2942.0 2932.75
11 2019-06-24 07:15:00 2941.5 2932.75
12 2019-06-25 07:00:00 2925.0 2942.0
Of course, the rows of the first day will have NaN.
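For reference, here is a self-contained sketch of the same resample, shift and map idea on a few rows from the question's sample data. It makes two assumptions that are not in the code above: it uses `dt.normalize()` instead of `dt.date`, so the lookup keys stay `Timestamp` values like the index that resample produces, and it adds a second column under the hypothetical name `opening_same_day` in case the opening of the same day is wanted instead of the previous day's. Assigning the column directly also sidesteps `join`, whose "columns overlap" error typically means the DataFrame already had an 'opening' column, for example from an earlier run.

```python
import pandas as pd

# A few rows from the question's sample data (two bars per day for brevity).
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2019-06-20 07:00:00", "2019-06-20 07:05:00",
        "2019-06-21 07:00:00", "2019-06-21 07:05:00",
        "2019-06-24 07:00:00", "2019-06-24 07:05:00",
    ]),
    "Open": [2927.25, 2927.00, 2932.75, 2932.25, 2942.00, 2941.50],
})

# First 'Open' of each business day; the result is indexed by midnight Timestamps.
daily_open = df.resample("B", on="Date")["Open"].first()

# Previous business day's opening: shift the daily series down by one row,
# then look each row up by its normalized (midnight) date.
# Assumption: dt.normalize() is used here in place of dt.date.
df["opening"] = df["Date"].dt.normalize().map(daily_open.shift())

# Same-day opening (hypothetical column name): broadcast each day's first
# 'Open' to every row of that day.
df["opening_same_day"] = (
    df.groupby(df["Date"].dt.normalize())["Open"].transform("first")
)

print(df)
```

Note that a business day with no rows at all (a holiday, say) shows up as NaN in the resampled series, so after the shift the following day would get NaN as well.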