I have a series of data at half-hour intervals. I need a rolling average over 4 whole weeks of Tuesday, Wednesday, and Thursday values, computed separately for each half-hour interval in the dataset. So the first 'window' would hold averages for times 00:00:00, 00:30:00, ..., 23:00:00, 23:30:00 over weeks 1-4, the next window would hold averages over weeks 2-5, and so on.
The dataset below contains only Tuesdays, Wednesdays, and Thursdays (other days are not used in calculating the averages). Within those days the data is in half-hour intervals, but the sample includes only the 00:00:00, 00:30:00, 01:00:00, and 01:30:00 intervals.
datetime timeblock speed
1/3/2017 0:00 0:00:00 81.186885
1/3/2017 0:30 0:30:00 NaN
1/3/2017 1:00 1:00:00 85.277724
1/3/2017 1:30 1:30:00 85.077176
1/4/2017 0:00 0:00:00 80.691608
1/4/2017 0:30 0:30:00 79.223225
1/4/2017 1:00 1:00:00 82.330169
1/4/2017 1:30 1:30:00 79.495578
1/5/2017 0:00 0:00:00 74.162426
1/5/2017 0:30 0:30:00 75.206492
1/5/2017 1:00 1:00:00 77.6484
1/5/2017 1:30 1:30:00 72.61875
1/10/2017 0:00 0:00:00 77.785555
1/10/2017 0:30 0:30:00 80.617395
1/10/2017 1:00 1:00:00 80.094947
1/10/2017 1:30 1:30:00 77.697473
1/11/2017 0:00 0:00:00 74.7104
1/11/2017 0:30 0:30:00 75.691326
1/11/2017 1:00 1:00:00 74.639803
1/11/2017 1:30 1:30:00 81.797268
1/12/2017 0:00 0:00:00 79.571042
1/12/2017 0:30 0:30:00 78.083612
1/12/2017 1:00 1:00:00 78.747287
1/12/2017 1:30 1:30:00 78.128129
1/17/2017 0:00 0:00:00 76.509323
1/17/2017 0:30 0:30:00 77.256
1/17/2017 1:00 1:00:00 78.627085
1/17/2017 1:30 1:30:00 81.588
1/18/2017 0:00 0:00:00 77.82543
1/18/2017 0:30 0:30:00 80.231272
1/18/2017 1:00 1:00:00 NaN
1/18/2017 1:30 1:30:00 74.656384
1/19/2017 0:00 0:00:00 77.37165
1/19/2017 0:30 0:30:00 80.328705
1/19/2017 1:00 1:00:00 80.011531
1/19/2017 1:30 1:30:00 79.643781
1/24/2017 0:00 0:00:00 81.167016
1/24/2017 0:30 0:30:00 NaN
1/24/2017 1:00 1:00:00 83.128695
1/24/2017 1:30 1:30:00 77.799428
1/25/2017 0:00 0:00:00 73.106437
1/25/2017 0:30 0:30:00 71.316
1/25/2017 1:00 1:00:00 75.966
1/25/2017 1:30 1:30:00 74.345225
1/26/2017 0:00 0:00:00 78.768
1/26/2017 0:30 0:30:00 80.436508
1/26/2017 1:00 1:00:00 76.782222
1/26/2017 1:30 1:30:00 76.168687
1/31/2017 0:00 0:00:00 73.780363
1/31/2017 0:30 0:30:00 72.32356
1/31/2017 1:00 1:00:00 74.119404
1/31/2017 1:30 1:30:00 72.412363
2/1/2017 0:00 0:00:00 75.572408
2/1/2017 0:30 0:30:00 72.486593
2/1/2017 1:00 1:00:00 77.357
2/1/2017 1:30 1:30:00 74.134188
2/2/2017 0:00 0:00:00 72.209382
2/2/2017 0:30 0:30:00 75.792807
2/2/2017 1:00 1:00:00 74.167605
2/2/2017 1:30 1:30:00 78.053373
I've tried the following code, but it does not give the desired results:
roll_mean = sample.groupby('timeblock')['speed'].rolling('30D', min_value = '30D').mean()
The desired results should be the following:
Window 00:00:00 00:30:00 01:00:00 01:30:00
1 (wks 1-4) 77.74 NaN NaN 78.25
2 (wks 2-5) 76.53 NaN NaN 77.20
Thank you in advance
Edit: Grammar/clarification
In[1]: sample.index
Out[1]:
DatetimeIndex(['2017-01-03 00:00:00', '2017-01-03 00:30:00',
'2017-01-03 01:00:00', '2017-01-03 01:30:00',
'2017-01-03 02:00:00', '2017-01-03 02:30:00',
'2017-01-03 03:00:00', '2017-01-03 03:30:00',
'2017-01-03 04:00:00', '2017-01-03 04:30:00',
...
'2017-12-28 19:00:00', '2017-12-28 19:30:00',
'2017-12-28 20:00:00', '2017-12-28 20:30:00',
'2017-12-28 21:00:00', '2017-12-28 21:30:00',
'2017-12-28 22:00:00', '2017-12-28 22:30:00',
'2017-12-28 23:00:00', '2017-12-28 23:30:00'],
dtype='datetime64[ns]', name='datetime', length=7488, freq=None)
In[2]: sample.dtypes
Out[2]:
timeblock object
speed float64
dtype: object
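I was able to get the results I needed by pivoting the timeblocks into columns, averaging within each week, and then taking a rolling mean over 4 weeks:
# One row per date, one column per timeblock (this relies on `toll` having a datetime 'date' column)
toll = pd.pivot_table(toll, columns='timeblock', index='date', values='speed')
# Weekly mean per timeblock, then the mean of each run of 4 consecutive weekly means
toll = toll.resample('W').mean().rolling(4).mean()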
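For anyone who wants to try this without the full dataset, here is a minimal self-contained sketch of the same pivot / weekly resample / rolling idea, using only the 0:00:00 and 0:30:00 values from the sample in the question. The names `long_df`, `wide`, `weekly`, and `weekly_strict` are made up for illustration, and the 'date' column is built by hand here, so treat it as an assumption about how the real frame is laid out. One thing to be aware of: `mean()` skips NaN by default, so a week with a missing reading still gets a weekly average. The last few lines show one possible way to blank out such weeks so that NaN carries through to the window, as in the desired output.

```python
import numpy as np
import pandas as pd

# Tue/Wed/Thu dates for the five weeks shown in the question's sample
dates = pd.to_datetime([
    "2017-01-03", "2017-01-04", "2017-01-05",
    "2017-01-10", "2017-01-11", "2017-01-12",
    "2017-01-17", "2017-01-18", "2017-01-19",
    "2017-01-24", "2017-01-25", "2017-01-26",
    "2017-01-31", "2017-02-01", "2017-02-02",
])

# Speeds copied from the sample for two of the timeblocks
speed_0000 = [81.186885, 80.691608, 74.162426, 77.785555, 74.7104,
              79.571042, 76.509323, 77.82543, 77.37165, 81.167016,
              73.106437, 78.768, 73.780363, 75.572408, 72.209382]
speed_0030 = [np.nan, 79.223225, 75.206492, 80.617395, 75.691326,
              78.083612, 77.256, 80.231272, 80.328705, np.nan,
              71.316, 80.436508, 72.32356, 72.486593, 75.792807]

# Long format: one row per (date, timeblock); the 'date' column is an assumption
long_df = pd.DataFrame({
    "date": list(dates) * 2,
    "timeblock": ["0:00:00"] * 15 + ["0:30:00"] * 15,
    "speed": speed_0000 + speed_0030,
})

# One row per date, one column per timeblock
wide = long_df.pivot_table(index="date", columns="timeblock", values="speed")

# Weekly mean per timeblock (NaN readings are skipped), then 4-week rolling mean
weekly = wide.resample("W").mean()
rolling = weekly.rolling(4).mean()
print(rolling.round(2))

# Optional: blank out any week that has a missing reading, so the NaN
# propagates into every 4-week window that contains that week
complete = wide.resample("W").count().eq(wide.resample("W").size(), axis=0)
weekly_strict = weekly.where(complete)
rolling_strict = weekly_strict.rolling(4).mean()
print(rolling_strict.round(2))
```

The first three rows of each result are NaN because `rolling(4)` needs four weekly values before it produces a number. Note also that this is a mean of weekly means, which equals the pooled mean over all days only when every week contributes the same number of readings.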