
In pandas, how can I group by weekday() for a datetime column?

I'd like to filter out weekend data and only look at data for weekdays (Mon (0) to Fri (4)). I'm new to pandas; what's the best way to accomplish this?

import datetime
from pandas import *

data = read_csv("data.csv")
data.my_dt 

Out[52]:
0     2012-10-01 02:00:39
1     2012-10-01 02:00:38
2     2012-10-01 02:01:05
3     2012-10-01 02:01:07
4     2012-10-01 02:02:03
5     2012-10-01 02:02:09
6     2012-10-01 02:02:03
7     2012-10-01 02:02:35
8     2012-10-01 02:02:33
9     2012-10-01 02:03:01
10    2012-10-01 02:08:53
11    2012-10-01 02:09:04
12    2012-10-01 02:09:09
13    2012-10-01 02:10:20
14    2012-10-01 02:10:45
...

I'd like to do something like:

weekdays_only = data[data.my_dt.weekday() < 5]

AttributeError: 'numpy.int64' object has no attribute 'weekday'

but this doesn't work; I haven't quite grasped how the datetime objects in a column are accessed.

The eventual goal is to arrange the data hierarchically by weekday and hour range, something like:

monday, 0-6, 7-12, 13-18, 19-23
tuesday, 0-6, 7-12, 13-18, 19-23

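Your call to weekday() does not work because it operates on the index of data.my_dt, which is an int64 array, not on the datetime values in the column (this is where the error message comes from).

You could create a new column in data containing the weekdays using something like the line below. This assumes my_dt holds datetime objects rather than strings, for example because the file was read with read_csv("data.csv", parse_dates=["my_dt"]):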

data['weekday'] = data['my_dt'].apply(lambda x: x.weekday())

Then you can filter for weekdays with:

weekdays_only = data[data['weekday'] < 5]

I hope this helps
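As a hedged aside: recent pandas versions expose datetime fields on a column through the .dt accessor, which avoids the per-row lambda. The sketch below is self-contained and uses a few made-up rows in place of data.csv (only the my_dt column name comes from the question); it assumes the column may arrive as strings and converts it first.

```python
import pandas as pd

# Hypothetical sample rows standing in for data.csv.
data = pd.DataFrame({
    "my_dt": [
        "2012-10-01 02:00:39",  # Monday
        "2012-10-05 13:15:00",  # Friday
        "2012-10-06 09:30:00",  # Saturday
        "2012-10-07 21:45:10",  # Sunday
    ]
})

# Without parse_dates, read_csv leaves dates as strings, so convert explicitly.
data["my_dt"] = pd.to_datetime(data["my_dt"])

# .dt.weekday is vectorized: Monday is 0, Sunday is 6.
data["weekday"] = data["my_dt"].dt.weekday
weekdays_only = data[data["weekday"] < 5]
print(weekdays_only)
```

Only the Monday and Friday rows should remain in weekdays_only.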

A faster way is to use DatetimeIndex.weekday, like so:

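import pandas as pd  # the question used "from pandas import *", which does not define pd

temp = pd.DatetimeIndex(data['my_dt'])  # parses the column into a DatetimeIndex
data['weekday'] = temp.weekday          # Monday is 0, Sunday is 6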

This is much faster than apply with a lambda, especially for a large number of rows, because the weekday is computed for the whole column at once. For further info, check this answer.
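For the eventual goal in the question (weekday, then hour range), one possible approach is to bin the hour with pd.cut and group on both keys. This is only a sketch under assumptions: the sample rows are invented, the hour_range column name is hypothetical, and counting rows with size() is just one choice of aggregate.

```python
import pandas as pd

# Hypothetical sample rows; in the question these would come from data.csv.
data = pd.DataFrame({
    "my_dt": pd.to_datetime([
        "2012-10-01 02:00:39",  # Monday, hour 2
        "2012-10-01 08:10:00",  # Monday, hour 8
        "2012-10-02 14:30:00",  # Tuesday, hour 14
        "2012-10-02 20:05:00",  # Tuesday, hour 20
        "2012-10-06 09:00:00",  # Saturday, dropped below
    ])
})

# Keep Monday (0) through Friday (4) only.
weekdays_only = data[data["my_dt"].dt.weekday < 5].copy()
weekdays_only["weekday"] = weekdays_only["my_dt"].dt.day_name()

# Right-inclusive bins: (-1, 6], (6, 12], (12, 18], (18, 23]
# correspond to hours 0-6, 7-12, 13-18, 19-23.
weekdays_only["hour_range"] = pd.cut(
    weekdays_only["my_dt"].dt.hour,
    bins=[-1, 6, 12, 18, 23],
    labels=["0-6", "7-12", "13-18", "19-23"],
)

# observed=False keeps hour ranges that have no rows in the result.
counts = weekdays_only.groupby(["weekday", "hour_range"], observed=False).size()
print(counts)
```

Note that grouping on day names sorts them alphabetically; grouping on the numeric weekday instead keeps them in calendar order.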
