Python: how to groupby a pandas dataframe to count by hour and day?

Question

I have a dataframe like the following:

df.head(4)
    timestamp                  user_id   category
0  2017-09-23 15:00:00+00:00     A        Bar
1  2017-09-14 18:00:00+00:00     B        Restaurant
2  2017-09-30 00:00:00+00:00     B        Museum
3  2017-09-11 17:00:00+00:00     C        Museum

I would like to count, for each hour of each day, the number of visitors in each category, and get a dataframe like the following:

df 
     year month day   hour   category   count
0    2017  9     11    0       Bar       2
1    2017  9     11    1       Bar       1
2    2017  9     11    2       Bar       0
3    2017  9     11    3       Bar       1

Answer 1

Assuming you want to group by date and hour, you can use the following code, provided the timestamp column is a datetime column:

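# Use bracket assignment: df.year = ... sets an attribute, not a new column
df['year'] = df['timestamp'].dt.year
df['month'] = df['timestamp'].dt.month
df['day'] = df['timestamp'].dt.day
df['hour'] = df['timestamp'].dt.hour
# size() counts rows per group; count() would return one count per remaining column
grouped_data = df.groupby(['year','month','day','hour','category']).size().reset_index(name='count')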
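
As a minimal, self-contained sketch of this approach, the snippet below rebuilds the four sample rows from the question and applies the same grouping. It assumes the timestamps parse as UTC datetimes, and it uses assign and size() rather than the exact calls above, so treat it as an illustration and not the answerer's exact code.

```python
import pandas as pd

# The four sample rows shown in the question
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2017-09-23 15:00:00+00:00",
        "2017-09-14 18:00:00+00:00",
        "2017-09-30 00:00:00+00:00",
        "2017-09-11 17:00:00+00:00",
    ], utc=True),
    "user_id": ["A", "B", "B", "C"],
    "category": ["Bar", "Restaurant", "Museum", "Museum"],
})

# Derive the date parts as real columns without mutating df
parts = df.assign(
    year=df["timestamp"].dt.year,
    month=df["timestamp"].dt.month,
    day=df["timestamp"].dt.day,
    hour=df["timestamp"].dt.hour,
)

# One row per (year, month, day, hour, category) with the number of visits
grouped_data = (
    parts.groupby(["year", "month", "day", "hour", "category"])
    .size()
    .reset_index(name="count")
)
print(grouped_data)
```

With only four rows, every group has a count of 1; on the full data the count column holds the number of visits per hour and category.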

Answer 2

To get the count of user_id per hour per category, you can group directly on the components of the datetime column:

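import pandas as pd

# Make sure the column holds datetimes so the dt accessor is available
df['timestamp'] = pd.to_datetime(df['timestamp'])
df_new = df.groupby([df['timestamp'].dt.year,
                     df['timestamp'].dt.month,
                     df['timestamp'].dt.day,
                     df['timestamp'].dt.hour,
                     'category'])['user_id'].count()
# The four datetime levels are all named 'timestamp', so rename them before resetting
df_new.index.names = ['year', 'month', 'day', 'hour', 'category']
df_new = df_new.reset_index(name='count')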

When you have a datetime column in a dataframe, you can use the dt accessor to access individual parts of the datetime, e.g. the year.
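
Note that a plain groupby only returns the hour and category combinations that actually occur, whereas the desired output in the question includes a row with a count of 0. One possible way to add those rows, sketched here under assumptions, is to floor each timestamp to the hour, count, and then reindex against every hour in the observed range. The sample rows and the `slot` column name below are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data: visits to one category on a single day
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2017-09-11 00:00:00+00:00",
        "2017-09-11 00:30:00+00:00",
        "2017-09-11 01:00:00+00:00",
        "2017-09-11 03:00:00+00:00",
    ], utc=True),
    "user_id": ["A", "B", "C", "A"],
    "category": ["Bar", "Bar", "Bar", "Bar"],
})

hour = pd.offsets.Hour()

# Truncate each timestamp to the start of its hour ("slot" is an illustrative name)
df["slot"] = df["timestamp"].dt.floor(hour)
counts = df.groupby(["slot", "category"]).size()

# Build every (hour, category) pair in the observed range, then fill gaps with 0
all_slots = pd.date_range(df["slot"].min(), df["slot"].max(), freq=hour)
full_index = pd.MultiIndex.from_product(
    [all_slots, df["category"].unique()], names=["slot", "category"]
)
counts = counts.reindex(full_index, fill_value=0).reset_index(name="count")

# Split the hourly slot back into the columns requested in the question
counts["year"] = counts["slot"].dt.year
counts["month"] = counts["slot"].dt.month
counts["day"] = counts["slot"].dt.day
counts["hour"] = counts["slot"].dt.hour
result = counts[["year", "month", "day", "hour", "category", "count"]]
print(result)
```

The range here runs from the first to the last observed hour; to cover whole days regardless of the data, the start and end passed to date_range would need to be set explicitly.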
