Counting occurrences by time interval in a pandas DataFrame

Question

I have this simple DataFrame:

 Date and time        Event 
 --------------------------
 2020-03-23 9:05:03    A
 2020-03-23 14:06:02   B
 2020-03-23 9:06:43    B
 2020-03-23 12:11:50   D
 2020-03-23 12:12:38   D
 2020-03-23 12:13:17   B
 2020-03-23 12:14:07   A
 2020-03-23 12:14:54   A
 2020-04-29 10:37:09   A
 2020-04-29 10:39:13   A
 2020-04-29 11:53:33   A
 2020-04-29 12:04:46   C
 2020-04-30 19:15:29   D
 2020-04-30 16:18:04   B

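I want to count the occurrences of each Event within 4-hour time intervals and build a new DataFrame from the counts.

I am trying to get something like this: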

   10:00-14:00  14:00-18:00  18:00-22:00  22:00-02:00
A       2            1            3             0
B       0            1            1             2
C       1            2            1             1
D       0            0            0             2

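I tried aggregating with resample, then extracting the time from the DateTime and applying a count. I also tried different combinations using pd.TimeGrouper(), but none of these seem to work. I don't know how to set up those 4-hour intervals so that I can apply an aggregation.

At this point I have searched all the related posts but could not find a solution.

Any suggestions would be greatly appreciated.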

Answer 1

You can try binning by time of day:

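import pandas as pd

# Parse the timestamps, then bin each row by its hour of day.
df['Date and time'] = pd.to_datetime(df['Date and time'])
bins = [10, 14, 18, 20, 24]
labels = ['10:00-14:00', '14:00-18:00', '18:00-20:00', '20:00-24:00']
# right=False makes each bin left-closed; hours before 10 fall outside every bin and become NaN.
df['TimeBin'] = pd.cut(df['Date and time'].dt.hour, bins, labels=labels, right=False)
# Count the rows for each Event / TimeBin pair.
result = df.pivot_table(index=['Event'], columns=['TimeBin'], aggfunc='count')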
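
The bins above start at 10:00 and stop at midnight, so rows before 10:00 are dropped and a wrap-around interval such as 22:00-02:00 from the question cannot be expressed. One possible variation, offered as a sketch rather than as part of the original answer, shifts the hour so that the first bin opens at 10:00 and then uses integer division. The names START_HOUR and WIDTH are hypothetical, and the sample rows are retyped from the question.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Date and time": pd.to_datetime([
        "2020-03-23 09:05:03", "2020-03-23 14:06:02", "2020-03-23 09:06:43",
        "2020-03-23 12:11:50", "2020-03-23 12:12:38", "2020-03-23 12:13:17",
        "2020-03-23 12:14:07", "2020-03-23 12:14:54", "2020-04-29 10:37:09",
        "2020-04-29 10:39:13", "2020-04-29 11:53:33", "2020-04-29 12:04:46",
        "2020-04-30 19:15:29", "2020-04-30 16:18:04",
    ]),
    "Event": list("ABBDDBAAAAACDB"),
})

START_HOUR = 10  # hypothetical parameter: hour at which the first bin opens
WIDTH = 4        # hypothetical parameter: bin width in hours

# Shift the hour so START_HOUR maps to 0, wrap around midnight, then bucket.
slot = ((df["Date and time"].dt.hour - START_HOUR) % 24) // WIDTH

# Build one label per slot, e.g. "10:00-14:00" ... "22:00-02:00".
labels = [
    f"{(START_HOUR + WIDTH * i) % 24:02d}:00-{(START_HOUR + WIDTH * (i + 1)) % 24:02d}:00"
    for i in range(24 // WIDTH)
]
df["TimeBin"] = slot.map(dict(enumerate(labels)))

# Count Event x TimeBin, then reindex so that empty bins appear as zero columns.
result = pd.crosstab(df["Event"], df["TimeBin"]).reindex(columns=labels, fill_value=0)
print(result)
```

Because the slot is computed from the hour alone, rows from different dates land in the same column, which is what the desired table in the question implies.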

Answer 2

Here is a solution using pandas .groupby(), .explode() and .pivot_table():

>>> import pandas as pd
>>> df = pd.DataFrame([i.strip().split('   ') for i in '''  2020-03-23 9:05:03   A
...  2020-03-23 14:06:02   B
...  2020-03-23 9:06:43   B
...  2020-03-23 12:11:50   D
...  2020-03-23 12:12:38   D
...  2020-03-23 12:13:17   B
...  2020-03-23 12:14:07   A
...  2020-03-23 12:14:54   A
...  2020-04-29 10:37:09   A
...  2020-04-29 10:39:13   A
...  2020-04-29 11:53:33   A
...  2020-04-29 12:04:46   C
...  2020-04-30 19:15:29   D
...  2020-04-30 16:18:04   B '''.split('\n')], columns=['Date and time', 'Event'])
>>> df
          Date and time Event
0    2020-03-23 9:05:03     A
1   2020-03-23 14:06:02     B
2    2020-03-23 9:06:43     B
3   2020-03-23 12:11:50     D
4   2020-03-23 12:12:38     D
5   2020-03-23 12:13:17     B
6   2020-03-23 12:14:07     A
7   2020-03-23 12:14:54     A
8   2020-04-29 10:37:09     A
9   2020-04-29 10:39:13     A
10  2020-04-29 11:53:33     A
11  2020-04-29 12:04:46     C
12  2020-04-30 19:15:29     D
13  2020-04-30 16:18:04     B
>>> # convert Date and time column to datetime type
>>> df['Date and time'] = pd.to_datetime(df['Date and time'])
>>> # groupby based on freq 4H
>>> df = df.groupby(pd.Grouper(key='Date and time', freq='4H')).agg(list).explode('Event')
>>> df = df.reset_index().dropna()
>>> # retrieve time value and convert it to time bins
>>> def time_binning(x):
...     return f'{x.time()} - {(x + pd.offsets.DateOffset(hours=3, minutes=59, seconds=59)).time()}'
...
>>> df['time'] = df['Date and time'].apply(time_binning)
>>> # pivot table
>>> df = df.pivot_table(index='Event', columns='time', aggfunc='count', fill_value=0)['Date and time']
>>> df
time   08:00:00 - 11:59:59  12:00:00 - 15:59:59  16:00:00 - 19:59:59
Event
A                        4                    2                    0
B                        1                    2                    1
C                        0                    1                    0
D                        0                    2                    1
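
The same table can probably be reached without the groupby/explode round trip. The sketch below is an alternative, not the answer's own method: it floors each hour of day to the start of its 4-hour block with integer arithmetic and counts with pd.crosstab. The sample rows are retyped from the question, and the label format is chosen to match the output above.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Date and time": pd.to_datetime([
        "2020-03-23 09:05:03", "2020-03-23 14:06:02", "2020-03-23 09:06:43",
        "2020-03-23 12:11:50", "2020-03-23 12:12:38", "2020-03-23 12:13:17",
        "2020-03-23 12:14:07", "2020-03-23 12:14:54", "2020-04-29 10:37:09",
        "2020-04-29 10:39:13", "2020-04-29 11:53:33", "2020-04-29 12:04:46",
        "2020-04-30 19:15:29", "2020-04-30 16:18:04",
    ]),
    "Event": list("ABBDDBAAAAACDB"),
})

# Floor each hour to the start of its 4-hour block: 9 -> 8, 14 -> 12, 19 -> 16.
start = df["Date and time"].dt.hour // 4 * 4

# Turn the block start into a label like "08:00:00 - 11:59:59".
df["time"] = start.map(lambda h: f"{int(h):02d}:00:00 - {int(h) + 3:02d}:59:59")

# Count how often each Event falls into each block, across all dates.
counts = pd.crosstab(df["Event"], df["time"])
print(counts)
```

Only blocks that actually contain rows show up as columns here; reindexing the columns against a full list of labels would add the empty ones.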

Answer 1: 0, accepted, 2021-05-16 10:54:37

Answer 2: 0, 2021-05-16 11:23:57