How to count event frequency within a time window in pandas

Question

Say I have a DataFrame in pandas that describes sales of fruit products in different stores over a period of time:

    Time_of_sale    Product     Store
05.01.2018 15:37    Apple        1
05.01.2018 13:58    Apple        1
05.01.2018 15:36    Banana       2
05.01.2018 15:33    Banana       3
15.08.2017 19:08    Strawberry   4
15.08.2017 19:04    Blueberry    4
03.09.2017 15:32    Pere         5
03.09.2017 15:31    Pere         6
05.01.2018 15:32    Blueberry    7
05.01.2018 15:27    Banana       2
08.01.2018 09:31    Grapes       1
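
For anyone who wants to reproduce the table, here is a minimal sketch that builds it as a DataFrame. It assumes the timestamps are day-first (DD.MM.YYYY), which is a guess based on "15.08.2017" having no valid month-first reading; the variable names `rows` and `df` are only illustrative.

```python
import pandas as pd

# Sample rows copied from the table above.
rows = [
    ("05.01.2018 15:37", "Apple", 1),
    ("05.01.2018 13:58", "Apple", 1),
    ("05.01.2018 15:36", "Banana", 2),
    ("05.01.2018 15:33", "Banana", 3),
    ("15.08.2017 19:08", "Strawberry", 4),
    ("15.08.2017 19:04", "Blueberry", 4),
    ("03.09.2017 15:32", "Pere", 5),
    ("03.09.2017 15:31", "Pere", 6),
    ("05.01.2018 15:32", "Blueberry", 7),
    ("05.01.2018 15:27", "Banana", 2),
    ("08.01.2018 09:31", "Grapes", 1),
]
df = pd.DataFrame(rows, columns=["Time_of_sale", "Product", "Store"])

# Assumption: day-first format. An explicit format avoids ambiguous
# per-row guessing between day-first and month-first dates.
df["Time_of_sale"] = pd.to_datetime(df["Time_of_sale"], format="%d.%m.%Y %H:%M")
print(df.dtypes)
```

Time-based grouping and resampling need a real datetime column, so this conversion has to happen before any of the approaches discussed here.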

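What I want to add to each row is basically the number of sales of that product in that store within a given time frame (for example, 3 hours).

For example, for the first row:

How many apples were sold in store 1 within 3 hours?

So the result should be added as a new column (which means downsampling is not an option).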

Time_of_sale           Product         Store       Sales_in_TF
05.01.2018 15:37    Apple           1                2
05.01.2018 13:58    Apple           1                2
05.01.2018 15:36    Banana          2                2
05.01.2018 15:33    Banana          3                1
15.08.2017 19:08    Strawberry      4                1
15.08.2017 19:04    Blueberry       4                1
03.09.2017 15:32    Pere            5                1
03.09.2017 15:31    Pere            6                1
05.01.2018 15:32    Blueberry       7                1
05.01.2018 15:27    Banana          2                2
08.01.2018 09:31    Grapes          1                1

I have been looking into

series.resample('3H', label='right').count()

and

df.groupby(pd.Grouper(freq='3H', closed='left'))

but I couldn't really find what I'm looking for.
Maybe you have an idea?

Answer 1

I created a dummy variable that is 1 in every row, so that it can be counted.

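# Assumes df['Time_of_sale'] has already been converted to datetime.
df['Amount'] = 1
# Pass the keys as a list; recent pandas versions treat a tuple as a single key.
groups = df.groupby([
    pd.Grouper(key='Time_of_sale', freq='6H'),
    'Product',
    'Store'
])
groups.count()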

Result (note that the dates here were parsed month-first, so 05.01.2018 appears as 2018-05-01):

                                      Amount
Time_of_sale        Product    Store        
2017-03-09 12:00:00 Pere       5           1
                               6           1
2017-08-15 18:00:00 Blueberry  4           1
                    Strawberry 4           1
2018-05-01 12:00:00 Apple      1           2
                    Banana     2           2
                               3           1
                    Blueberry  7           1
2018-08-01 06:00:00 Grapes     1           1
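
A note on the bin width: the grouping above uses fixed 6-hour bins rather than the 3 hours mentioned in the question. The sketch below is an illustration rather than part of the original answer. It floors two hypothetical timestamps (mirroring the two Apple sales in store 1) to show which bin each lands in. It assumes that, for frequencies that divide a day evenly, `dt.floor` picks the same bin starts as `pd.Grouper`.

```python
import pandas as pd

# Hypothetical timestamps mirroring the two Apple sales in store 1.
times = pd.Series(pd.to_datetime(["2018-05-01 13:58", "2018-05-01 15:37"]))

# Start of the fixed bin each timestamp falls into.
bins_6h = times.dt.floor("6h")  # both land in the 12:00 bin
bins_3h = times.dt.floor("3h")  # 12:00 and 15:00, two different bins

print(bins_6h.nunique())
print(bins_3h.nunique())
```

With 3-hour bins the two sales are split at 15:00 and would each be counted as 1, whereas 6-hour bins keep them together.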

Edit: Oops, I missed that you don't want to downsample. It's not very elegant, but you can do the following:

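dfs = []
# Keys are passed as a list; recent pandas versions treat a tuple as a single key.
keys = [pd.Grouper(key='Time_of_sale', freq='6H'), 'Product', 'Store']
for _, group in df.groupby(keys):
    # assign() returns a copy, which avoids SettingWithCopyWarning
    group = group.assign(Amount=len(group))
    dfs.append(group)
pd.concat(dfs)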

This does not downsample.

       Time_of_sale     Product  Store  Amount
2017-03-09 15:32:00  Pere            5       1
2017-03-09 15:31:00  Pere            6       1
2017-08-15 19:04:00  Blueberry       4       1
2017-08-15 19:08:00  Strawberry      4       1
2018-05-01 15:37:00  Apple           1       2
2018-05-01 13:58:00  Apple           1       2
2018-05-01 15:36:00  Banana          2       2
2018-05-01 15:27:00  Banana          2       2
2018-05-01 15:33:00  Banana          3       1
2018-05-01 15:32:00  Blueberry       7       1
2018-08-01 09:31:00  Grapes          1       1
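
A possible alternative to the loop, offered as a sketch rather than as the answer's own method: floor the timestamps to the bin start and use `transform('size')`, which broadcasts each group's row count back to the original rows and keeps their order. The small DataFrame here is hypothetical, and using `dt.floor('6h')` in place of `pd.Grouper` is an assumption that should hold for bin widths that divide a day evenly.

```python
import pandas as pd

# Hypothetical miniature of the sales table; timestamps are illustrative.
df = pd.DataFrame({
    "Time_of_sale": pd.to_datetime([
        "2018-01-05 15:37", "2018-01-05 13:58",
        "2018-01-05 15:36", "2018-01-05 15:33",
        "2018-01-05 15:27", "2018-01-08 09:31",
    ]),
    "Product": ["Apple", "Apple", "Banana", "Banana", "Banana", "Grapes"],
    "Store": [1, 1, 2, 3, 2, 1],
})

# Start of the fixed 6-hour bin each sale falls into.
time_bin = df["Time_of_sale"].dt.floor("6h")

# Count rows per (bin, product, store) and write the count onto every row.
df["Amount"] = df.groupby([time_bin, "Product", "Store"])["Product"].transform("size")
print(df)
```

Because nothing is concatenated, the rows stay in their original order, which the loop-based version does not guarantee.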

Solution 1
0 2018-03-22 08:03:17
