How to sum up data over specific datetime ranges in pandas?

Question

I have a data frame with more than 1 million values. The task is to sum these values over every 5-minute range. In other words, from 0 to the first 5 minutes, then up to 10 minutes, then 15, and so on. The data spans 30-33 days. This is my data:

                                    Size
                        DateTime                              
2018-10-19 04:14:01.015000+00:00     2
2018-10-19 04:14:01.546000+00:00     1
2018-10-19 04:15:01.290000+00:00     1
2018-10-19 04:15:01.291000+00:00    10
2018-10-19 04:15:01.821000+00:00     1
2018-10-19 04:15:01.821000+00:00     1
2018-10-19 04:15:02.352000+00:00     1
2018-10-19 04:15:02.352000+00:00     1
2018-10-19 04:15:02.883000+00:00     1
2018-10-19 04:15:02.884000+00:00     1
2018-10-19 04:15:03.413000+00:00     1
2018-10-19 04:15:03.414000+00:00     1
2018-10-19 04:15:03.943000+00:00     1
2018-10-19 04:15:03.943000+00:00     1
2018-10-19 04:15:04.474000+00:00     1
2018-10-19 04:15:04.474000+00:00     1
2018-10-19 04:15:05.003000+00:00     1
2018-10-19 04:15:05.003000+00:00     1
2018-10-19 04:15:05.334000+00:00     1
2018-10-19 04:15:05.336000+00:00     1
...
2018-11-26 19:59:33.928000+00:00     1
2018-11-26 19:59:37.221000+00:00     1
2018-11-26 19:59:41.808000+00:00     1
2018-11-26 19:59:42.338000+00:00     1
2018-11-26 19:59:45.520000+00:00     1
2018-11-26 19:59:52.059000+00:00     1
2018-11-26 19:59:52.589000+00:00     1
2018-11-26 19:59:54.714000+00:00     1
2018-11-26 19:59:55.244000+00:00     1
2018-11-26 19:59:56.297000+00:00     1
2018-11-26 19:59:57.888000+00:00     1
2018-11-26 19:59:59.008000+00:00     1
2018-11-26 20:00:00.071000+00:00     1
2018-11-26 20:51:04.606000+00:00     1
2018-11-26 20:51:57.307000+00:00     1

As you can see, there are quite a lot of rows. I have some ideas about how to do it, but I'm stuck. The rows that fall on a 5-minute mark could be selected like this:

data[data.index.minute % 5 == 0]

But how could I sum the values before each of these marks and in the next range?

Answer 1

With resample:

data.resample('5min')['Size'].sum()
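To make the binning concrete, here is a minimal sketch that applies the same call to five of the rows shown in the question. The variable names `index` and `sums` are only for illustration, and the sample is deliberately tiny.

```python
import pandas as pd

# Five rows copied from the question's sample; timestamps are UTC.
index = pd.to_datetime(
    [
        "2018-10-19 04:14:01.015000+00:00",
        "2018-10-19 04:14:01.546000+00:00",
        "2018-10-19 04:15:01.290000+00:00",
        "2018-10-19 04:15:01.291000+00:00",
        "2018-10-19 04:15:01.821000+00:00",
    ],
    utc=True,
)
data = pd.DataFrame({"Size": [2, 1, 1, 10, 1]}, index=index)
data.index.name = "DateTime"

# Each bin is closed on the left and labelled by its start:
# [04:10, 04:15), [04:15, 04:20), ...
sums = data.resample("5min")["Size"].sum()
print(sums)
```

With this sample the 04:10 bin sums to 3 and the 04:15 bin sums to 12. Note that resample produces every 5-minute bin between the first and last timestamp, so stretches with no rows (such as the gap between 20:00 and 20:51 in the question's data) show up with a sum of 0.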

Answer 2

Use pd.Grouper() here with freq='5min'.

Note: I only used the top rows of your example data, above the "...". Also note that key='DateTime' expects DateTime to be a regular column, not the index.

df_sum = df.groupby(pd.Grouper(key='DateTime', freq='5min')).Size.sum().reset_index()

print(df_sum)

             DateTime  Size
0 2018-10-19 04:10:00     3
1 2018-10-19 04:15:00    27
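Since the question has DateTime as the index while key='DateTime' looks for a column, the following sketch shows both variants side by side on five rows from the question. The names `by_column`, `indexed`, and `by_index` are hypothetical, and this is one possible way to handle the two layouts rather than the answerer's exact setup.

```python
import pandas as pd

# Five rows copied from the question's sample; timestamps are UTC.
times = pd.to_datetime(
    [
        "2018-10-19 04:14:01.015000+00:00",
        "2018-10-19 04:14:01.546000+00:00",
        "2018-10-19 04:15:01.290000+00:00",
        "2018-10-19 04:15:01.291000+00:00",
        "2018-10-19 04:15:01.821000+00:00",
    ],
    utc=True,
)

# Variant 1: DateTime is a regular column, so Grouper needs key=.
df = pd.DataFrame({"DateTime": times, "Size": [2, 1, 1, 10, 1]})
by_column = (
    df.groupby(pd.Grouper(key="DateTime", freq="5min"))["Size"]
    .sum()
    .reset_index()
)

# Variant 2: DateTime is the index (as in the question), so Grouper
# without key= groups on the index itself.
indexed = df.set_index("DateTime")
by_index = indexed.groupby(pd.Grouper(freq="5min"))["Size"].sum()

print(by_column)
print(by_index)
```

Both variants give the same sums per 5-minute bin; the first returns a DataFrame with DateTime and Size columns, the second a Series indexed by the bin start.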


Answer 1
1 accepted 2019-03-14 18:44:42

Answer 2
1 2019-03-14 19:10:52
