How to calculate the maximum 15-min sum from a Pandas Series or DataFrame

Pandas newbie here. I have a dataset which contains traffic counts with time stamps. I want to know which 15-min interval has the largest total count, and the value of that sum.

Data might look something like this:

import random
import pandas as pd
idx = random.sample(list(pd.date_range('2015-02-01 06:00:00', periods=3000, freq='1min')), 1000)
ts = pd.Series(range(1000), index=idx).sort_index()

2015-02-01 06:06:00    314
2015-02-01 06:08:00    154
2015-02-01 06:09:00    914
2015-02-01 06:13:00     84
2015-02-01 06:18:00    880
2015-02-01 06:22:00    912
2015-02-01 06:28:00    410
2015-02-01 06:32:00    391
2015-02-01 06:34:00    270
2015-02-01 06:35:00    984
2015-02-01 06:36:00    271
2015-02-01 06:37:00    722
2015-02-01 06:38:00    748
2015-02-01 06:40:00    313
2015-02-01 06:42:00    277
2015-02-01 06:43:00    604
2015-02-01 06:49:00    888
2015-02-01 06:50:00    943
2015-02-01 06:51:00    124
2015-02-01 06:52:00    806

Is there a way to do this in Pandas?

A simple solution without using native pandas functions:

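from datetime import timedelta

start = ts.index[0]
end = ts.index[-1]
dur = timedelta(minutes=15)
max_val = 0
max_seg = None
# Step through consecutive 15-minute windows, beginning at the first timestamp.
while start < end:
    # Label-based slicing on a DatetimeIndex includes both endpoints.
    cum_sum = ts[start : start + dur].sum()
    if cum_sum > max_val:
        max_val = cum_sum
        max_seg = (start, start + dur)
    start = start + dur
print(max_val)
print(max_seg)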
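
The loop above steps through back-to-back 15-minute windows. As a rough sketch of the same idea with built-in pandas tools, `resample` can group a Series with a DatetimeIndex into fixed 15-minute bins and `idxmax` can pick the largest one. The function name `max_fixed_15min_bin` and the sample values are made up for illustration. Note that, by default, the bins here are aligned to the clock (06:00, 06:15, ...) rather than to the first timestamp, so the result may differ from the loop.

```python
import pandas as pd

def max_fixed_15min_bin(ts):
    """Return (bin_start, total) for the fixed 15-minute bin with the largest sum.

    Hypothetical helper; assumes `ts` is a numeric Series with a DatetimeIndex.
    """
    # Bins are clock-aligned and include their left edge, e.g. [06:15, 06:30).
    sums = ts.resample("15min").sum()
    start = sums.idxmax()
    return start, sums.loc[start]

# Small made-up series for illustration only.
idx = pd.to_datetime([
    "2015-02-01 06:06", "2015-02-01 06:08", "2015-02-01 06:13",
    "2015-02-01 06:18", "2015-02-01 06:22", "2015-02-01 06:35",
])
ts = pd.Series([314, 154, 84, 880, 912, 984], index=idx)

start, total = max_fixed_15min_bin(ts)
print(start, total)
```

Fixed bins only consider intervals that begin on a bin boundary, so a busier 15-minute stretch that straddles two bins would be missed.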

This is what I came up with:

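def find_peak_15_minutes(data_frame, column):
    """Find the 15-unit window, starting at an observed value, that holds the most rows.

    Appears to assume `column` holds numeric times (e.g. minutes), one row per event.
    """
    max_sum = 0
    start_of_max15 = 0
    for start in data_frame[column].values:
        # Count the rows whose value lies in [start, start + 15], both ends inclusive.
        series_sum = data_frame[column][data_frame[column].between(start, start + 15)].count()
        if series_sum > max_sum:
            max_sum = series_sum
            start_of_max15 = start
    return (start_of_max15, max_sum)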
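
The function above tries a window anchored at every observation. For a Series of counts with a sorted DatetimeIndex, like the one in the question, a time-based rolling window might express a similar idea without an explicit loop. This is only a sketch: the name `max_sliding_15min` and the sample values are made up, and it sums the counts rather than counting rows. With an offset such as `"15min"`, each rolling window ends at an observation `t` and covers `(t - 15min, t]`.

```python
import pandas as pd

def max_sliding_15min(ts):
    """Return (window_start, window_end, total) for the best 15-minute sliding window.

    Hypothetical helper; assumes `ts` is a numeric Series whose DatetimeIndex
    is sorted in increasing order.
    """
    # One window per observation: it ends at that timestamp (inclusive)
    # and reaches back 15 minutes (exclusive).
    sums = ts.rolling("15min").sum()
    end = sums.idxmax()
    return end - pd.Timedelta(minutes=15), end, sums.loc[end]

# Small made-up series for illustration only.
idx = pd.to_datetime([
    "2015-02-01 06:06", "2015-02-01 06:08", "2015-02-01 06:13",
    "2015-02-01 06:18", "2015-02-01 06:22", "2015-02-01 06:35",
])
ts = pd.Series([314, 154, 84, 880, 912, 984], index=idx)

win_start, win_end, total = max_sliding_15min(ts)
print(win_start, win_end, total)
```

Because the window slides with the data instead of stepping in fixed blocks, it can find a busy stretch that straddles two fixed 15-minute blocks.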

