
Calculate the total time that a value goes over the limit with Pandas

Given a datetime column, a value column, and a limit, I have to calculate the total amount of time the value spends over the limit. Sample data looks like this (let's say the limit is 30):

Time                    Value
2018-01-03 12:54:23     23
2018-01-03 12:58:46     31
2018-01-03 13:02:12     32
2018-01-03 13:04:13     24
2018-01-03 13:07:01     28

My idea is to first use the shift function to calculate the time difference between each timestamp and the previous one, then use a for loop to go over the values. If the previous and current values are both over the limit, the total time is increased by that time difference.

temp["TimeDifference"] = (temp.Time -temp.Time.shift(1)).fillna(pd.Timedelta(seconds=0))

total_time = pd.Timedelta(seconds=0)

for i in range(1, temp.shape[0]):
    if (temp.loc[i - 1].Value > upper_limit) and (temp.loc[i].Value > upper_limit):
        total_time = total_time + temp.loc[i].TimeDifference

It works, but the run time is really long and I know this algorithm isn't efficient. Can someone give me some advice? Thanks.

You use the shift function for the time differences, but not for the value comparison. There you hard-coded the i - 1 offset and wrote your own loop to walk through the rows.

Instead, do something like this to use the built-in vectorization:

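over = temp.Value > 30  # True where the reading is above the limit
total_time = (temp.Time - temp.Time.shift(1))[over & over.shift(1, fill_value=False)].sum()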

... but I need someone to check my syntax; this is off-the-top-of-my-head coding between urgent tasks.
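For anyone who wants to try the vectorized idea end to end, here is a minimal sketch built on the sample data from the question. It keeps the asker's rule (a gap counts only when both the previous and the current value are over the limit). The helper name `time_over_limit` is made up for illustration, and passing `fill_value=False` to `shift` is my choice for treating the first row as "no previous reading"; neither comes from the original posts.

```python
import pandas as pd

# Sample data from the question; the limit of 30 is the one stated there.
temp = pd.DataFrame({
    "Time": pd.to_datetime([
        "2018-01-03 12:54:23",
        "2018-01-03 12:58:46",
        "2018-01-03 13:02:12",
        "2018-01-03 13:04:13",
        "2018-01-03 13:07:01",
    ]),
    "Value": [23, 31, 32, 24, 28],
})
upper_limit = 30


def time_over_limit(df, limit):
    """Sum the gaps whose start and end readings both exceed the limit.

    Hypothetical helper, not part of pandas.
    """
    over = df["Value"] > limit                   # current reading over the limit
    prev_over = over.shift(1, fill_value=False)  # previous reading over the limit
    gaps = df["Time"].diff()                     # time since the previous row
    return gaps[over & prev_over].sum()


total_time = time_over_limit(temp, upper_limit)
print(total_time)  # 0 days 00:03:26
```

Only the gap from 12:58:46 to 13:02:12 has readings over 30 at both ends, so the total is 3 minutes 26 seconds, the same result the original loop gives. Whether the gap leading into or out of an over-limit stretch should also count depends on how you interpret the readings; this sketch just mirrors the rule in the question.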
