How to calculate difference between times without using datetime?

I'm trying to calculate the difference between two times. I'm struggling with datetime because I only have a time value, not a date value. When I write something like t1 = datetime.strptime(item[1], '%H:%M:%S') with, for example, item[1] = "00:01:34" and then print t1, I get something like t1: 1900-01-01 00:01:34.

When I calculate timedelta = t2 - t1 and try to print the result, nothing is printed because the program goes into a loop. I already tried using time / .time(), but that gives me TypeError: unsupported operand type(s) for -: 'builtin_function_or_method' and 'builtin_function_or_method'. If I hover over timedelta = (t2 - t1).seconds, the editor reports Class 'time' does not define '__sub__', so the '-' operator cannot be used on its instances, and if I run the code I get the TypeError even when I use .time().

For clarity, my code is this:

from datetime import datetime

listaDivisa = [
['2010-01-05', '12:32:05', 'at the kitchen entrance from the dining room', 'ON'],
['2010-01-05', '12:32:05', 'in the kitchen', 'ON'],
['2010-01-05', '12:32:08', 'in the living room', 'ON'],
['2010-01-05', '12:32:08', 'in the kitchen', 'OFF'],
['2010-01-05', '12:32:10', 'at the kitchen entrance from the dining room', 'OFF'],
['2010-01-05', '12:32:10', 'in the kitchen', 'ON'],
['2010-01-05', '12:32:11', 'in the kitchen', 'OFF'],
['2010-01-05', '12:32:11', 'in the living room', 'OFF'],
['2010-01-06', '02:32:11', 'in the kitchen', 'ON'],
['2010-01-06', '02:32:20', 'in the kitchen', 'OFF'],
['2010-01-06', '02:34:23', 'in the living room', 'ON'],
['2010-01-06', '02:34:42', 'in the living room', 'OFF']]
# the real list contains approximately 3000 of these activities; only
# a few are shown here as an example

listaDict = {}

for p in listaDivisa:
    if p[2] not in listaDict.keys():
        listaDict[p[2]] = dict()

for i, item in enumerate(listaDivisa):
    for j in range(i + 1, len(listaDivisa) - 1):
        if item[0] == listaDivisa[j][0]:
            if item[2] == listaDivisa[j][2]:
                if item[3] == "ON" and listaDivisa[j][3] == "OFF":
                    t1 = datetime.strptime(item[1], '%H:%M:%S')
                    t2 = datetime.strptime(listaDivisa[j][1], '%H:%M:%S')
                    timedelta = (t2 - t1).seconds

                    listaDict[item[2]][item[0]] = "active for " + str(timedelta) + " seconds"

for key, value in listaDict.items():
    print(key, ' : ', value)

The result I'm trying to achieve is a nested dictionary like this:

in the kitchen: {'2009-10-16': 'active for 2341 seconds', '2009-10-17': 'active for 0 seconds' ..... }
in the living room: {'2009-10-16': 'active for 20 seconds', '2009-10-17': 'active for 3 seconds' ..... }
at the kitchen entrance from the dining room: {'2009-10-16': 'active for 6473 seconds', '2009-10-17': 'active for 374 seconds' ... }
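
A minimal sketch of the behaviour described in the question, assuming time strings in the same '%H:%M:%S' form (the second value, "00:03:04", is made up for illustration). strptime fills the missing date with 1900-01-01, so two parsed values can still be subtracted to get a timedelta; it is the time objects returned by .time() that do not support subtraction. The message quoted in the question mentions 'builtin_function_or_method', which looks like what Python raises when .time is referenced without parentheses, though that is only a guess about the original code:

```python
from datetime import datetime

t1 = datetime.strptime("00:01:34", "%H:%M:%S")
t2 = datetime.strptime("00:03:04", "%H:%M:%S")  # hypothetical second reading

# Both values carry the same placeholder date (1900-01-01), so it cancels out.
elapsed = (t2 - t1).total_seconds()
print(elapsed)  # 90.0

# datetime.time objects cannot be subtracted from each other.
try:
    t2.time() - t1.time()
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Using total_seconds() rather than .seconds also keeps the value meaningful if a difference ever exceeds one day or turns out negative.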

Here's a solution to your problem using pandas. I trust other users will find ways to refactor some of the code below, but I believe it does the job. I've included comments to explain what is happening at each stage. Let me know if anything is unclear, or if the result is not what you intended.

import pandas as pd

listaDivisa = [
['2010-01-05', '12:32:05', 'at the kitchen entrance from the dining room', 'ON'],
['2010-01-05', '12:32:05', 'in the kitchen', 'ON'],
['2010-01-05', '12:32:08', 'in the living room', 'ON'],
['2010-01-05', '12:32:08', 'in the kitchen', 'OFF'],
['2010-01-05', '12:32:10', 'at the kitchen entrance from the dining room', 'OFF'],
['2010-01-05', '12:32:10', 'in the kitchen', 'ON'],
['2010-01-05', '12:32:11', 'in the kitchen', 'OFF'],
['2010-01-05', '12:32:11', 'in the living room', 'OFF'],
['2010-01-06', '02:32:11', 'in the kitchen', 'ON'],
['2010-01-06', '02:32:20', 'in the kitchen', 'OFF'],
['2010-01-06', '02:34:23', 'in the living room', 'ON'],
['2010-01-06', '02:34:42', 'in the living room', 'OFF']]

# 1) list of lists to df
df = pd.DataFrame(listaDivisa, columns=['Date','Time','Activity','Status'])

# 2) add datetime col
df['Datetime'] = pd.to_datetime(df.Date + ' ' + df.Time)

# 3) groupby date & activity, calc diff between datetime vals (= timedeltas) 
# and convert those timedeltas to seconds
df['Timedelta'] = df.groupby(['Date','Activity'])['Datetime'].transform('diff').dt.total_seconds()

# 4) create new df just on df.Status == 'OFF', these will be the correct timedeltas
new_df = df.loc[df.Status=='OFF',['Date','Activity','Timedelta']]

# =============================================================================
# new_df at this stage:
# 
#           Date                                      Activity  Timedelta
# 3   2010-01-05                                in the kitchen        3.0
# 4   2010-01-05  at the kitchen entrance from the dining room        5.0
# 6   2010-01-05                                in the kitchen        1.0
# 7   2010-01-05                            in the living room        3.0
# 9   2010-01-06                                in the kitchen        9.0
# 11  2010-01-06                            in the living room       19.0
# =============================================================================

# 5) now groupby to get sum per activity & date and reset the index 
# e.g. on '2010-01-05' we have 2 timedeltas for 'in the kitchen': 3 sec + 1 sec = 4

new_df = new_df.groupby(['Activity','Date'])['Timedelta'].sum().reset_index()
  
# =============================================================================
# e.g. new_df now looks like:
#                                        Activity        Date  Timedelta
# 0  at the kitchen entrance from the dining room  2010-01-05        5.0
# 1                                in the kitchen  2010-01-05        4.0
# 2                                in the kitchen  2010-01-06        9.0
# 3                            in the living room  2010-01-05        3.0
# 4                            in the living room  2010-01-06       19.0
# =============================================================================

# 6) turn df.Timedelta into strings
new_df['Timedelta'] = new_df['Timedelta'].apply(lambda x: f'active for {int(x)} seconds')

# 7) set df.Activity as index
new_df.set_index('Activity', drop=True, inplace=True)

# 8) define a dictionary and iter over rows to populate it
a_dict = {}

for activity, data in new_df.iterrows():
    # if the activity already exists in the dict, *add* a date:duration pair
    if activity in a_dict:
        a_dict[activity].update({data['Date']: data['Timedelta']})
    # else, create a new key and set its initial value to the date:duration pair
    else:
        a_dict[activity] = {data['Date']: data['Timedelta']}
    
a_dict

{'at the kitchen entrance from the dining room': {'2010-01-05': 'active for 5 seconds'},
 'in the kitchen': {'2010-01-05': 'active for 4 seconds',
  '2010-01-06': 'active for 9 seconds'},
 'in the living room': {'2010-01-05': 'active for 3 seconds',
  '2010-01-06': 'active for 19 seconds'}}

So, now you can do something like:

a_dict['in the kitchen']['2010-01-05']
'active for 4 seconds'

# As mentioned: we have 2 entries for 'in the kitchen' on '2010-01-05',
# summed to 4 seconds
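
For comparison, here is a rough sketch of the same calculation using only the standard library. It is not part of the pandas answer above: the helper name active_seconds and the variable rows are made up for illustration, and it assumes the rows are in chronological order and that every ON is closed by an OFF on the same date. Combining the date and time strings before parsing gives real datetime values, so the subtraction works directly:

```python
from collections import defaultdict
from datetime import datetime

rows = [
    ['2010-01-05', '12:32:05', 'at the kitchen entrance from the dining room', 'ON'],
    ['2010-01-05', '12:32:05', 'in the kitchen', 'ON'],
    ['2010-01-05', '12:32:08', 'in the living room', 'ON'],
    ['2010-01-05', '12:32:08', 'in the kitchen', 'OFF'],
    ['2010-01-05', '12:32:10', 'at the kitchen entrance from the dining room', 'OFF'],
    ['2010-01-05', '12:32:10', 'in the kitchen', 'ON'],
    ['2010-01-05', '12:32:11', 'in the kitchen', 'OFF'],
    ['2010-01-05', '12:32:11', 'in the living room', 'OFF'],
    ['2010-01-06', '02:32:11', 'in the kitchen', 'ON'],
    ['2010-01-06', '02:32:20', 'in the kitchen', 'OFF'],
    ['2010-01-06', '02:34:23', 'in the living room', 'ON'],
    ['2010-01-06', '02:34:42', 'in the living room', 'OFF'],
]


def active_seconds(rows):
    """Sum ON -> OFF durations per activity and date (hypothetical helper)."""
    started = {}  # (date, activity) -> datetime of the last unmatched ON
    totals = defaultdict(lambda: defaultdict(float))
    for date, time_str, activity, status in rows:
        stamp = datetime.strptime(f"{date} {time_str}", "%Y-%m-%d %H:%M:%S")
        key = (date, activity)
        if status == "ON":
            started[key] = stamp
        elif key in started:
            # OFF closes the pending ON for the same date and activity
            totals[activity][date] += (stamp - started.pop(key)).total_seconds()
    return {
        activity: {date: f"active for {int(secs)} seconds"
                   for date, secs in per_date.items()}
        for activity, per_date in totals.items()
    }


result = active_seconds(rows)
print(result['in the kitchen']['2010-01-05'])  # active for 4 seconds
```

This makes a single pass over the list instead of comparing every pair of rows, which should matter for a list of around 3000 entries.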


 