
Using for and if statements in a pandas DataFrame to calculate duration

I am new to pandas and Python, and your answers are highly appreciated. I have three columns in a data frame with the following values, shown by print(df):

        name       date&time             Id
     1  Start      2021-01-1 17:15:56    Bike1
     2  Pause      2021-01-1 17:17:57    Bike1
     3  Resume     2021-01-1 17:18:50    Bike1
     4  Progress   2021-01-1 17:19:58    Bike1
     5  Stop       2021-01-1 17:20:00    Bike1
     6  Start      2021-01-1 17:25:56    Bike2
     7  Pause      2021-01-1 17:27:57    Bike2
     8  Resume     2021-01-1 17:28:50    Bike2
     9  Progress   2021-01-1 17:29:58    Bike2
    10  Stop       2021-01-1 17:30:00    Bike2

I am trying to get the total time spent for each Id, excluding the time between Pause and Resume.

 total = 0
 for each row:
     if name == 'Start':
         a = date&time of that row    (stored in a temp variable)
     if name == 'Pause':
         b = date&time of that row
         c = b - a                    (time from Start to Pause)
     if name == 'Resume':
         d = date&time of that row
     if name == 'Stop':
         e = date&time of that row
         f = e - d                    (time from Resume to Stop)
         total = total + (c + f)

I thought of this pseudocode. Can anyone help me with this? Thanks in advance.

The beauty of pandas DataFrames is that you can operate on whole columns and rows without iterating over the rows. You can solve this problem in very few lines of code:

First, make sure date&time is of type datetime:

import pandas as pd
df['date&time'] = pd.to_datetime(df['date&time'])
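To try this step in isolation, here is a minimal sketch that rebuilds the Bike1 rows from the question and checks the conversion. The way `df` is constructed is an assumption, since the question only shows the printed frame.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Bike1 rows copied from the question; how the frame is built is an assumption.
df = pd.DataFrame({
    "name": ["Start", "Pause", "Resume", "Progress", "Stop"],
    "date&time": ["2021-01-1 17:15:56", "2021-01-1 17:17:57",
                  "2021-01-1 17:18:50", "2021-01-1 17:19:58",
                  "2021-01-1 17:20:00"],
    "Id": ["Bike1"] * 5,
})

# Before conversion the column holds plain strings, so subtraction would fail.
print(is_datetime64_any_dtype(df["date&time"]))  # False

df["date&time"] = pd.to_datetime(df["date&time"])

# After conversion the values are timestamps and can be subtracted.
print(is_datetime64_any_dtype(df["date&time"]))  # True
```

Once the column is a datetime type, subtracting two of its values yields a `Timedelta`, which is what the later arithmetic relies on.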

Then pivot so that each event's time becomes a column, with one row per bike:

df1 = df.pivot(index='Id', columns='name', values='date&time')

Finally, simply do the math:

# total elapsed time minus the paused interval
df1['totaltime'] = df1['Stop'] - df1['Start'] - (df1['Resume'] - df1['Pause'])

Output:

name   Pause                Progress             Resume               Start                Stop                 totaltime
Id
Bike1  2021-01-01 17:17:57  2021-01-01 17:19:58  2021-01-01 17:18:50  2021-01-01 17:15:56  2021-01-01 17:20:00  0 days 00:03:11
Bike2  2021-01-01 17:27:57  2021-01-01 17:29:58  2021-01-01 17:28:50  2021-01-01 17:25:56  2021-01-01 17:30:00  0 days 00:03:11
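The pivot approach assumes each bike has exactly one Pause and one Resume. If a bike can be paused more than once, `pivot` raises a `ValueError` because the same Id/name pair appears several times. One possible alternative, sketched below, still avoids row iteration: it sums the gaps that begin with Start or Resume. The Bike1 rows with two pauses are hypothetical data made up for illustration; Bike2 is taken from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Start", "Pause", "Resume", "Pause", "Resume", "Stop",
             "Start", "Pause", "Resume", "Progress", "Stop"],
    "date&time": pd.to_datetime([
        # Hypothetical Bike1 trip with two pause/resume cycles.
        "2021-01-01 17:15:56", "2021-01-01 17:17:57", "2021-01-01 17:18:50",
        "2021-01-01 17:19:00", "2021-01-01 17:19:30", "2021-01-01 17:20:00",
        # Bike2 rows as given in the question.
        "2021-01-01 17:25:56", "2021-01-01 17:27:57", "2021-01-01 17:28:50",
        "2021-01-01 17:29:58", "2021-01-01 17:30:00",
    ]),
    "Id": ["Bike1"] * 6 + ["Bike2"] * 5,
})

# Keep only events that change the running/paused state (drops Progress),
# ordered by time within each bike.
events = df[df["name"].isin(["Start", "Pause", "Resume", "Stop"])]
events = events.sort_values(["Id", "date&time"])

# Time from each event to the next event of the same bike.
gap = events.groupby("Id")["date&time"].shift(-1) - events["date&time"]

# A gap counts as active time only when it begins with Start or Resume.
active = gap.where(events["name"].isin(["Start", "Resume"]))
totaltime = active.groupby(events["Id"]).sum()
print(totaltime)
```

For Bike2 this gives the same 0 days 00:03:11 as the pivot version. For the hypothetical Bike1 it gives 0 days 00:02:41, which is the 4:04 trip minus the two pauses of 0:53 and 0:30.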

