Iterating over unique dates in a Pandas dataframe

Question

I have a pandas dataframe like this:

id        date      time    dif
01  2020-04-02  09:44:00
02  2020-04-02  09:50:23
03  2020-04-02  09:54:56
04  2020-04-03  10:24:42
05  2020-04-03  10:32:12
06  2020-04-03  11:12:21
...

What I'm trying to do is calculate the time difference, in minutes, between each row and the next one within the same day. So the result should look like this:

id        date      time    dif
01  2020-04-02  09:44:00      6
02  2020-04-02  09:50:23      4
03  2020-04-02  09:54:56
04  2020-04-03  10:24:42      7
05  2020-04-03  10:32:12     40
06  2020-04-03  11:12:21
...

My first thought was to create a list with the unique values of the date column, and I tried this:

import pandas as pd
import numpy as np

...

dates = df.date.unique()

for d in dates:
  df['dif'] = round(df['time'].diff(-1).dt.total_seconds().div(60),0) * -1

But I think it isn't that easy...

Answer 1

Use DataFrameGroupBy.diff with Series.dt.total_seconds and Series.round. Note that diff(-1) computes the current row minus the next one, so the result is multiplied by -1 to get a positive difference:

df['time'] = pd.to_timedelta(df['time'])

df['dif'] = df.groupby('date')['time'].diff(-1).dt.total_seconds().div(60).round().mul(-1)

Or use DataFrameGroupBy.shift and subtract the original column:

df['dif'] = (df.groupby('date')['time'].shift(-1)
               .sub(df['time'])
               .dt.total_seconds()
               .div(60)
               .round())
print(df)
   id        date     time   dif
0   1  2020-04-02 09:44:00   6.0
1   2  2020-04-02 09:50:23   5.0
2   3  2020-04-02 09:54:56   NaN
3   4  2020-04-03 10:24:42   8.0
4   5  2020-04-03 10:32:12  40.0
5   6  2020-04-03 11:12:21   NaN
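
For anyone who wants to run the approach above end to end, here is a minimal self-contained sketch. It rebuilds the sample rows from the question (the `time` column is assumed to start out as `HH:MM:SS` strings) and uses the shift variant. The `dif_floor` column is not part of the original answer: it is an illustrative assumption that the values in the question's expected table (4 and 7) are whole elapsed minutes, that is, truncated rather than rounded.

```python
import pandas as pd

# Sample data copied from the question; 'time' is assumed to be stored as strings.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "date": ["2020-04-02"] * 3 + ["2020-04-03"] * 3,
    "time": ["09:44:00", "09:50:23", "09:54:56",
             "10:24:42", "10:32:12", "11:12:21"],
})

# Convert the clock strings to timedeltas so they can be subtracted.
df["time"] = pd.to_timedelta(df["time"])

# Within each date, fetch the next row's time; the last row of a day gets NaT.
next_time = df.groupby("date")["time"].shift(-1)
gap_seconds = (next_time - df["time"]).dt.total_seconds()

# Rounded minutes, as in the answer (7.5 rounds to 8 because of half-to-even rounding).
df["dif"] = gap_seconds.div(60).round()

# Hypothetical variant: whole minutes elapsed (truncation), matching the question's table.
df["dif_floor"] = gap_seconds.floordiv(60)

print(df)
```

No explicit loop over `df.date.unique()` is needed: the groupby keeps each day's rows separate, so the last row of a day is never compared with the first row of the next day.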
