How to group text into one row and calculate the time duration in Python pandas?

Question

I have a dataframe like this:

ID    time       text
1   8:43:43 PM   one day
1   8:43:51 PM   this code
1   8:44:07 PM   will help
1   8:44:17 PM   someone.
2   8:45:56 AM   yes
2   8:46:09 AM   I'm feeling
2   8:46:25 AM   good.

I want to group the time column by ID and calculate the time duration. I know we can use join to concatenate the text while grouping by each ID.

The final output should be:

ID   time-duration    text
1    34        one day this code will help someone.
2    29        yes I'm feeling good.

Answer 1

Use GroupBy.agg with named aggregation (available from pandas >= 0.25.0).

The advantage of named aggregation is that we aggregate and rename the column in a single step; see time_duration in the output.

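import pandas as pd

# Parse the time strings so they can be compared and subtracted
df['time'] = pd.to_datetime(df['time'])

dfg = df.groupby('ID').agg(
    time_duration=('time', lambda x: x.max()-x.min()),  # latest minus earliest time per ID
    text=('text', ' '.join)                              # join the text fragments with spaces
).reset_index()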

   ID time_duration                                  text
0   1      00:00:34  one day this code will help someone.
1   2      00:00:29                 yes I'm feeling good.
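
The question asks for the duration as a plain number of seconds (34 and 29) rather than a Timedelta. Here is a minimal sketch of that extra step, assuming the sample data is rebuilt by hand from the question. The explicit `%I:%M:%S %p` format string and the cast to `int` are my own assumptions and not part of the answer above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "time": ["8:43:43 PM", "8:43:51 PM", "8:44:07 PM", "8:44:17 PM",
             "8:45:56 AM", "8:46:09 AM", "8:46:25 AM"],
    "text": ["one day", "this code", "will help", "someone.",
             "yes", "I'm feeling", "good."],
})

# Assumption: every value is a 12-hour clock time such as "8:43:43 PM".
df["time"] = pd.to_datetime(df["time"], format="%I:%M:%S %p")

dfg = df.groupby("ID").agg(
    time_duration=("time", lambda x: x.max() - x.min()),
    text=("text", " ".join),
).reset_index()

# Convert the per-group time span to whole seconds.
# pd.to_timedelta is applied first so the step does not depend on
# the dtype that agg happened to infer for the column.
dfg["time_duration"] = (
    pd.to_timedelta(dfg["time_duration"]).dt.total_seconds().astype(int)
)

print(dfg)
```

Because only the time of day is given, the parsed timestamps all share the same placeholder date, which is fine here since each group stays within a single day.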

Answer 2

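We can use np.ptp ("peak to peak", i.e. maximum minus minimum) as the aggregation for the time column, assuming it has already been converted with pd.to_datetime:

import numpy as np
df.groupby('ID').agg({'time':np.ptp,'text':' '.join})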
Out[49]:  
       time                                  text
ID                                               
1  00:00:34  one day this code will help someone.
2  00:00:29                 yes I'm feeling good.
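
For readers who have not met `np.ptp` before: it returns the maximum of its input minus the minimum, which is the same quantity the lambda in Answer 1 computes. A small illustration on plain integers, assuming the four times of ID 1 have been converted by hand to seconds since midnight (that conversion is mine, for illustration only):

```python
import numpy as np

# 8:43:43 PM, 8:43:51 PM, 8:44:07 PM and 8:44:17 PM as seconds since midnight
# (hand-computed for illustration; not part of the original answer).
seconds = np.array([74623, 74631, 74647, 74657])

span = np.ptp(seconds)                 # peak to peak
same = seconds.max() - seconds.min()   # the equivalent max - min

print(span, same)
```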

Answer 3

Groupby and aggregation:

(df.groupby('ID', as_index=False)
   .agg({'time': lambda x: (x.max() - x.min()).total_seconds(),
         'text': ' '.join})
)

Output:

   ID  time                                  text
0   1  34.0  one day this code will help someone.
1   2  29.0                 yes I'm feeling good.
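
The same numbers can also be reached without a lambda, by taking the per-group maximum and minimum separately and subtracting them. This is a sketch of an alternative, not the answerer's code. The sample frame is rebuilt from the question, and the explicit time format is an assumption.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2],
    "time": ["8:43:43 PM", "8:43:51 PM", "8:44:07 PM", "8:44:17 PM",
             "8:45:56 AM", "8:46:09 AM", "8:46:25 AM"],
    "text": ["one day", "this code", "will help", "someone.",
             "yes", "I'm feeling", "good."],
})

# Assumption: every value is a 12-hour clock time such as "8:43:43 PM".
df["time"] = pd.to_datetime(df["time"], format="%I:%M:%S %p")

grouped = df.groupby("ID")

# Built-in max/min per group, then the difference expressed in seconds.
duration = (grouped["time"].max() - grouped["time"].min()).dt.total_seconds()

result = duration.to_frame("time")
result["text"] = grouped["text"].agg(" ".join)  # aligned on the ID index
result = result.reset_index()

print(result)
```

The design choice here is only about style: the built-in `max` and `min` reductions avoid calling a Python function once per group, which may matter on large frames.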

3 answers

Answer 1
3 2019-11-05 20:40:47

Answer 2
3 accepted 2019-11-05 20:41:41

Answer 3
3 2019-11-05 20:41:59
