Group by Timedelta in Python Pandas

Question

I am attempting to do a group by in Python. I have a data frame with two columns, Name and Time Difference. Time Difference is a timedelta variable with values such as -1 days 14:00:00, 0 days 00:08:00, etc. Name has duplicates in it; it looks like Brad, Amy, Brad, Brad, Bill, Amy, ... What I want to do is find the mean of Time Difference by Name. Time Difference also contains NA values.

I have tried:

data_frame['NewMean'] = data_frame['TimeDifference'].values.astype(np.int64)

means = data_frame.groupby(data_frame['Name']).mean()

means['NewMean'] = pd.to_timedelta(means['NewMean'])

But I keep getting the error: invalid literal for int()

I know float fixes this, but I want to create a new dataframe that just lists the names (no dupes) and the mean for each name.

Answer 1

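Try this:

# Convert each timedelta to whole days (this drops the hours/minutes part)
data_frame['TimeDifference'] = data_frame['TimeDifference'].dt.days
# Mean per name, as a new data frame with one row per name
means = data_frame.groupby('Name')['TimeDifference'].mean().reset_index(name='mean')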
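
If the sub-day part of each value matters, one possible variation is to average in seconds rather than whole days and convert back afterwards. The following is only a sketch under assumptions: the helper name `mean_timedelta_by_name`, the output column `MeanTimeDifference`, and the sample values are made up for illustration and are not from the question.

```python
import pandas as pd


def mean_timedelta_by_name(df, name_col="Name", delta_col="TimeDifference"):
    """Return one row per name with the mean timedelta (hypothetical helper)."""
    # Work in float seconds so the group mean is plain numeric arithmetic.
    # NaT becomes NaN here, and the group mean skips NaN by default.
    seconds = df[delta_col].dt.total_seconds()
    mean_seconds = seconds.groupby(df[name_col]).mean()
    # Convert the per-name means back to timedelta values.
    result = pd.to_timedelta(mean_seconds, unit="s")
    return result.rename("MeanTimeDifference").reset_index()


# Illustrative data only: duplicate names, a negative value, and one missing value.
sample = pd.DataFrame({
    "Name": ["Brad", "Amy", "Brad", "Brad", "Bill", "Amy"],
    "TimeDifference": pd.to_timedelta([
        "-1 days 14:00:00", "0 days 00:08:00", "0 days 02:00:00",
        None, "0 days 01:00:00", "0 days 00:12:00",
    ]),
})

means = mean_timedelta_by_name(sample)
print(means)
```

The result is a new data frame with each name listed once, which is the shape the question asks for.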

Answer 2

There is a way to get the values without casting to int, ignoring NaN or NaT values, but it involves a lambda expression. The results are timedelta objects:

import numpy as np

time_groups = data_frame.groupby('Name').apply(
    lambda df: np.mean(df.TimeDifference)
)
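
A closely related sketch, assuming the same column names as the question, selects the TimeDifference column before calling apply so that the lambda receives a Series, and then uses reset_index to get a data frame with one row per name. The sample values and the output column name `MeanTimeDifference` are assumptions for illustration.

```python
import pandas as pd

# Illustrative data only: duplicate names and one missing value.
sample = pd.DataFrame({
    "Name": ["Brad", "Amy", "Brad", "Brad", "Bill", "Amy"],
    "TimeDifference": pd.to_timedelta([
        "-1 days 14:00:00", "0 days 00:08:00", "0 days 02:00:00",
        None, "0 days 01:00:00", "0 days 00:12:00",
    ]),
})

mean_by_name = (
    sample.groupby("Name")["TimeDifference"]
    .apply(lambda s: s.mean())  # Series.mean skips NaT by default
    .reset_index(name="MeanTimeDifference")
)
print(mean_by_name)
```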
