Python and Pandas: subtracting and formatting DataFrame columns

Question

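I want to subtract two datetime columns of a DataFrame in Python 2.x and format the result as hh:mm:ss. My problem is that I assumed the delta column was a string, but it is a number. I need help because I am struggling to make this work. I have searched and tried some solutions found in other posts, but I could not solve it.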

actual= ...select now()

This is the df:

        begin                         actual
0  2018-01-31 16:45:04.263      2018-01-31 16:48:06
1  2018-01-31 16:10:26.000      2018-01-31 16:50:06

Now:

df['actual'] = pd.to_datetime(df['actual'])
df['delta'] = df['actual'] - df['begin'] 
df['delta'] = df['delta'].apply(lambda x: str(x)[-8:])

The result looks like this: 39:49 and 2.737000. I want the second value in the same format as the first. I tried changing the function like this:

df['delta'] = df['delta'].apply(lambda x: pd.Timedelta(seconds=int(x.total_seconds())))

But it returns:

AttributeError: 'Timestamp' object has no attribute 'total_seconds'

Any ideas would be appreciated.

Answer 1

I think you need:

print (df.dtypes)
begin     datetime64[ns]
actual    datetime64[ns]
dtype: object


df['delta'] = (df['actual'] - df['begin']).dt.total_seconds()
print (df)
                    begin              actual     delta
0 2018-01-31 16:45:04.263 2018-01-31 16:48:06   181.737
1 2018-01-31 16:10:26.000 2018-01-31 16:50:06  2380.000
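
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes both columns start out as strings and are parsed with pd.to_datetime; the sample values are copied from the question, and the names delta and seconds are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; both columns start as plain strings.
df = pd.DataFrame({
    "begin": ["2018-01-31 16:45:04.263", "2018-01-31 16:10:26.000"],
    "actual": ["2018-01-31 16:48:06", "2018-01-31 16:50:06"],
})

# Both columns must be datetime64[ns] before they can be subtracted.
df["begin"] = pd.to_datetime(df["begin"])
df["actual"] = pd.to_datetime(df["actual"])

# Subtracting two datetime columns gives a timedelta64[ns] Series.
delta = df["actual"] - df["begin"]

# .dt.total_seconds() converts each timedelta to a float number of seconds.
seconds = delta.dt.total_seconds()
```

If either column is still of object dtype, the subtraction will not produce timedeltas, so checking df.dtypes first is worthwhile.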

If you want the formatted string, it is possible, but a bit crazy (this is not a general solution, because the days are dropped):

df['delta'] = (df['actual'] - df['begin']).astype(str).str[7:15]
print (df)
                    begin              actual     delta
0 2018-01-31 16:45:04.263 2018-01-31 16:48:06  00:03:01
1 2018-01-31 16:10:26.000 2018-01-31 16:50:06  00:39:40

df['delta'] = (df['actual'] - df['begin']).astype(str)
print (df)
                    begin              actual                      delta
0 2018-01-31 16:45:04.263 2018-01-31 16:48:06  0 days 00:03:01.737000000
1 2018-01-31 16:10:26.000 2018-01-31 16:50:06  0 days 00:39:40.000000000
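
If the days should not be lost, one possible alternative to slicing the string is to build hh:mm:ss from the total seconds. This is only a sketch: format_hhmmss is a hypothetical helper (not part of pandas), it assumes non-negative timedeltas, it truncates fractional seconds, and it folds whole days into the hour count.

```python
import pandas as pd

def format_hhmmss(delta):
    """Format a non-negative pandas Timedelta as hh:mm:ss (hypothetical helper).

    Days are folded into the hours field, so 1 day 2 hours becomes 26 hours.
    """
    total = int(delta.total_seconds())  # truncates the fractional seconds
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "{:02d}:{:02d}:{:02d}".format(hours, minutes, seconds)

# Illustrative values: the two deltas from the question plus one longer than a day.
deltas = pd.Series(pd.to_timedelta(
    ["0 days 00:03:01.737", "0 days 00:39:40", "1 days 02:00:05"]
))

# Series.apply passes each element to the helper as a pandas Timedelta.
formatted = deltas.apply(format_hhmmss)
```

On the frame above this could be applied as df['delta'] = (df['actual'] - df['begin']).apply(format_hhmmss).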
