Replace a list of timestamps in a pandas DataFrame with a list of the hour of each timestamp in every row

Question

I have a column in a pandas DataFrame that consists of lists of timestamps. I want to replace each list of timestamps with a list of the hour of each timestamp, for every row. Below is an example:

import pandas as pd

# Timestamps are quoted as strings so that the snippet runs
df = pd.DataFrame({
    'id': [1, 2],
    'time': [['2017-09-05 03:34:51', '2016-03-07 05:24:55'],
             ['2016-02-06 03:14:21', '2014-08-09 09:12:44', '2011-05-02 07:43:21']]
})

I would like a new column named 'hour', where:

df['hour'] = [ [3,5], [3,9,7] ]

I tried various approaches using map() and apply(), but none produced the desired outcome. Any help is very much appreciated.

Answer 1

Use apply + to_datetime.

# Parse each row's list and extract the hours; errors='coerce' turns unparseable values into NaT
s = df.time.apply(lambda x: pd.to_datetime(x, errors='coerce').hour.tolist())
s

0       [3, 5]
1    [3, 9, 7]
Name: time, dtype: object

df['hour'] = s
df

   id                                               time       hour
0   1         [2017-09-05 03:34:51, 2016-03-07 05:24:55]     [3, 5]
1   2  [2016-02-06 03:14:21, 2014-08-09 09:12:44, 201...  [3, 9, 7]
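As a small illustration of the errors='coerce' argument used above, here is a sketch with a deliberately invalid entry. The string 'not a date' is made up for this example and is not part of the question's data. The invalid entry should become NaT rather than raising, and its hour then shows up as NaN.

```python
import math

import pandas as pd

# One valid timestamp plus a made-up invalid string (hypothetical input)
values = ['2017-09-05 03:34:51', 'not a date']

# errors='coerce' replaces anything that cannot be parsed with NaT
parsed = pd.to_datetime(values, errors='coerce')

# .hour on the resulting DatetimeIndex gives NaN where the timestamp is NaT
hours = parsed.hour.tolist()
print(hours)
```

If the lists are known to be clean, the errors argument can be dropped so that bad input raises an error instead of silently turning into NaN.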

Statutory warning: this is inefficient in general, because you have a column of lists, so pandas has to handle each row's list with Python-level calls rather than one vectorized operation.

If you want to know how I'd store this data, it'd be something like this, with one timestamp per row:

df

   id                 time
0   1  2017-09-05 03:34:51
1   1  2016-03-07 05:24:55
2   2  2016-02-06 03:14:21
3   2  2014-08-09 09:12:44
4   2  2011-05-02 07:43:21
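One possible way to get from the column of lists to this one-timestamp-per-row layout is DataFrame.explode, assuming pandas 0.25 or later. The answer does not say how the long table was built, so treat this as a sketch; the name long_df is only illustrative.

```python
import pandas as pd

# The question's data, with timestamps as strings
df = pd.DataFrame({
    'id': [1, 2],
    'time': [['2017-09-05 03:34:51', '2016-03-07 05:24:55'],
             ['2016-02-06 03:14:21', '2014-08-09 09:12:44', '2011-05-02 07:43:21']]
})

# explode gives every list element its own row and repeats the id;
# reset_index(drop=True) replaces the repeated index with 0..n-1
long_df = df.explode('time').reset_index(drop=True)

# A single vectorized conversion for the whole column
long_df['time'] = pd.to_datetime(long_df['time'])
print(long_df)
```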

Now, getting the hour is as easy as:

h = pd.to_datetime(df.time).dt.hour
h

0    3
1    5
2    3
3    9
4    7
Name: time, dtype: int64

df['hour'] = h

If you want to perform group-wise computation, you can always use df.groupby.
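A minimal sketch of such group-wise computation on the long layout follows. The names long_df, hours_per_id and earliest_hour are illustrative only. Aggregating with list rebuilds the per-id lists from the question, if that shape is really needed at the end.

```python
import pandas as pd

# Long layout: one timestamp per row
long_df = pd.DataFrame({
    'id': [1, 1, 2, 2, 2],
    'time': pd.to_datetime([
        '2017-09-05 03:34:51', '2016-03-07 05:24:55',
        '2016-02-06 03:14:21', '2014-08-09 09:12:44', '2011-05-02 07:43:21',
    ]),
})
long_df['hour'] = long_df['time'].dt.hour

# Collect the hours of each id back into a list
hours_per_id = long_df.groupby('id')['hour'].agg(list)

# Any other aggregation works the same way, e.g. the earliest hour per id
earliest_hour = long_df.groupby('id')['hour'].min()

print(hours_per_id)
print(earliest_hour)
```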

Answer 1, accepted, 2017-10-21 23:53:12