Replace a list of timestamps in a pandas dataframe with a list of the hour of every timestamp in every row
I have a column in a pandas dataframe that consists of lists containing timestamps. I want to replace this list of timestamps with a list of the hour of each timestamp for every row. Below is an example:
import pandas as pd
df = pd.DataFrame({'id': [1, 2], 'time': [['2017-09-05 03:34:51', '2016-03-07 05:24:55'], ['2016-02-06 03:14:21', '2014-08-09 09:12:44', '2011-05-02 07:43:21']]})
I would like a new column named 'hour' where
df['hour'] = [ [3,5], [3,9,7] ]
I tried different approaches using map() and apply(), but nothing produced the desired outcome. Any help is very much appreciated.
Use apply + to_datetime.
s = df.time.apply(lambda x: pd.to_datetime(x, errors='coerce').hour.tolist() )
s
0 [3, 5]
1 [3, 9, 7]
Name: time, dtype: object
df['hour'] = s
df
id time hour
0 1 [2017-09-05 03:34:51, 2016-03-07 05:24:55] [3, 5]
1 2 [2016-02-06 03:14:21, 2014-08-09 09:12:44, 201... [3, 9, 7]
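For comparison, the same result can be produced with a plain nested list comprehension. This is only a sketch, assuming the lists hold timestamp strings (or Timestamp objects) that pd.Timestamp can parse; unlike errors='coerce' above, it raises on unparseable values instead of silently coercing them.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2],
    'time': [
        ['2017-09-05 03:34:51', '2016-03-07 05:24:55'],
        ['2016-02-06 03:14:21', '2014-08-09 09:12:44', '2011-05-02 07:43:21'],
    ],
})

# For each row, convert every entry to a Timestamp and keep only its hour
df['hour'] = [[pd.Timestamp(t).hour for t in ts] for ts in df['time']]
```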
Statutory warning: this is inefficient in general, because you have a column of lists.
If you want to know how I'd store this data, it'd be something like:
df
id time
0 1 2017-09-05 03:34:51
1 1 2016-03-07 05:24:55
2 2 2016-02-06 03:14:21
3 2 2014-08-09 09:12:44
4 2 2011-05-02 07:43:21
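One way to get from the column of lists to this long layout is DataFrame.explode (available from pandas 0.25). A minimal sketch, assuming the original lists hold timestamp strings; the name long_df is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2],
    'time': [
        ['2017-09-05 03:34:51', '2016-03-07 05:24:55'],
        ['2016-02-06 03:14:21', '2014-08-09 09:12:44', '2011-05-02 07:43:21'],
    ],
})

# One row per list element; the id is repeated for each timestamp
long_df = df.explode('time').reset_index(drop=True)

# explode leaves an object column, so convert it to datetime64 explicitly
long_df['time'] = pd.to_datetime(long_df['time'])
```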
Now, getting the hour is as easy as:
h = pd.to_datetime(df.time).dt.hour
h
0 3
1 5
2 3
3 9
4 7
Name: time, dtype: int64
df['hour'] = h
If you want to perform group-wise computation, you can always use df.groupby.
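As an illustration of such a group-wise step, the per-id lists of hours from the question can be rebuilt from the long layout. This is a sketch under the assumption that the data is stored as shown above; long_df and hours_per_id are illustrative names.

```python
import pandas as pd

long_df = pd.DataFrame({
    'id': [1, 1, 2, 2, 2],
    'time': pd.to_datetime([
        '2017-09-05 03:34:51', '2016-03-07 05:24:55',
        '2016-02-06 03:14:21', '2014-08-09 09:12:44',
        '2011-05-02 07:43:21',
    ]),
})
long_df['hour'] = long_df['time'].dt.hour

# Collect the hours of each id back into a list (a Series indexed by id)
hours_per_id = long_df.groupby('id')['hour'].agg(list)
```

Other aggregations (for example the mean or maximum hour per id) follow the same pattern by swapping out the function passed to agg.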