
How to extract hours from datetime in a pyspark dataframe?

I have a PySpark DataFrame like the following:

df.show(5)

+----------+
|   t_start|
+----------+
|1506125172|
|1506488793|
|1506242331|
|1506307472|
|1505613973|
+----------+

I would like to get the hour and the day of each Unix timestamp. This is what I am doing:

df = df.withColumn("datetime", F.from_unixtime("t_start", "dd/MM/yyyy HH:mm:ss"))
df = df.withColumn("hour", F.date_trunc('hour',F.to_timestamp("datetime","yyyy-MM-dd HH:mm:ss")))
df.show(5)

+----------+-------------------+----+
|   t_start|           datetime|hour|
+----------+-------------------+----+
|1506125172|23/09/2017 00:06:12|null|
|1506488793|27/09/2017 05:06:33|null|
|1506242331|24/09/2017 08:38:51|null|
|1506307472|25/09/2017 02:44:32|null|
|1505613973|17/09/2017 02:06:13|null|
+----------+-------------------+----+

But I get null in the hour column.

You can use the hour() function to extract the hour from a timestamp column. Also, change the format you pass to to_timestamp: your datetime column was written as dd/MM/yyyy HH:mm:ss, but you parse it as yyyy-MM-dd HH:mm:ss, so the parse fails and you get null.

from pyspark.sql import functions as F

# Parse the string with the same format it was written in, then take the hour
df.withColumn("hour", F.hour(F.to_timestamp("datetime", "dd/MM/yyyy HH:mm:ss"))).show()
+----------+-------------------+----+
|   t_start|           datetime|hour|
+----------+-------------------+----+
|1506125172|23/09/2017 00:06:12|   0|
|1506488793|27/09/2017 05:06:33|   5|
|1506242331|24/09/2017 08:38:51|   8|
|1506307472|25/09/2017 02:44:32|   2|
|1505613973|17/09/2017 02:06:13|   2|
+----------+-------------------+----+
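
One thing worth checking with output like this: from_unixtime renders the value in the Spark session time zone, so the hour depends on that setting. The datetime strings shown above match UTC for these epoch values. The following is only a plain-Python cross-check (standard library, no Spark), assuming UTC; the names t_starts and parts are made up for illustration.

```python
from datetime import datetime, timezone

# Epoch seconds copied from the question's t_start column
t_starts = [1506125172, 1506488793, 1506242331, 1506307472, 1505613973]

# Interpret each value as UTC (an assumption) and keep (day of month, hour)
parts = [
    (dt.day, dt.hour)
    for dt in (datetime.fromtimestamp(t, tz=timezone.utc) for t in t_starts)
]
print(parts)
```

If your session uses a different time zone, the hours Spark reports will be shifted accordingly.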

You can simply apply hour directly to the result of from_unixtime, without building an intermediate formatted string column.

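from pyspark.sql.functions import from_unixtime, hour

# from_unixtime uses the default yyyy-MM-dd HH:mm:ss format, which hour() can read
df.withColumn('hour', hour(from_unixtime('t_start'))).show()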

+----------+----+
|   t_start|hour|
+----------+----+
|1506125172|   0|
|1506488793|   5|
|1506242331|   8|
|1506307472|   2|
|1505613973|   2|
+----------+----+
