How to extract hours from datetime in a pyspark dataframe?
I have a pyspark dataframe like the following:
df.show(5)
+----------+
| t_start|
+----------+
|1506125172|
|1506488793|
|1506242331|
|1506307472|
|1505613973|
+----------+
I would like to get the hour and the day of each unix timestamp. This is what I am doing:
df = df.withColumn("datetime", F.from_unixtime("t_start", "dd/MM/yyyy HH:mm:ss"))
df = df.withColumn("hour", F.date_trunc('hour',F.to_timestamp("datetime","yyyy-MM-dd HH:mm:ss")))
df.show(5)
+----------+-------------------+----+
| t_start| datetime|hour|
+----------+-------------------+----+
|1506125172|23/09/2017 00:06:12|null|
|1506488793|27/09/2017 05:06:33|null|
|1506242331|24/09/2017 08:38:51|null|
|1506307472|25/09/2017 02:44:32|null|
|1505613973|17/09/2017 02:06:13|null|
+----------+-------------------+----+
And I got null in the column hour.
You can use the hour() function to extract the hour from a timestamp column. Also, change the format you pass to to_timestamp: your datetime strings are in dd/MM/yyyy HH:mm:ss, not yyyy-MM-dd HH:mm:ss, so the pattern has to be day-first as well.
from pyspark.sql import functions as F
df.withColumn("hour", F.hour(F.to_timestamp("datetime", "dd/MM/yyyy HH:mm:ss"))).show()
+----------+-------------------+----+
| t_start| datetime|hour|
+----------+-------------------+----+
|1506125172|23/09/2017 00:06:12| 0|
|1506488793|27/09/2017 05:06:33| 5|
|1506242331|24/09/2017 08:38:51| 8|
|1506307472|25/09/2017 02:44:32| 2|
|1505613973|17/09/2017 02:06:13| 2|
+----------+-------------------+----+
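To see why the pattern matters, here is a plain-Python analogy using the standard library's strptime rather than Spark. It is only an illustration: Python raises ValueError on a mismatched pattern, whereas Spark returned null in the question's output. The variable names are made up for this sketch.

```python
from datetime import datetime

# First value of the `datetime` column above (day-first, slash-separated).
raw = "23/09/2017 00:06:12"

# Year-first pattern, equivalent to the question's "yyyy-MM-dd HH:mm:ss":
# it does not match the string, so parsing fails.
try:
    datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    mismatch_parsed = True
except ValueError:
    mismatch_parsed = False

# Day-first pattern, equivalent to Spark's "dd/MM/yyyy HH:mm:ss":
# it matches how the string was written, so the hour can be read off.
parsed = datetime.strptime(raw, "%d/%m/%Y %H:%M:%S")

print(mismatch_parsed)  # False
print(parsed.hour)      # 0
```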
You can simply use the hour function together with from_unixtime.
from pyspark.sql.functions import from_unixtime, hour
df.withColumn('hour', hour(from_unixtime('t_start'))).show()
+----------+----+
| t_start|hour|
+----------+----+
|1506125172| 0|
|1506488793| 5|
|1506242331| 8|
|1506307472| 2|
|1505613973| 2|
+----------+----+