
How to extract hours from datetime in a pyspark dataframe?

I have a pyspark dataframe like the following:

df.show(5)

+----------+
|   t_start|
+----------+
|1506125172|
|1506488793|
|1506242331|
|1506307472|
|1505613973|
+----------+

I would like to get the hour and the day of each unix timestamp. This is what I am doing:

df = df.withColumn("datetime", F.from_unixtime("t_start", "dd/MM/yyyy HH:mm:ss"))
df = df.withColumn("hour", F.date_trunc('hour',F.to_timestamp("datetime","yyyy-MM-dd HH:mm:ss")))
df.show(5)

+----------+-------------------+----+
|   t_start|           datetime|hour|
+----------+-------------------+----+
|1506125172|23/09/2017 00:06:12|null|
|1506488793|27/09/2017 05:06:33|null|
|1506242331|24/09/2017 08:38:51|null|
|1506307472|25/09/2017 02:44:32|null|
|1505613973|17/09/2017 02:06:13|null|
+----------+-------------------+----+

But I get null in the hour column.

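The nulls come from a format mismatch: your datetime column is formatted as dd/MM/yyyy HH:mm:ss, but to_timestamp is told to parse yyyy-MM-dd HH:mm:ss, so every row fails to parse. Pass the matching format to to_timestamp, then use the hour() function to extract the hour from the resulting timestamp. (Note that date_trunc returns a truncated timestamp, not the hour number.)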

from pyspark.sql import functions as F

# Parse the string with the same pattern it was written in, then take the hour.
df.withColumn("hour", F.hour(F.to_timestamp("datetime", "dd/MM/yyyy HH:mm:ss"))).show()
+----------+-------------------+----+
|   t_start|           datetime|hour|
+----------+-------------------+----+
|1506125172|23/09/2017 00:06:12|   0|
|1506488793|27/09/2017 05:06:33|   5|
|1506242331|24/09/2017 08:38:51|   8|
|1506307472|25/09/2017 02:44:32|   2|
|1505613973|17/09/2017 02:06:13|   2|
+----------+-------------------+----+
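
The pattern mismatch behind the nulls can be reproduced outside Spark. The following is only a standard-library sketch, not PySpark code: `strptime` uses `%` directives rather than Spark's pattern letters, and it raises an error where Spark returns null. The helper name `parse_or_none` is made up for this illustration.

```python
from datetime import datetime


def parse_or_none(text, fmt):
    """Return a datetime, or None when text does not match fmt."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        # Spark's to_timestamp yields null in this situation instead of raising.
        return None


value = "23/09/2017 00:06:12"

# Rough equivalent of yyyy-MM-dd HH:mm:ss: does not match the string.
mismatched = parse_or_none(value, "%Y-%m-%d %H:%M:%S")

# Rough equivalent of dd/MM/yyyy HH:mm:ss: matches the string.
matched = parse_or_none(value, "%d/%m/%Y %H:%M:%S")

print(mismatched)
print(matched.hour, matched.day)
```

Only the second call parses, which mirrors why the hour column is filled once the format passed to to_timestamp agrees with the datetime column.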

You can also apply the hour function directly to from_unixtime, which avoids the intermediate string column and its format altogether.

from pyspark.sql import functions as F

# from_unixtime converts epoch seconds to a timestamp string in the default format.
df.withColumn("hour", F.hour(F.from_unixtime("t_start"))).show()

+----------+----+
|   t_start|hour|
+----------+----+
|1506125172|   0|
|1506488793|   5|
|1506242331|   8|
|1506307472|   2|
|1505613973|   2|
+----------+----+
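
The question also asks for the day. As a cross-check outside Spark, the hour and day of the sample timestamps can be computed with the standard library. This sketch assumes the Spark session time zone is UTC, which is consistent with the output shown above; with another session time zone the Spark results would shift accordingly. The function name `hour_and_day` is hypothetical.

```python
from datetime import datetime, timezone


def hour_and_day(ts):
    """Return (hour, day of month) for epoch seconds, interpreted in UTC."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return dt.hour, dt.day


# Sample values from the t_start column in the question.
t_start = [1506125172, 1506488793, 1506242331, 1506307472, 1505613973]

results = [hour_and_day(ts) for ts in t_start]
for ts, (hour, day) in zip(t_start, results):
    print(ts, hour, day)
```

The hours agree with the hour column above, and the days match the dd part of the datetime column in the question.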
