Convert string column to date in pyspark
I've a dataframe where the date/time column is of string datatype and looks something like "Tue Apr 21 01:16:19 2020". How do I convert this to a date column with the format 2020/04/21 in pyspark? I tried something like this:
Option 1:
df = df.withColumn("event_time2",from_unixtime(unix_timestamp(col("Event_time"), 'MM/dd/yyy')))
Option 2:
df= df.withColumn("event_time2",unix_timestamp(col("Event_time"),'yyyy-MM-dd HH:mm:ss').cast("timestamp"))
but both return null.
You could use to_date and date_format. EEE is the pattern letter for the day of the week. Refer to Java SimpleDateFormat for the complete list of pattern letters.
from pyspark.sql import functions as F
df.withColumn("Event_time2", F.to_date("Event_time", 'EEE MMM dd HH:mm:ss yyyy')).show(truncate=False)
#+------------------------+-----------+
#|Event_time |Event_time2|
#+------------------------+-----------+
#|Tue Apr 21 01:16:19 2020|2020-04-21 |
#+------------------------+-----------+
df.withColumn("Event_time2", F.date_format(F.to_date("Event_time", 'EEE MMM dd HH:mm:ss yyyy'),'yyyy/MM/dd')).show(truncate=False)
#+------------------------+-----------+
#|Event_time |Event_time2|
#+------------------------+-----------+
#|Tue Apr 21 01:16:19 2020|2020/04/21 |
#+------------------------+-----------+