How to create datetime columns in a pyspark dataframe?
I have a pyspark dataframe that looks like the following:
df
year month day
2017 9 3
2015 5 16
I would like to create a column as datetime, like the following:
df
year month day date
2017 9 3 2017-09-03 00:00:00
2015 5 16 2015-05-16 00:00:00
You can use concat_ws to concatenate the columns and to_date to convert the result to a date:
from pyspark.sql.functions import concat_ws, to_date

df = spark.createDataFrame([[2017, 9, 3], [2015, 5, 16]], ['year', 'month', 'date'])
# Join year-month-date with '-' and parse the resulting string as a date.
df = df.withColumn('timestamp', to_date(concat_ws('-', df.year, df.month, df.date)))
df.show()
+----+-----+----+----------+
|year|month|date| timestamp|
+----+-----+----+----------+
|2017| 9| 3|2017-09-03|
|2015| 5| 16|2015-05-16|
+----+-----+----+----------+
Schema:
df.printSchema()
root
|-- year: long (nullable = true)
|-- month: long (nullable = true)
|-- date: long (nullable = true)
|-- timestamp: date (nullable = true)
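Under the hood, concat_ws('-', 2017, 9, 3) produces the string "2017-9-3", which to_date then parses (Spark's default parser accepts non-zero-padded months and days). The same logic can be sketched in plain Python with the standard library, which is useful for sanity-checking the expected values without a Spark session; the function name build_date below is illustrative, not part of any API:

```python
from datetime import datetime, date

def build_date(year, month, day):
    # Mimic concat_ws('-', ...) followed by to_date:
    # join the parts with '-' and parse the string as a date.
    s = "-".join(str(v) for v in (year, month, day))  # e.g. "2017-9-3"
    # strptime's %m and %d accept single-digit values, like Spark's parser.
    return datetime.strptime(s, "%Y-%m-%d").date()

print(build_date(2017, 9, 3))   # 2017-09-03
print(build_date(2015, 5, 16))  # 2015-05-16
```

Note this sketch produces a date (matching to_date); if you need a full timestamp with the 00:00:00 component shown in the question, use to_timestamp in pyspark instead of to_date.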