How to convert a datetime column from string format into datetime format in PySpark?
I created a DataFrame using sqlContext, but the datetime column is identified as a string.
df2 = sqlContext.createDataFrame(i[1])
df2.show()
df2.printSchema()
Result:
2016-07-05T17:42:55.238544+0900
2016-07-05T17:17:38.842567+0900
2016-06-16T19:54:09.546626+0900
2016-07-05T17:27:29.227750+0900
2016-07-05T18:44:12.319332+0900
string (nullable = true)
Since the column's type in the schema is string, I tried to cast it to a datetime type as follows:
df3 = df2.withColumn('_1', df2['_1'].cast(datetime()))
This raises an error: TypeError: Required argument 'year' (pos 1) not found
What should I do to solve this problem?
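The error comes from calling Python's datetime() constructor with no arguments; cast() expects a Spark SQL data type, not a Python class. Try this: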
from pyspark.sql.types import DateType
ndf = df2.withColumn('_1', df2['_1'].cast(DateType()))
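Note that DateType keeps only the calendar date, so the time of day is dropped. If the time component matters, one possible approach is to parse the strings in Python and return a TimestampType column through a UDF. The following is a minimal sketch under some assumptions: Python 3 (where strptime accepts %z offsets such as +0900), every value following the layout of the sample rows, and the helper names parse_ts and with_parsed_timestamp, which are hypothetical and not part of PySpark.

```python
from datetime import datetime

# Layout of the sample values, e.g. 2016-07-05T17:42:55.238544+0900
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_ts(value):
    """Parse one timestamp string; return None for missing or malformed input."""
    if value is None:
        return None
    try:
        return datetime.strptime(value, TS_FORMAT)
    except ValueError:
        return None


def with_parsed_timestamp(df, column="_1"):
    """Return df with `column` replaced by a TimestampType column (hypothetical helper)."""
    # Imported lazily so parse_ts can be tried without a Spark installation.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import TimestampType

    parse_ts_udf = udf(parse_ts, TimestampType())
    return df.withColumn(column, parse_ts_udf(df[column]))


# Usage, assuming df2 from the question:
# df3 = with_parsed_timestamp(df2)
# df3.printSchema()
```

A Python UDF is slower than Spark's built-in functions because each row is passed through the Python interpreter, so this is best treated as a fallback for cases where a direct cast or a built-in parsing function does not handle the input format.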