
Unable to format timestamp in pyspark

I have CSV data like the below:

time_value,annual_salary
5/01/2019 1:02:16,120.56
06/01/2019 2:02:17,12800
7/01/2019 03:02:18,123.00
08/01/2019 4:02:19,123isdhad  

Now I want to convert the time_value column to a timestamp. So I created a view over these records and tried to convert it, but it throws an error:

spark.sql("select to_timestamp(time_value,'M/dd/yyyy H:mm:ss') as time_value from table")  

Error :

Text '5/1/2019 1:02:16' could not be parsed

Based on the error you are seeing, this is a date format issue:

Text '5/1/2019 1:02:16' could not be parsed

But your time format is specified as:

'M/dd/yyyy H:mm:ss'

You can see that the day portion is /1/ (a single digit), but your format uses dd, which expects two digits.

Please try the following format:

'M/d/yyyy H:mm:ss'
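
Putting it together, a minimal PySpark sketch of the corrected query could look like the following. The SparkSession setup and the file name salary.csv are placeholders for illustration; the view name table and the SQL itself come from the question:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("timestamp-parse").getOrCreate()

# Load the sample CSV shown above (file name is a placeholder)
df = spark.read.option("header", "true").csv("salary.csv")
df.createOrReplaceTempView("table")

# 'M' and 'd' accept one or two digits, so '5/1/2019 1:02:16' parses
fixed = spark.sql(
    "select to_timestamp(time_value, 'M/d/yyyy H:mm:ss') as time_value from table"
)
fixed.show(truncate=False)

The same pattern works with the DataFrame API as well, e.g. to_timestamp(col("time_value"), "M/d/yyyy H:mm:ss") from pyspark.sql.functions.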

I tried your SQL with no problem. It may be an issue with the Spark version; I used 2.4.8.

[Screenshot of the successful run omitted.]
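
If the difference really is the Spark version (Spark 3.x uses a stricter datetime parser than 2.4), one way to check is to switch back to the legacy parser. This is a sketch of an assumption on my part, not something the answer above states, and fixing the pattern as shown earlier remains the cleaner solution:

# Assumption: on Spark 3.x the stricter parser rejects '1' for 'dd'.
# Reverting to the Spark 2.x (legacy) parser is one way to confirm that
# the version is the cause.
spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY")

spark.sql(
    "select to_timestamp(time_value, 'M/dd/yyyy H:mm:ss') as time_value from table"
).show(truncate=False)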
