简体   繁体   中英

pyspark: change string to timestamp

I've a column in String format, some rows are also null. I add random timestamp to make it in the following form to convert it into timestamp.

date
null
22-04-2020
date
01-01-1990 23:59:59.000
22-04-2020 23:59:59.000

df = df.withColumn('date', F.concat (df.date, F.lit(" 23:59:59.000")))
df = df.withColumn('date', F.when(F.col('date').isNull(), '01-01-1990 23:59:59.000').otherwise(F.col('date')))

df.withColumn("date",      F.to_timestamp(F.col("date"),"MM-dd-yyyy HH mm ss SSS")).show(2)

but after this the column date becomes null.

can anyone help me solve this. either convert the string to timestamp direct

Your timestamp format should start with dd-MM , not MM-dd , and you're also missing some colons and dots in the time part. Try the code below:

df.withColumn("date", F.to_timestamp(F.col("date"),"dd-MM-yyyy HH:mm:ss.SSS")).show()
+-------------------+
|               date|
+-------------------+
|1990-01-01 23:59:59|
|2020-04-22 23:59:59|
+-------------------+

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM