PySpark - Convert String to Timestamp - Getting Nulls

I have the following string column in a DataFrame df:

+----------------+
|            date|
+----------------+
| 4/23/2019 23:59|
|05/06/2019 23:59|
| 4/16/2019 19:00|
+----------------+

I am trying to convert this column to a timestamp, but I am only getting NULL values.

My statement is:

from pyspark.sql.functions import col, unix_timestamp
df.withColumn('date',unix_timestamp(df['date'], "MM/dd/yyyy hh:mm").cast("timestamp"))

Why am I getting only NULL values? Is it because of the month format (since I have an additional 0 in 05)?

Thanks!

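The pattern for the 24-hour format is HH; hh is the 12-hour (am/pm) pattern, so an hour such as 23 or 19 cannot be parsed with it and the result is NULL. See https://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html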

import pyspark.sql.functions as psf

df \
    .withColumn('converted_date', psf.to_timestamp('date', format='MM/dd/yyyy HH:mm')) \
    .show()
        +----------------+-------------------+
        |            date|     converted_date|
        +----------------+-------------------+
        | 4/23/2019 23:59|2019-04-23 23:59:00|
        |05/06/2019 23:59|2019-05-06 23:59:00|
        | 4/16/2019 19:00|2019-04-16 19:00:00|
        +----------------+-------------------+

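Whether or not there is a leading 0 does not matter here. Note that this relies on the lenient SimpleDateFormat-style parsing linked above; Spark 3.0 and later use a stricter parser by default, where a pattern such as M/d/yyyy HH:mm may be needed to accept single-digit months.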
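The 12-hour versus 24-hour distinction can also be checked without a Spark session. The following is only a rough analogy using Python's standard-library strptime, not Spark's parser: it assumes that %I plays the role of hh and %H the role of HH, and the helper try_parse is a hypothetical name introduced for illustration.

```python
from datetime import datetime

# The three sample values from the question.
SAMPLES = ["4/23/2019 23:59", "05/06/2019 23:59", "4/16/2019 19:00"]


def try_parse(text, fmt):
    """Return the parsed datetime, or None if text does not match fmt."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


# %I is the 12-hour field (analogous to hh in the Java-style pattern).
twelve_hour = [try_parse(s, "%m/%d/%Y %I:%M") for s in SAMPLES]

# %H is the 24-hour field (analogous to HH in the Java-style pattern).
twenty_four_hour = [try_parse(s, "%m/%d/%Y %H:%M") for s in SAMPLES]

print(twelve_hour)
print(twenty_four_hour)
```

With the 12-hour field every sample fails to parse, because 23 and 19 are not valid 12-hour values, while the 24-hour field parses all three rows, including the one with the single-digit month.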
