Alternative to SELECT DATE_FORMAT(date, format) in Apache Spark

I am using Apache Spark SQL with Java to read from a Parquet file. The file contains a date column in M/d/yyyy format, and I want to convert it to another format (yyyy-dd-MM), similar to SELECT DATE_FORMAT(date, format) in MySQL.
Is there a similar method in Apache Spark?

What you can do is parse the string with to_timestamp using the pattern it is currently in, then format the result with the pattern you want using date_format:

// In spark-shell these imports are available implicitly; in an application,
// import spark.implicits._ from your SparkSession as well.
import org.apache.spark.sql.functions.{date_format, to_timestamp}

val df = Seq("1/1/2015", "02/10/2014", "4/30/2010", "03/7/2015").toDF("d")
df.select('d, date_format(to_timestamp('d, "MM/dd/yyyy"), "yyyy-dd-MM") as "new_d")
  .show()
+----------+----------+
|         d|     new_d|
+----------+----------+
|  1/1/2015|2015-01-01|
|02/10/2014|2014-10-02|
| 4/30/2010|2010-30-04|
| 03/7/2015|2015-07-03|
+----------+----------+
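Since the question asks for something close to MySQL's SELECT DATE_FORMAT syntax, note that the same functions are also exposed directly in Spark SQL. A minimal sketch, assuming an active SparkSession named `spark` and the DataFrame `df` from above; the view name `dates` is illustrative:

```scala
// Register the DataFrame so it can be queried with SQL.
df.createOrReplaceTempView("dates")

// date_format and to_timestamp are built-in Spark SQL functions,
// so the MySQL-style query carries over almost unchanged.
spark.sql(
  "SELECT d, date_format(to_timestamp(d, 'MM/dd/yyyy'), 'yyyy-dd-MM') AS new_d FROM dates"
).show()
```

This is convenient if the rest of your pipeline is already written as SQL strings rather than DataFrame operations.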

Note that the parsing is fairly lenient and accepts single-digit days and months. (This is the behaviour of the SimpleDateFormat-based parser in Spark 2.x; since Spark 3.0 the patterns follow java.time semantics, where "MM/dd/yyyy" strictly requires two digits. There, use the pattern "M/d/yyyy" instead, or set spark.sql.legacy.timeParserPolicy=LEGACY to restore the old behaviour.)
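Since the question mentions Java, the single-digit behaviour of the "M/d/yyyy" pattern can be checked with plain java.time, which uses the same pattern semantics as Spark 3.x. A small self-contained sketch (the class and method names are illustrative, not part of any Spark API):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DatePatternDemo {
    // Single-letter "M" and "d" accept both one- and two-digit values,
    // mirroring Spark 3.x's java.time-based datetime patterns.
    static final DateTimeFormatter IN = DateTimeFormatter.ofPattern("M/d/yyyy");
    static final DateTimeFormatter OUT = DateTimeFormatter.ofPattern("yyyy-dd-MM");

    // Parse with the input pattern, re-render with the output pattern.
    static String reformatDate(String s) {
        return LocalDate.parse(s, IN).format(OUT);
    }

    public static void main(String[] args) {
        for (String s : new String[]{"1/1/2015", "02/10/2014", "4/30/2010", "03/7/2015"}) {
            System.out.println(s + " -> " + reformatDate(s));
        }
    }
}
```

Running this prints the same conversions as the DataFrame example above, e.g. 1/1/2015 -> 2015-01-01.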
