Alternative of Select DATE_FORMAT(date, format) in Apache Spark

I am using Apache Spark SQL with Java to read from a Parquet file. The file contains a date column (M/d/yyyy) and I want to change it to some other format (yyyy-dd-MM), something like the Select DATE_FORMAT(date, format) we can do in MySQL.
Is there any similar method in Apache Spark?

What you can do is parse the string with to_timestamp using its current format, then render it in the format you want with date_format:

import org.apache.spark.sql.functions.{date_format, to_timestamp}
import spark.implicits._  // for .toDF on a local Seq and the 'd column syntax

val df = Seq("1/1/2015", "02/10/2014", "4/30/2010", "03/7/2015").toDF("d")
df.select('d, date_format(to_timestamp('d, "M/d/yyyy"), "yyyy-dd-MM") as "new_d")
  .show
+----------+----------+
|         d|     new_d|
+----------+----------+
|  1/1/2015|2015-01-01|
|02/10/2014|2014-10-02|
| 4/30/2010|2010-30-04|
| 03/7/2015|2015-07-03|
+----------+----------+

Note that the single-letter pattern M/d/yyyy accepts both single- and double-digit days and months. On Spark 3 and later the date parser is strict, so a two-letter pattern such as MM/dd/yyyy would reject single-digit input like 1/1/2015; on Spark 2.x the legacy lenient parser accepted it either way.
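
The same built-in functions are also available through Spark SQL directly, which is the closest analogue to the MySQL Select DATE_FORMAT(date, format) the question mentions. A minimal sketch, assuming the DataFrame from above and a hypothetical temporary view name dates:

// Register the DataFrame so it can be queried with SQL.
df.createOrReplaceTempView("dates")

// Same parse-then-format conversion, expressed as a SQL SELECT.
spark.sql("SELECT d, date_format(to_timestamp(d, 'M/d/yyyy'), 'yyyy-dd-MM') AS new_d FROM dates").show

This returns the same table as the DataFrame version; date_format and to_timestamp are ordinary Spark SQL functions, so the SQL string works the same whether it is submitted from Java, Scala, or the SQL shell.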
