
Parse different formats of date in string format to date format pyspark when clause

I need to convert this format:

name,code,DATE_invoice
Ram,E01,09/29/2018
Mara,E02,07/14/2017
Test,E03,01/01/18

to this:

name,code,DATE_invoice
Ram,E01,2018-09-29
Mara,E02,2017-07-14
Test,E03,2018-01-01

If the column is already a date, this should do the job:

from pyspark.sql.functions import col, date_format
df = df.withColumn('DATE_invoice', date_format(col("DATE_invoice"), "yyyy-MM-dd"))

To parse different date formats, you can use to_date along with coalesce.

You can apply the same approach to multiple patterns within your dataset; an example can be found here.
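To make the fallback idea concrete, here is a minimal plain-Python sketch (standard library only, not Spark code) of what coalescing several parse attempts amounts to: try each pattern in order and keep the first one that parses. The name `parse_first_match` and the `FORMATS` tuple are made up for illustration; the `strptime` patterns are Python's counterparts of `MM/dd/yyyy` and `MM/dd/yy`.

```python
from datetime import date, datetime
from typing import Optional, Sequence

# Hypothetical pattern list: strptime counterparts of MM/dd/yyyy and MM/dd/yy.
FORMATS = ("%m/%d/%Y", "%m/%d/%y")


def parse_first_match(value: str, formats: Sequence[str] = FORMATS) -> Optional[date]:
    """Return the date from the first format that parses, or None if none do."""
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            # This pattern did not fit; fall through to the next one.
            continue
    return None


for raw in ("09/29/2018", "07/14/2017", "01/01/18", "not a date"):
    print(raw, "->", parse_first_match(raw))
```

Python's `%Y` requires four digits, so `01/01/18` falls through to the second pattern here. Spark's parsers can be more lenient depending on `spark.sql.legacy.timeParserPolicy`, so the order of patterns may matter more there.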

Data preparation

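from pyspark.sql import SparkSession

# `sql` is assumed to be a SparkSession; the original does not show how it was created.
sql = SparkSession.builder.getOrCreate()

# Sample rows as one comma-separated string, split into a flat list of fields.
input_str = """
Ram,E01,09/29/2018,
Mara,E02,07/14/2017,
Test,E03,01/01/18
""".split(",")

# Trim whitespace and newlines, and map the literal 'null' to None.
input_values = [x.strip() if x.strip() != 'null' else None for x in input_str]

cols = "name,code,DATE_invoice".split(",")

n = len(input_values)
n_col = 3

# Regroup the flat field list into one tuple of n_col fields per row.
input_list = [tuple(input_values[i:i+n_col]) for i in range(0, n, n_col)]

sparkDF = sql.createDataFrame(input_list, cols)

sparkDF.show()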

+----+----+------------+
|name|code|DATE_invoice|
+----+----+------------+
| Ram| E01|  09/29/2018|
|Mara| E02|  07/14/2017|
|Test| E03|    01/01/18|
+----+----+------------+

To Date and Coalesce

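from pyspark.sql import functions as F

# Switch to the legacy (pre-Spark 3.0) datetime parser.
sql.sql("set spark.sql.legacy.timeParserPolicy=LEGACY")

# Parse with each pattern into a helper column, then keep the first non-null result.
sparkDF.withColumn('p1', F.to_date(F.col('DATE_invoice'), "MM/dd/yyyy"))\
       .withColumn('p2', F.to_date(F.col('DATE_invoice'), "MM/dd/yy"))\
       .withColumn('DATE_invoice_parsed', F.coalesce(F.col('p1'), F.col('p2')))\
       .drop('p1', 'p2')\
       .show(truncate=False)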

+----+----+------------+-------------------+
|name|code|DATE_invoice|DATE_invoice_parsed|
+----+----+------------+-------------------+
|Ram |E01 |09/29/2018  |2018-09-29         |
|Mara|E02 |07/14/2017  |2017-07-14         |
|Test|E03 |01/01/18    |0018-01-01         |
+----+----+------------+-------------------+
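
Note that the last row in the output above came out as 0018-01-01 rather than the 2018-01-01 the question asks for; it looks as though the four-letter year pattern accepted the two-digit year literally, so the `MM/dd/yy` attempt was never used. One way to avoid depending on parser leniency is to pick the pattern from the shape of the string before parsing. Below is a minimal plain-Python sketch of that branching (standard library only, not Spark code); the regexes and the name `parse_by_shape` are assumptions for illustration.

```python
import re
from datetime import date, datetime
from typing import Optional

# Hypothetical shape checks for the two layouts seen in the sample data.
FOUR_DIGIT_YEAR = re.compile(r"\d{2}/\d{2}/\d{4}")
TWO_DIGIT_YEAR = re.compile(r"\d{2}/\d{2}/\d{2}")


def parse_by_shape(value: str) -> Optional[date]:
    """Choose the pattern from the string's shape, then parse; None if nothing fits."""
    if FOUR_DIGIT_YEAR.fullmatch(value):
        fmt = "%m/%d/%Y"
    elif TWO_DIGIT_YEAR.fullmatch(value):
        fmt = "%m/%d/%y"
    else:
        return None
    try:
        return datetime.strptime(value, fmt).date()
    except ValueError:
        # Right shape but not a real calendar date, e.g. month 13.
        return None


for raw in ("09/29/2018", "07/14/2017", "01/01/18"):
    print(raw, "->", parse_by_shape(raw))
```

In PySpark the same branching could presumably be written as a when clause, for example `F.when(F.col('DATE_invoice').rlike(r'^\d{2}/\d{2}/\d{4}$'), F.to_date(F.col('DATE_invoice'), 'MM/dd/yyyy')).otherwise(F.to_date(F.col('DATE_invoice'), 'MM/dd/yy'))`; that expression has not been run against the data above.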
