
Azure Databricks Delta Table modifies the TIMESTAMP format while writing from Spark DataFrame

I am new to Azure Databricks. I am trying to write a DataFrame to a Delta table that contains a TIMESTAMP column, but strangely the TIMESTAMP pattern changes after writing to the Delta table. My DataFrame output column holds the value in the format 2022-05-13 17:52:09.771, but after writing it to the table, the column value is populated as

2022-05-13T17:52:09.771+0000

I am using the code below to generate this DataFrame output:

import java.text.SimpleDateFormat
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{lit, to_timestamp}

val pretsUTCText = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")
val tsUTCText: String = pretsUTCText.format(ts)
val tsUTCCol: Column = lit(tsUTCText)
// "event_ts" is a placeholder column name; withColumn requires a name argument
val df = df2.withColumn("event_ts", to_timestamp(tsUTCCol, "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"))

The DataFrame output shows 2022-05-13 17:52:09.771 as the TIMESTAMP pattern, but after writing it to the Delta table I see the same value populated as 2022-05-13T17:52:09.771+0000.
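
For context, the write itself is a plain Delta write; a minimal sketch is below, where the table name my_delta_table is an assumption and not taken from the original question:

// Sketch of the write step; "my_delta_table" is an assumed table name
df.write
  .format("delta")
  .mode("append")
  .saveAsTable("my_delta_table")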

I could not find any solution. Thanks in advance.

I have just found the same behaviour on Databricks as you, and it behaves differently from the Databricks documentation. It seems that after some version Databricks shows the timezone by default, which is why you see the additional +0000. If you don't want it, I think you can use the date_format function when you populate the data. Also, I think you don't need the 'Z' in the format text, as it is for the timezone. See the screenshot below.

[screenshot]
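
For illustration, a minimal sketch of that date_format suggestion, assuming the data sits in a Delta table named my_delta_table with a timestamp column event_ts (both names are assumptions):

import org.apache.spark.sql.functions.{col, date_format}

// Delta stores a TimestampType instant; date_format only changes how it is rendered as text.
// "my_delta_table" and "event_ts" are assumed names for illustration.
val rendered = spark.table("my_delta_table")
  .withColumn("event_ts_text", date_format(col("event_ts"), "yyyy-MM-dd HH:mm:ss.SSS"))
rendered.show(false)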
