
Issue with Spark dataframe loading timestamp data to Hive table

I am trying to load a dataframe into a Hive table, but an extra 30 minutes is being added to the timestamp values. I have tried the following:

from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext()
hive_context = HiveContext(sc)

# df_load is built earlier in the job and contains the timestamp column "currenthour"
df_load.write.mode("append").saveAsTable("default.DATA_LOAD")

df_load has a column "currenthour" with the value "2020-09-01 09:00:00", but in the table it is loaded as "2020-09-01 09:30:00".

How can I resolve this issue?

It's a common issue with the timestamp datatype, caused by timezone handling when Spark writes to Hive (a 30-minute shift suggests a half-hour-offset timezone such as UTC+5:30 is involved). Refer to this:

Spark SQL to Hive table - Datetime Field Hours Bug
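As a minimal sketch of one way to avoid the shift (not taken from the linked answer), you can pin the Spark SQL session timezone before writing, so the timestamp strings are interpreted and stored consistently. The application name and the small stand-in dataframe below are placeholders; the table and column names come from the question.

from pyspark.sql import SparkSession
from pyspark.sql.functions import to_timestamp

spark = SparkSession.builder.appName("timestamp-load-demo").enableHiveSupport().getOrCreate()

# Pin the session timezone so "2020-09-01 09:00:00" is written out unchanged.
# "UTC" is only an example; use the timezone the data was produced in.
spark.conf.set("spark.sql.session.timeZone", "UTC")

# Small stand-in for df_load from the question.
df_load = spark.createDataFrame([("2020-09-01 09:00:00",)], ["currenthour"]) \
    .withColumn("currenthour", to_timestamp("currenthour"))

df_load.write.mode("append").saveAsTable("default.DATA_LOAD")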
