
Push Spark current_timestamp value to a Postgres timestamp with time zone column

I need to push the 'start_time' value to a Postgres table whose column has the 'timestamp with time zone' data type. I need a solution that uses a Java JDBC connection only, please.

import spark.implicits._ 
val df1 =Seq(("Process Name","Process Description"))
                         .toDF("process_nm","process_desc") 
val df2 = df1.withColumn("start_time",current_timestamp)  
df2.show(false)             
df2.printSchema

df2.collect().foreach(row=>
 {    
     println("Before calling-"+row.getString(0)+"   "+row.getString(1)+"   "+row.getTimestamp(2))
    
     var process_name:String=row.getString(0)
     var process_description:String=row.getString(1) 
     var start_time=row.getTimestamp(2)
     var insertSql="""insert into test_log(process_nm,start_time,process_desc) 
                       values('$process_name','$start_time','$process_description')""" 

     import com.typesafe.config.ConfigFactory
     import org.apache.spark.sql.SparkSession
     import org.apache.spark.sql.functions.{concat, lit}

     import java.io.File
     import java.sql.{Connection, DriverManager}
 
     var db_conn_string = "jdbc:" + db_type + "://" + db_host + ":" + db_port + "/" + db_database
     val direct_conn = DriverManager.getConnection(db_conn_string, db_user, db_pass)
     val statement = direct_conn.createStatement()
     val result=statement.executeUpdate(insertSql)
     println("inserted-"+result)

     println("insert_sql-"+insertSql)  
})  

I get the error below while pushing start_time to the Postgres table:

Before calling-Process Name   Process Description   2021-12-06 05:51:12.278559
org.postgresql.util.PSQLException: ERROR: invalid input syntax for type timestamp with time zone: "$start_time"
  Position: 93
  at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2552)
  at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2284)
  at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:322)
  at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:481)
  at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:401)
  at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:322)
  at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:308)
  at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:284)
  at org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:258)
  at $anonfun$res19$1(<pastie>:57)
  at $anonfun$res19$1$adapted(<pastie>:32)
  at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
  at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)

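Use the s string interpolator:

 var insertSql=s"""insert into test_log(process_nm,start_time,process_desc) 
                       values('$process_name','$start_time','$process_description')"""

Without the s prefix, Scala treats $start_time as literal text in any string literal, which is why Postgres received "$start_time" as the value (see the error message). The triple quotes are not the problem: they allow variable substitution once the s prefix is added, and unlike a regular "..." string they may span several lines.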
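For what it is worth, whether $start_time gets substituted depends on the s prefix rather than on the quote style. The following is only a minimal sketch to illustrate that; the object name InterpolationDemo and the hard-coded timestamp (copied from the log output in the question) are assumptions made for the example.

```scala
import java.sql.Timestamp

object InterpolationDemo {
  // Sample value taken from the "Before calling-" line in the question.
  val start_time: Timestamp = Timestamp.valueOf("2021-12-06 05:51:12.278559")

  // No s prefix: the text $start_time stays in the string as-is.
  val plain: String = """values('$start_time')"""

  // With the s prefix: $start_time is replaced by start_time.toString.
  val interpolated: String = s"""values('$start_time')"""

  def main(args: Array[String]): Unit = {
    println(plain)
    println(interpolated)
  }
}
```

The first line printed still contains the literal $start_time, which matches the value Postgres complained about; the second contains the formatted timestamp.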

As an aside, where possible I would also suggest using 'val' instead of 'var': none of these variables is reassigned, and immutable values make the code easier to reason about.
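Since the question asks for plain JDBC, an alternative worth considering is a PreparedStatement, which passes the timestamp as a bound parameter instead of pasting it into the SQL text. This is a sketch under assumptions, not a tested solution for your environment: the names InsertLogRow, InsertSql and insertLog are hypothetical, and it assumes the PostgreSQL JDBC driver is on the classpath and accepts a java.sql.Timestamp for the 'timestamp with time zone' column (interpreted in the JVM's default time zone).

```scala
import java.sql.{Connection, Timestamp}

object InsertLogRow {
  // Table and column names are the ones from the question; values are bound as parameters.
  val InsertSql: String =
    "insert into test_log(process_nm, start_time, process_desc) values (?, ?, ?)"

  // Hypothetical helper: inserts one row and returns the number of affected rows.
  def insertLog(conn: Connection,
                processName: String,
                startTime: Timestamp,
                processDesc: String): Int = {
    val ps = conn.prepareStatement(InsertSql)
    try {
      ps.setString(1, processName)
      ps.setTimestamp(2, startTime) // no quoting or string formatting needed
      ps.setString(3, processDesc)
      ps.executeUpdate()
    } finally {
      ps.close() // release the statement even if the insert fails
    }
  }
}
```

Inside the foreach from the question this could be called as InsertLogRow.insertLog(direct_conn, row.getString(0), row.getTimestamp(2), row.getString(1)). Binding parameters also avoids problems with quotes inside the values, and opening the connection once outside the loop rather than once per row would probably be cheaper.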

