Databricks - Error writing to Azure Synapse

Question

I am trying to write data to Azure Synapse table with identity field with following code

code on databricks

def get_jdbc_connection(host, sqlDatabase, user, password):
  jdbcHostname = "{}.database.windows.net".format(host)
  jdbc_url = "jdbc:sqlserver://{}:1433;database={};user={}@{};password={};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;".format(jdbcHostname, sqlDatabase, user, host, password)
  url = "jdbc:sqlserver://{}:1433;database={};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;".format(jdbcHostname, sqlDatabase)
  return (jdbc_url,url )

def write_adw(spark, df_target_adw, jdbc_url, table, tempDir, option_mode, pre_Actions ):
    df_target_adw.write.format("com.databricks.spark.sqldw") \
        .option("url", jdbc_url) \
        .option("useAzureMSI", "true") \
        .option("preActions", pre_Actions) \
        .option("dbTable", table) \
        .option("tempDir", tempDir) \
        .mode(option_mode) \
        .save()
    
dftraffic = spark.sql('SELECT distinct SourceName\
              ,1 AS IsActiveRow \
             ,"Pipe-123" as pipelineId \
              ,current_timestamp as ADFCreatedDateTime \
              ,current_timestamp as ADFModifiedDateTime \
              from deltaTable')

#write to ADW
(jdbc_url, url_adw) = get_jdbc_connection(host, sqlDatawarehouse,user, password)
target_table = 'TargetTable_name'
option_mode= "append"
pre_Actions= " SELECT GETDATE()"
write_adw(spark, dftraffic, jdbc_url, target_table, tempDir, option_mode, pre_Actions )

schema of target table on adw

Column Name	Data Type
SourceSID	INT IDENTITY (1,1) NOT NULL
Source Name	VARCHAR(20) NOT NULL
IsRowActive	BIT NOT NULL
PipelineId	VARCHAR(20) NOT NULL
ADFCreatedDateTime	DATETIME NOT NULL
ADFModifiedDateTime	DATETIME NOT NULL

Configuration details on databricks

Databricks runtime 7.4 (includes Apache Spark 3.0.1, Scala 2.12)

Error Message

Py4JJavaError: An error occurred while calling o457.save. : com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics failed to execute the JDBC query produced by the connector. Underlying SQLException(s): - com.microsoft.sqlserver.jdbc.SQLServerException: An explicit value for the identity column in table can only be specified when a column list is used and IDENTITY_INSERT is ON

the code was working all fine on databricks runtime 6.4 Spark 2.4.5 and I am facing this error as soon I tried to upgrade dbk runtime. How can I get this to work?

Answer 1

Are you not having the extra row "1 AS IsActiveRow ".I dont see that in the Schema

dftraffic = spark.sql('SELECT distinct SourceName\
              ,1 AS IsActiveRow \
             ,"Pipe-123" as pipelineId \
              ,current_timestamp as ADFCreatedDateTime \
              ,current_timestamp as ADFModifiedDateTime \
              from deltaTable)

Databricks - Error writing to Azure Synapse

Question

1 answers

solution1
0 2021-04-29 05:54:03

Databricks - Error writing to Azure Synapse

Question

1 answers

solution1 0 2021-04-29 05:54:03

solution1
0 2021-04-29 05:54:03