Delete rows from an Azure SQL table using Azure Databricks with Scala

I am using Azure Databricks with Scala and my goal is to delete some rows from the Azure SQL table.

To achieve this, I am using a pushdown query with JDBC as follows:

val pushdown_query = s"(DELETE FROM ${table_name} WHERE dump_date = '2020-01-07') temp"
val res = spark.read.jdbc(jdbcUrl, pushdown_query, connectionProperties)  

However, I am getting the following error:

com.microsoft.sqlserver.jdbc.SQLServerException: A nested INSERT, UPDATE, DELETE, or MERGE statement must have an OUTPUT clause.

To fix this, I added an OUTPUT clause to the pushdown query:

val pushdown_query = s"(DELETE FROM ${table_name} OUTPUT DELETED.dump_date WHERE dump_date = '2020-01-07') temp"

But now I am getting the following error:

com.microsoft.sqlserver.jdbc.SQLServerException: A nested INSERT, UPDATE, DELETE, or MERGE statement is not allowed in a SELECT statement that is not the immediate source of rows for an INSERT statement.

What am I doing wrong? How can I achieve this? Is there a better way?

Thanks in advance.
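
For context on the two errors: spark.read.jdbc treats its second argument as a table expression and places it in the FROM clause of a SELECT statement that Spark generates, so the DELETE always ends up nested inside a SELECT, which SQL Server rejects. The sketch below only builds strings to illustrate the shape of that query. wrappedQuery is a hypothetical helper, not a Spark API, and tableName is a placeholder; the exact SQL Spark emits may differ between versions.

```scala
// Hypothetical helper (not part of Spark): mimics how a JDBC read embeds
// the `table` argument in the FROM clause of a generated SELECT.
def wrappedQuery(table: String): String = s"SELECT * FROM $table WHERE 1=0"

val tableName = "my_table" // placeholder table name
val pushdownQuery =
  s"(DELETE FROM $tableName OUTPUT DELETED.dump_date WHERE dump_date = '2020-01-07') temp"

// The DELETE lands inside a SELECT, which SQL Server does not allow.
println(wrappedQuery(pushdownQuery))
```

Because the read path always produces a SELECT, a DELETE has to go through a plain JDBC connection instead.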

I haven't found a way to use Spark to delete rows from Azure SQL, but I have implemented my own function in Scala using Java libraries:

import java.util.Properties
import java.sql.Connection
import java.sql.DatabaseMetaData
import java.sql.DriverManager
import java.sql.SQLException
import java.sql.Date
import java.time.LocalDate

  
// Set credentials
var jdbcUsername = "X"
var jdbcPassword = dbutils.secrets.get("X", "Y")

// Check that the JDBC driver is available
Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver")

// Create the JDBC URL
var jdbcHostname = "X"
var jdbcPort = 1433
var jdbcDatabase = "X"
val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${jdbcDatabase};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30"

def delete_dump_date(table_name: String, dump_date: String): Unit = {

  val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
  var connObj: Connection = null
  try {
      Class.forName(driverClass)
      connObj = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
      // table_name is formatted into the SQL text, so it must come from a trusted source;
      // only dump_date is bound as a parameter
      val statement = connObj.prepareStatement(String.format("DELETE FROM %s WHERE dump_date=?", table_name))
      try {
          statement.setDate(1, Date.valueOf(LocalDate.parse(dump_date)))
          val number_of_rows_deleted = statement.executeUpdate()
          // Report the count only when the DELETE actually ran
          println(number_of_rows_deleted + " rows deleted.")
      }
      finally {
          statement.close()
      }
  }
  catch {
      case e: SQLException => e.printStackTrace()
  }
  finally {
      // connObj is still null if getConnection failed
      if (connObj != null) connObj.close()
  }
}

And you can call the function:

delete_dump_date(table_name, "2020-01-07")
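
Since a table name cannot be bound as a ? parameter, the function above formats it straight into the SQL text. If the name can come from outside the notebook, one option is to validate and quote it first. The following is a minimal sketch under that assumption; identifierRegex, quoteTableName and deleteSql are hypothetical helpers that are not part of the original function, and the accepted name format (table or schema.table made of letters, digits and underscores) is an assumption that may be too strict for some schemas.

```scala
// Hypothetical helpers, not from the original answer.

// Accepts "table" or "schema.table" made of letters, digits and underscores.
val identifierRegex = "[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)?"

def quoteTableName(tableName: String): String = {
  // String.matches requires the whole string to match the pattern
  require(tableName.matches(identifierRegex), s"Unexpected table name: $tableName")
  // Bracket-quote each part, T-SQL style: dbo.events becomes [dbo].[events]
  tableName.split('.').map(part => s"[$part]").mkString(".")
}

// Builds the statement text; dump_date stays a bound parameter.
def deleteSql(tableName: String): String =
  s"DELETE FROM ${quoteTableName(tableName)} WHERE dump_date = ?"
```

The string returned by deleteSql(table_name) could then be passed to connObj.prepareStatement in place of the String.format call.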
