I am using Azure Databricks with Scala, and my goal is to delete some rows from an Azure SQL table.
To achieve this, I am using a pushdown query with JDBC as follows:
val pushdown_query = s"(DELETE FROM ${table_name} WHERE dump_date = '2020-01-07') temp"
val res = spark.read.jdbc(jdbcUrl, pushdown_query, connectionProperties)
However, I am getting the following error:
com.microsoft.sqlserver.jdbc.SQLServerException: A nested INSERT, UPDATE, DELETE, or MERGE statement must have an OUTPUT clause.
I added an OUTPUT clause to the pushdown query to solve this:
val pushdown_query = s"(DELETE FROM ${table_name} OUTPUT DELETED.dump_date WHERE dump_date = '2020-01-07') temp"
But now I am getting the following error:
com.microsoft.sqlserver.jdbc.SQLServerException: A nested INSERT, UPDATE, DELETE, or MERGE statement is not allowed in a SELECT statement that is not the immediate source of rows for an INSERT statement.
What am I doing wrong? How can I achieve this? Is there a better way?
Thanks in advance.
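On what goes wrong: `spark.read.jdbc` treats its table argument as something to select from, so the string ends up inside the FROM clause of a SELECT that Spark generates. SQL Server rejects a DELETE in that position, with or without an OUTPUT clause, which matches the two errors above. A minimal sketch of the effect follows. `wrapAsSubquery` is a hypothetical helper that only approximates the statement Spark builds (the real column list and any WHERE clause may differ), and `my_table` is an assumed table name.

```scala
// Hypothetical helper, for illustration only: approximates how a JDBC reader
// embeds the "table" string in the FROM clause of a SELECT.
// This is not Spark's actual code.
def wrapAsSubquery(dbtable: String): String =
  s"SELECT * FROM $dbtable"

// Assumed table name, used only for this illustration.
val table_name = "my_table"

// The pushdown string from the question.
val pushdown_query =
  s"(DELETE FROM ${table_name} WHERE dump_date = '2020-01-07') temp"

// The DELETE is now nested inside a SELECT, which SQL Server does not allow.
val wrapped = wrapAsSubquery(pushdown_query)
```

Since the reader API only ever issues SELECT statements, a data-modifying statement has to be sent over a plain JDBC connection instead.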
I haven't found a way to use Spark to delete rows from Azure SQL, but I have implemented my own function in Scala using Java libraries:
import java.util.Properties
import java.sql.Connection
import java.sql.DatabaseMetaData
import java.sql.DriverManager
import java.sql.SQLException
import java.sql.Date
import java.time.LocalDate
// Set credentials
var jdbcUsername = "X"
var jdbcPassword = dbutils.secrets.get("X", "Y")
// Check that the JDBC driver is available
Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver")
// Create the JDBC URL
var jdbcHostname = "X"
var jdbcPort = 1433
var jdbcDatabase = "X"
val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${jdbcDatabase};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30"
def delete_dump_date(table_name: String, dump_date: String): Unit = {
  val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
  var connObj: Connection = null
  try {
    Class.forName(driverClass)
    connObj = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
    // A table name cannot be bound as a parameter, so it is formatted into the SQL text;
    // only the date value is passed as a bind parameter.
    val statement = connObj.prepareStatement(String.format("DELETE FROM %s WHERE dump_date=?", table_name))
    try {
      statement.setDate(1, Date.valueOf(LocalDate.parse(dump_date)))
      val number_of_rows_deleted = statement.executeUpdate()
      // Report the count only when the DELETE actually ran
      println(s"$number_of_rows_deleted rows deleted.")
    } finally {
      statement.close()
    }
  } catch {
    case e: SQLException => e.printStackTrace()
  } finally {
    // connObj is still null if the connection attempt failed
    if (connObj != null) connObj.close()
  }
}
You can then call the function like this:
delete_dump_date(table_name, "2020-01-07")
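One caveat: because a table name cannot be passed as a `?` parameter, `delete_dump_date` places it directly into the SQL text. If the name can come from outside your own code, it may be worth checking it first. Below is a minimal sketch, assuming table names are plain one- or two-part identifiers such as `my_table` or `dbo.my_table`. `requireSafeTableName` is a hypothetical helper, not part of JDBC or Spark.

```scala
// Hypothetical helper: accepts only names like "my_table" or "dbo.my_table"
// and throws IllegalArgumentException for anything else.
def requireSafeTableName(name: String): String = {
  // One identifier, optionally followed by a dot and a second identifier.
  val pattern = "^[A-Za-z_][A-Za-z0-9_]*(\\.[A-Za-z_][A-Za-z0-9_]*)?$"
  require(name.matches(pattern), s"Unsafe table name: $name")
  name
}
```

It could then be used as `delete_dump_date(requireSafeTableName(table_name), "2020-01-07")`. Names that need bracket quoting or contain other characters would be rejected by this pattern, so adjust it to your own naming rules.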