How to close a shared singleton connection between cores in a Spark executor
I am sharing a single connection between all cores of one Spark executor. I created a singleton connection object so that every core on the executor uses the same connection and there is only one connection per executor.
import java.sql.{Connection, DriverManager}

object SingletonConnection {
  // One connection per JVM, i.e. per executor; created lazily on first use.
  private var connection: Connection = null

  def getConnection(url: String, username: String, password: String): Connection = synchronized {
    if (connection == null) {
      connection = DriverManager.getConnection(url, username, password)
    }
    connection
  }
}
Spark executor code:
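import java.sql.{BatchUpdateException, SQLException}

dataFrame.foreachPartition { batch =>
  if (batch.nonEmpty) {
    val dbc = SingletonConnection.getConnection(url, user, password)
    // `sql` is a placeholder; the original post does not show how `st` is created.
    val st = dbc.prepareStatement(sql)
    try {
      batch.foreach { row =>
        // do some operations (bind the row's values to st)
        st.addBatch()
      }
      st.executeBatch()
    } catch {
      case exec: BatchUpdateException =>
        // Print the whole chain of batch errors, then fail the task.
        var ex: SQLException = exec
        while (ex != null) {
          ex.printStackTrace()
          ex = ex.getNextException
        }
        throw exec
    }
  }
}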
The problem is that I cannot close the connection, because I do not know when a particular core finishes its work. If I close the connection in a finally block, then as soon as one core finishes its task it closes the shared connection, and every other core that is still using it fails.
Since I am not closing the connection here, it remains open even after the tasks have finished. How can I make this work so that the connection is closed only after all cores have finished their tasks?
I implemented this in Java, so I can only give you some pointers.
In the SingletonConnection class I created a thread-safe counter (accumulator). Each time a task opens the connection, the counter is incremented by one. Each time a task is about to close the connection, the counter is decremented by one and then checked; the connection is actually closed only when the counter reaches zero.
This will not close the connection while other running threads are still using it. However, it can create more connections than you expect (up to the number of partitions), because whenever the counter drops to zero between tasks, the next task opens a new connection.
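The counting scheme described in the answer can be sketched in Scala roughly as follows. This is only an illustration under assumptions, not the answerer's Java code: `RefCounted`, `acquire`, `release`, and `SharedConnection` are hypothetical names introduced here, and the holder is written generically so that it can be tried without a database.

```scala
import java.sql.Connection

// Hypothetical helper (not part of Spark or JDBC): keeps at most one shared
// resource per JVM and closes it when the last user releases it.
class RefCounted[A](close: A => Unit) {
  private var resource: Option[A] = None
  private var users = 0

  /** Borrow the shared resource; `open` is evaluated only if none is currently open. */
  def acquire(open: => A): A = synchronized {
    val r = resource.getOrElse {
      val created = open
      resource = Some(created)
      created
    }
    users += 1
    r
  }

  /** Give the resource back; the last user to release it triggers `close`. */
  def release(): Unit = synchronized {
    if (users > 0) {
      users -= 1
      if (users == 0) {
        resource.foreach(close)
        resource = None
      }
    }
  }
}

// Assumed wiring for JDBC: one reference-counted connection per executor JVM.
object SharedConnection extends RefCounted[Connection](_.close())
```

Under these assumptions, each partition would call `SharedConnection.acquire(DriverManager.getConnection(url, user, password))` before its batch work and `SharedConnection.release()` in a `finally` block. If the count reaches zero between two tasks, the next `acquire` opens a fresh connection, which is the extra-connections caveat mentioned in the answer.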