Preventing deadlocks in SQL Server

Question

I have an application connected to a SQL Server 2014 database that combines several rows into one. There are no other connections to this database while the application is running.

First, I select a chunk of rows within a specific time span. This query uses a non-clustered index seek (on the TIME column) combined with a clustered index lookup.

select ...
from FOO
where TIME >= @from and TIME < @to and ...

Then we process these rows in C# and write the changes as a single update and multiple deletes; this happens many times per chunk. These statements also use non-clustered index seeks.

begin tran

update FOO set ...
where NON_CLUSTERED_ID = @id

delete FOO where NON_CLUSTERED_ID in (@id1, @id2, @id3, ...)

commit

I am getting deadlocks when running this with multiple parallel chunks. I tried using ROWLOCK for the update and delete, but for some reason that caused even more deadlocks than before, even though there are no overlaps between chunks.

Then I tried TABLOCKX, HOLDLOCK on the update, but that means I can't perform my select in parallel, so I lose the advantages of parallelism.

Any idea how I can avoid deadlocks but still process multiple parallel chunks?

Would it be safe to use NOLOCK on my select in this case, given there is no row overlap between chunks? Then TABLOCKX, HOLDLOCK would only block the update and delete, correct?

Or should I just accept that deadlocks will happen and retry the query in my application?

UPDATE (additional information): All deadlocks so far have happened in the update and delete phase, none in the select. I'll try to get some deadlock logs up if I can't get this solved today (the correct trace flags weren't enabled before).

UPDATE: These are the two arrangements of deadlocks that occur with ROWLOCK; they both refer only to the delete statement and the non-clustered index it uses. I'm not sure if these are the same as the deadlocks that occur without any table hints, as I wasn't able to reproduce any of those.

Ask if you need anything else from the .xdl; I'm a bit wary of attaching the whole thing.

Answer 1

The general advice regarding deadlocks: make sure you do everything in the same order, i.e. acquire locks in the same order in every process.

You can find the same advice in this technical article on microsoft.com regarding Minimizing Deadlocks. There's a good reason it is listed first.

Access objects in the same order.

Avoid user interaction in transactions.

Keep transactions short and in one batch.

Use a lower isolation level.

Use a row versioning-based isolation level.

Set READ_COMMITTED_SNAPSHOT database option ON to enable read-committed transactions to use row versioning.

Use snapshot isolation.

Use bound connections.
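
As a rough sketch of the two row-versioning items in the list above, the options could be enabled as follows. The database name MyDatabase is a placeholder, not something from the question, and WITH ROLLBACK IMMEDIATE is only one way to get the exclusive access that the first statement needs. Note that row versioning removes blocking between readers and writers; it does not stop two writers from blocking each other, so it may not help with deadlocks that involve only the delete statements.

```sql
-- MyDatabase is a hypothetical name; replace it with your own database.

-- Let READ COMMITTED transactions read row versions instead of taking shared locks.
-- This needs exclusive access to the database; ROLLBACK IMMEDIATE rolls back
-- other sessions' open transactions, so run it in a maintenance window.
ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- Allow sessions to request SNAPSHOT isolation explicitly.
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- A session then opts in to snapshot isolation like this:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
```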

Update after question from Cato:

How would acquiring locks in the same order apply here? Have you got any advice on how he would change his SQL to do that?

Deadlocks are always the same, no matter the environment: two processes (say A and B) acquire multiple locks (say X and Y) in a different order, so that A is waiting for Y and B is waiting for X while A is holding X and B is holding Y.
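
To make the A/B/X/Y pattern concrete, here is a minimal reproduction sketch. The table dbo.DeadlockDemo and its columns are made up for illustration and have nothing to do with FOO; the two sessions must be run in two separate connections, interleaved in the order given by the step comments.

```sql
-- Setup (hypothetical table, for illustration only).
CREATE TABLE dbo.DeadlockDemo (Id int PRIMARY KEY, Val int NOT NULL);
INSERT INTO dbo.DeadlockDemo (Id, Val) VALUES (1, 0), (2, 0);

-- Session A (first connection)
BEGIN TRANSACTION;
UPDATE dbo.DeadlockDemo SET Val = 1 WHERE Id = 1;  -- step 1: A locks row 1
UPDATE dbo.DeadlockDemo SET Val = 1 WHERE Id = 2;  -- step 3: A waits for row 2
COMMIT TRANSACTION;

-- Session B (second connection)
BEGIN TRANSACTION;
UPDATE dbo.DeadlockDemo SET Val = 2 WHERE Id = 2;  -- step 2: B locks row 2
UPDATE dbo.DeadlockDemo SET Val = 2 WHERE Id = 1;  -- step 4: B waits for row 1
COMMIT TRANSACTION;

-- After step 4 neither session can proceed. SQL Server picks one of them as
-- the deadlock victim and rolls it back with error 1205.
```

If both sessions touched the rows in the same order (row 1, then row 2), the second session would simply wait for the first to commit instead of deadlocking.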

It applies here because DELETE and UPDATE statements implicitly acquire locks on rows, index ranges, or the whole table (depending on what the engine deems appropriate).

You should analyze your process and see if there are scenarios where locks could be acquired in a different order. If that doesn't reveal anything, you can analyze deadlocks using the SQL Server Profiler:

To trace deadlock events, add the Deadlock graph event class to a trace. This event class populates the TextData data column in the trace with XML data about the process and objects that are involved in the deadlock. SQL Server Profiler can extract the XML document to a deadlock XML (.xdl) file which you can view later in SQL Server Management Studio. You can configure SQL Server Profiler to extract Deadlock graph events to a single file that contains all Deadlock graph events, or to separate files.
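
Since the question mentions trace flags, an alternative to Profiler is to have the engine write deadlock details to the SQL Server error log. This is a minimal sketch, assuming you have sysadmin rights on the instance; a flag set this way does not survive a restart.

```sql
-- Write deadlock information (in an XML-like format) to the error log
-- for all sessions. The -1 argument makes the flag global.
DBCC TRACEON (1222, -1);

-- Check whether the flag is currently active.
DBCC TRACESTATUS (1222);

-- Turn it off again once the deadlocks have been captured.
DBCC TRACEOFF (1222, -1);
```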

Answer 2

I'd use sp_getapplock in the updating transaction to prevent multiple instances of this code from running in parallel. Unlike table locking hints, this will not block the selecting statement.

You should still program the retry logic, because acquiring the lock may take longer than the timeout parameter allows.

This is how the updating transaction can be wrapped in sp_getapplock.

BEGIN TRANSACTION;
BEGIN TRY

    DECLARE @VarLockResult int;
    EXEC @VarLockResult = sp_getapplock
        @Resource = 'some_unique_name_app_lock',
        @LockMode = 'Exclusive',
        @LockOwner = 'Transaction',
        @LockTimeout = 60000,
        @DbPrincipal = 'public';

    IF @VarLockResult >= 0
    BEGIN
        -- Acquired the lock
        update FOO set ...
        where NON_CLUSTERED_ID = @id

        delete FOO where NON_CLUSTERED_ID in (@id1, @id2, @id3, ...)

    END ELSE BEGIN
        -- return some error code, so that the caller could retry
    END;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    ROLLBACK TRANSACTION;
    -- handle the error
END CATCH;

The selecting statement doesn't need any changes.

I would recommend against NOLOCK, even though you say that the IDs in the chunks do not overlap. With this hint, the SELECT query can skip some pages that are being changed, or it can read some pages twice. It is unlikely that such behavior can be tolerated.
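
The retry logic mentioned earlier in this answer could live in the C# caller or in T-SQL. Below is a T-SQL sketch only; the attempt limit, the variable names, and the error number 50001 are my own assumptions, and the update and delete from the question go where the placeholder comment is. sp_getapplock returns 0 or 1 when the lock is granted and a negative value on timeout, cancellation, or when the request is chosen as a deadlock victim.

```sql
DECLARE @Attempt int = 1;
DECLARE @MaxAttempts int = 3;   -- hypothetical limit, tune as needed
DECLARE @LockResult int;
DECLARE @Done bit = 0;

WHILE @Attempt <= @MaxAttempts AND @Done = 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

        EXEC @LockResult = sp_getapplock
            @Resource = 'some_unique_name_app_lock',
            @LockMode = 'Exclusive',
            @LockOwner = 'Transaction',
            @LockTimeout = 60000;

        IF @LockResult >= 0
        BEGIN
            -- Lock acquired: the update and delete statements go here.
            SET @Done = 1;
        END;

        -- Committing also releases a transaction-owned application lock.
        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;

        -- 1205 is the deadlock victim error. Rethrow anything else,
        -- and rethrow the deadlock too once the attempts are used up.
        IF ERROR_NUMBER() <> 1205 OR @Attempt = @MaxAttempts
        BEGIN
            THROW;
        END;
    END CATCH;

    SET @Attempt += 1;
END;

-- Every attempt timed out waiting for the lock: report it to the caller.
IF @Done = 0
    THROW 50001, 'Could not acquire the application lock.', 1;
```

Doing the retry in the application instead has the advantage that the caller can back off for a while between attempts without holding a connection busy.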

Answer 3

Use sp_getapplock in your code in the following format. The stored procedure sp_getapplock places a lock on an application resource.

EXEC sp_getapplock @Resource = 'storeprocedurename', @LockMode = 'Exclusive', @LockOwner = 'Transaction', @LockTimeout = 25000

It is very helpful. Consider increasing @LockTimeout so that callers waiting for the lock are less likely to time out.

Answer 1
3 2016-11-14 09:36:03

Answer 2
2 2016-11-14 09:36:49

Answer 3
0 2022-06-07 04:13:22
