简体   繁体   中英

Is this a 'thread-safe' way to update and return a row in SQL from a queue?

I'm implementing a queue in SQL by having a table that has a 'claimedby' column. Any one of a number of processes can grab an item from this queue by populating the claimedby field.

What I want to do is prevent two processes grabbing the same item.

I've done some research and found a few possibilities of how to do this. I don't want to use the broker, I do want to stick with a simple table.

I am using C# to work on the item that is retrieved.

I've been playing around in SQL and written the following:

declare @test table (ID int);

UPDATE TOP (1) CatQueueItem SET ClaimedBy = 'Mittens'
OUTPUT inserted.CatQueueItemIdId into @test
WHERE ClaimedBy is null

This will be done as a single query in C# by adding the parameter as an output parameter, etc.

I'm wondering if this code will work or if I need to think about locking and transanctions to ensure that if multiple processes run this query simultaneously that it will function as expected (only one process will claim the item, the other updates will skip this row entirely due to it being updated by the first).

Any ideas?

While the query given will not give a row to multiple threads due to an implicit transaction that is created, it can cause issues where one thread manages to hog the queue. If you want to stop this you might want to add the READ PAST and ROW LOCK hints to the query. This prevents the lock that one thread gets from blocking the other threads from getting a row.

For example:

UPDATE TOP (1) CatQueueItem WITH (READPAST, ROWLOCK)
SET ClaimedBy = 'Mittens'
OUTPUT inserted.CatQueueItemIdId into @test
WHERE ClaimedBy is null

The Long Explanation

All statements in SQL Server run within the context of a Transaction. If a statement is run and there is no transaction specified SQL Server creates an implicit transaction just for that single statement.

There are three main types of locks that SQL Server uses Shared (S), Update (U) and Exclusive (X). Locks are acquired within the scope of a Transaction. If a transaction has an S lock on a row other transactions can get an S or U lock but not an X lock on the same row. If a transaction has a U lock on a row other transactions can only get S locks on that row. If a transaction has an X lock on a row no other transaction can get a lock on the row. In order to write to a row a transaction needs to have an X lock as this has the effect of blocking all other transactions from reading the row while it is part way through updating it.

This gives the following compatibility table:

    S  U  X
  ----------
S | Y  Y  N
U | Y  N  N
X | N  N  N

The update that is in the question has two parts to it. First it does a read of the table to find a row that has a null in ClaimedBy. Once it has found the row the second part of the operation updates the row that was found.

Normally when reading from tables SQL Server uses S locks as these don't prevent other transactions from also reading the rows and increases read performance, but it does stop other transactions getting X locks to write to the rows. The problem with this is that when the second part of the update query tries to upgrade to an X lock, so it can write to the row, it can cause a deadlock. The reason for this is that the first part of the query in another transaction may have acquired an S lock the same as your transaction but might not have upgraded it yet. This prevents your transaction upgrading its lock to an X lock and your S lock prevents the other transaction upgrading either. Neither of the transactions will be able to succeed so they deadlock. In this case SQL Server picks one transaction and rolls it back.

To stop the deadlock happening, when doing the read part of an update statement SQL Server uses U locks. U locks allow other transactions to acquire S locks allowing those that are doing reads to succeed, but it does not allow other U locks. By using a U lock you are saying that you are only reading, but you have the intention of writing at some point in the future. This prevents the situation where you have two transactions that are both trying to upgrade to an X lock. As such the transaction that has the U lock can upgrade to X safe in the knowledge that it won't deadlock with another transaction by doing so.

The relevance of all of this to the scenario given in the question is that the U locks used by one thread's transaction to lock rows while searching for an available row block out all the other thread's transactions. This is because when a transaction tries to acquire a lock on a row that already has an incompatible lock on it, it simply waits in a queue until the blocking lock is unlocked. Because all the threads are searching the same rows for a free one they all try to get U locks on the same row and all form up into an orderly queue waiting to get a U lock on the same row. In other words only one thread's transaction is allowed to look for free rows at a time.

What the READPAST table hint does is it stops the transactions queueing to read rows in the table. With READPAST when a transaction tires to acquire a lock on a row that is already locked instead of joining a queue for the lock it says stuff this and goes and tries the next row. In this case it will say I don't know if the row has a ClaimedBy value or not, I'm not prepared to wait to find out, so I will just assume that it does and try the next row. This could mean that it skips available rows but won't get an unavailable row. This will improve the speed at which the threads and their transactions can obtain items from the queue as they can all look for available rows at the same time.

Acquiring locks can potentially be quite expensive. It takes time and memory. In order to combat this issue SQL Server has several granularities for locks. You can lock the whole database, a whole table, a page of a table or a row of a table. The query optimizer will try to use statistics to predict how many rows need to be locked. If there are a lot, it will pick a page or table lock instead of a row lock. This has the effect of it requiring fewer locks overall.

The ROWLOCK table hint tells SQL Server to not use these coarser grained locks and only use row locks. This is advantages in this case because it stops big chunks of available rows being skipped by the transactions that are looking for available rows.

Every query in SQL Server operates within an implicit transaction. Since you are using a single statement here (not multiple queries) the engine will handle locking and blocking for you.

As long as you are handling the case where no record is updated in your C# code this should handle concurrency just fine.

Another option might be to use the UPDATE statement to update the table and capture the claimed ID into a variable, both in the same single statement.

This example will capture CatQueueItem.ID from the row being claimed into variable @ClaimedID , and update CatQueueItem.ClaimedBy in one atomic operation.

DECLARE @ClaimedId INT

UPDATE TOP (1) q SET
       @ClaimedId = q.ID,
       ClaimedBy = 'Mittens'
  FROM CatQueueItem q
 WHERE ClaimedBy IS NULL
  1. TOP (1) ensures we only take one item
  2. @ClaimedId = q.ID captures the ID of the item we are about to claim
  3. ClaimedBy = 'Mittens' marks the item as claimed
  4. WHERE ClaimedBy IS NULL restricts us to only unclaimed items

If there are no items available to claim, @ClaimedId will be NULL.

If you want to ensure items are claimed in the order they were inserted (a true queue), then you should add a clustered index on ID - although the index is not required for the queue to simply be atomic.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM