
Atomic compare and swap in a database

I am working on a work queueing solution. I want to query a given row in the database where a status column has a specific value, modify that value, and return the row. I want to do this atomically, so that no other query will see it:


begin transaction
select * from table where pk = x and status = y
update table set status = z where pk = x
commit transaction
--(the row would be returned)

It must be impossible for two or more concurrent queries to return the row (only one query execution would see the row while its status = y), sort of like an interlocked CompareAndExchange operation.

I know the code above runs (on SQL Server), but will the swap always be atomic?

I need a solution that will work for both SQL Server and Oracle.

Is pk the primary key? Then this is a non-issue: if you already know the primary key, there is no sport. If pk is the primary key, then this raises the obvious question: how do you know the pk of the item to dequeue?

The problem is if you don't know the primary key and want to dequeue the next 'available' item (i.e. status = y) and mark it as dequeued (delete it or set status = z).

The proper way to do this is to use a single statement. Unfortunately, the syntax differs between Oracle and SQL Server. The SQL Server syntax is:

update top (1) [<table>]
set status = z 
output DELETED.*
where  status = y;

I'm not familiar enough with Oracle's RETURNING clause to give an example similar to SQL Server's OUTPUT one.

Other SQL Server solutions require lock hints on the SELECT (WITH (UPDLOCK)) to be correct. In Oracle the preferred avenue is to use FOR UPDATE, but that does not work in SQL Server, since there FOR UPDATE is meant to be used in conjunction with cursors.

In any case, the behavior you have in the original post is incorrect. Multiple sessions can all select the same row(s) and even all update them, returning the same dequeued item(s) to multiple readers.

I have some applications that follow a similar pattern. There is a table like yours that represents a queue of work. The table has two extra columns: thread_id and thread_date. When the app asks for work from the queue, it submits a thread id. Then a single update statement updates all applicable rows, setting the thread id column to the submitted id and the thread date column to the current time. After that update, it selects all rows with that thread id. This way you don't need to declare an explicit transaction. The "locking" occurs in the initial update.

The thread_date column is used to ensure that you do not end up with orphaned work items. What happens if items are pulled from the queue and then your app crashes? You have to have the ability to try those work items again. So you might grab all items off the queue that have not been marked completed but have been assigned to a thread with a thread date in the distant past. It's up to you to define "distant."
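
The claim-then-select pattern above could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server or Oracle, and the table name work_queue, the done column, the claim_work helper, and the five-minute staleness cutoff are all hypothetical, not taken from the applications described above.

```python
import sqlite3
import time

# Hypothetical definition of "distant": claims older than this are retried.
STALE_AFTER_SECONDS = 300


def make_queue(item_count):
    """Create an in-memory queue table with `item_count` unclaimed items."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE work_queue ("
        " pk INTEGER PRIMARY KEY,"
        " done INTEGER NOT NULL DEFAULT 0,"
        " thread_id TEXT,"
        " thread_date REAL)"
    )
    conn.executemany("INSERT INTO work_queue (done) VALUES (?)", [(0,)] * item_count)
    conn.commit()
    return conn


def claim_work(conn, thread_id, now=None):
    """Claim unfinished items that are unclaimed or stale, then list this thread's items."""
    now = time.time() if now is None else now
    cutoff = now - STALE_AFTER_SECONDS
    # The single UPDATE does the "locking": it stamps the rows with our thread id.
    conn.execute(
        "UPDATE work_queue SET thread_id = ?, thread_date = ? "
        "WHERE done = 0 AND (thread_id IS NULL OR thread_date < ?)",
        (thread_id, now, cutoff),
    )
    conn.commit()
    # Afterwards, select only the rows stamped with our thread id.
    rows = conn.execute(
        "SELECT pk FROM work_queue WHERE thread_id = ? AND done = 0 ORDER BY pk",
        (thread_id,),
    ).fetchall()
    return [pk for (pk,) in rows]


if __name__ == "__main__":
    conn = make_queue(3)
    print(claim_work(conn, "worker-a"))  # first caller claims the items
    print(claim_work(conn, "worker-b"))  # nothing is left for the second caller
```

Passing now explicitly is only there to make the staleness rule easy to exercise; a real worker would presumably rely on the database clock instead.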

As a general rule, to make an operation like this atomic you'll need to ensure that you take an exclusive (or update) lock when you perform the select, so that no other transaction can read the row before your update.

The typical syntax for this is something like:

 select * from table where pk = x and status = y for update

but you'd need to look it up to be sure.
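
To illustrate the "lock before you read" idea in something runnable, here is a rough sketch using Python's built-in sqlite3 module. It is an analogy, not the SQL Server or Oracle syntax: SQLite has no FOR UPDATE, so the sketch uses BEGIN IMMEDIATE, which takes a database-wide write lock rather than a row lock. The work table and the open_conn, make_db, and take_if_status helpers are hypothetical names made up for this example.

```python
import os
import sqlite3
import tempfile


def open_conn(path):
    # isolation_level=None: we issue BEGIN/COMMIT ourselves.
    # timeout=0: fail immediately instead of waiting for a lock.
    return sqlite3.connect(path, timeout=0, isolation_level=None)


def make_db(path):
    """Create a one-row table (pk = 1, status = 'y') and return a connection."""
    conn = open_conn(path)
    conn.execute("CREATE TABLE work (pk INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO work (pk, status) VALUES (1, 'y')")
    return conn


def take_if_status(conn, pk, expected, new):
    """Acquire the write lock first, then select and update in one transaction."""
    conn.execute("BEGIN IMMEDIATE")  # lock is held before the SELECT runs
    try:
        row = conn.execute(
            "SELECT pk, status FROM work WHERE pk = ? AND status = ?",
            (pk, expected),
        ).fetchone()
        if row is not None:
            conn.execute("UPDATE work SET status = ? WHERE pk = ?", (new, pk))
        conn.execute("COMMIT")
        return row  # the row as it was when we held the lock, or None
    except Exception:
        conn.execute("ROLLBACK")
        raise


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "queue.db")
    first, second = make_db(path), open_conn(path)
    print(take_if_status(first, 1, "y", "z"))   # this caller gets the row
    print(take_if_status(second, 1, "y", "z"))  # this caller gets None
```

While one connection holds the lock, a second BEGIN IMMEDIATE on another connection fails with sqlite3.OperationalError instead of reading the row, which is the property the select-with-lock approach relies on.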

Try this. The validation is in the UPDATE statement.

Code

IF EXISTS (SELECT * FROM sys.tables WHERE name = 't1')
    DROP TABLE dbo.t1
GO
CREATE TABLE dbo.t1 (
    ColID       int         IDENTITY,
    [Status]    varchar(20)
)
GO

DECLARE @id             int
DECLARE @initialValue   varchar(20)
DECLARE @newValue       varchar(20)

SET @initialValue = 'Initial Value'

INSERT INTO dbo.t1 (Status) VALUES (@initialValue)
SELECT @id = SCOPE_IDENTITY()

SET @newValue = 'Updated Value'

BEGIN TRAN

UPDATE dbo.t1
SET
    @initialValue = [Status],
    [Status]      = @newValue
WHERE ColID    = @id
  AND [Status] = @initialValue

SELECT ColID, [Status] FROM dbo.t1

COMMIT TRAN

SELECT @initialValue AS '@initialValue', @newValue AS '@newValue'

Results

ColID Status
----- -------------
    1 Updated Value

@initialValue @newValue
------------- -------------
Initial Value Updated Value
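
From application code, the same "validate inside the UPDATE" idea can be checked by looking at the affected row count. The sketch below is a minimal illustration using Python's built-in sqlite3 module in place of SQL Server; the compare_and_swap and make_demo_db helpers are hypothetical names, and only the table and column names are borrowed from the script above.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory t1 table with one row in its initial state."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (ColID INTEGER PRIMARY KEY, Status TEXT)")
    cur = conn.execute("INSERT INTO t1 (Status) VALUES ('Initial Value')")
    conn.commit()
    return conn, cur.lastrowid


def compare_and_swap(conn, row_id, expected, new):
    """Set Status to `new` only if it is still `expected`; True if this caller won."""
    cur = conn.execute(
        "UPDATE t1 SET Status = ? WHERE ColID = ? AND Status = ?",
        (new, row_id, expected),
    )
    conn.commit()
    # Exactly one affected row means the comparison held and the swap happened.
    return cur.rowcount == 1


if __name__ == "__main__":
    conn, row_id = make_demo_db()
    print(compare_and_swap(conn, row_id, "Initial Value", "Updated Value"))
    print(compare_and_swap(conn, row_id, "Initial Value", "Updated Value"))
```

The second call finds no row whose status still matches the expected value, so it changes nothing and reports that it lost the race.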
