
What will happen if two processes modify data in two transactions at the same time and there is a unique constraint on the table?

I am thinking about a race condition in a production system I am working on. The database is PostgreSQL. The application is written in Java, but this is not relevant.

There is a table called "versions", which contains the columns "entity_ID" and "version" (and some other fields). This table contains the versions of a certain entity.

There is an application where users can modify those entities.

Every modification of an entity adds a new version to the table "versions" (using a trigger). This trigger finds the last version in the same table "versions" and inserts a new row with the same entity_ID, but version = (last version + 1).
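To make the setup concrete, here is a minimal sketch of a trigger of this kind. Only the "versions" table and its entity_ID and version columns come from the description above; the `entities` table, its `payload` column, and the function and trigger names are assumptions made for illustration.

```sql
-- Hypothetical schema: only versions.entity_id and versions.version
-- are taken from the description; everything else is illustrative.
CREATE TABLE entities (
    id      integer PRIMARY KEY,
    payload text
);

CREATE TABLE versions (
    entity_id integer NOT NULL REFERENCES entities (id),
    version   integer NOT NULL
    -- other fields omitted; note that there is no unique constraint yet
);

CREATE FUNCTION add_version() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Find the last version visible to this transaction and insert last + 1.
    -- Rows inserted by other, still uncommitted transactions are not visible here.
    INSERT INTO versions (entity_id, version)
    SELECT NEW.id, COALESCE(max(v.version), 0) + 1
    FROM versions AS v
    WHERE v.entity_id = NEW.id;
    RETURN NEW;
END;
$$;

-- EXECUTE FUNCTION needs PostgreSQL 11 or later; older releases use EXECUTE PROCEDURE.
CREATE TRIGGER entities_add_version
AFTER INSERT OR UPDATE ON entities
FOR EACH ROW EXECUTE FUNCTION add_version();
```

Because the lookup only sees committed rows (plus the transaction's own), two concurrent transactions can both compute the same next version number.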

There is a nightly job that is run in PostgreSQL every 4:00 that also changes those entities and therefore updates data in the table "versions". This procedure was designed to finish its work by the morning (before users of the application start to use it), but unfortunately it runs into the day. As this procedure is run in a function, then it is one big transaction. Therefore the changes done by it are not visible to the application.

The nightly job uses the following workflow:

  • Set "failed_counter" = 0
  • Iterate over entities that need to be modified
  • Do modifications to the entity inside a BEGIN .. EXCEPTION .. END block
  • If there is an EXCEPTION, increase the "failed_counter". Log the exception and the failed entity to a log table.
  • If "failed_counter" > 10, cancel work.
  • End work
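The workflow above could look roughly like the following PL/pgSQL function. This is a sketch under assumptions, not the actual job: the tables `entities` and `job_log`, the function name `nightly_job`, and the placeholder UPDATE are all hypothetical; only "failed_counter" and the BEGIN .. EXCEPTION .. END structure come from the description.

```sql
-- Hypothetical objects for illustration only.
CREATE TABLE IF NOT EXISTS entities (id integer PRIMARY KEY, payload text);
CREATE TABLE IF NOT EXISTS job_log (entity_id integer, message text);

CREATE OR REPLACE FUNCTION nightly_job() RETURNS void
LANGUAGE plpgsql AS $$
DECLARE
    failed_counter integer := 0;
    ent record;
BEGIN
    -- The real job would select only the entities that need to be modified.
    FOR ent IN SELECT id FROM entities ORDER BY id LOOP
        BEGIN
            -- Placeholder for the real modification.
            UPDATE entities SET payload = upper(payload) WHERE id = ent.id;
        EXCEPTION WHEN OTHERS THEN
            -- Changes made inside this block are rolled back to an implicit
            -- savepoint; the surrounding transaction remains usable.
            failed_counter := failed_counter + 1;
            INSERT INTO job_log (entity_id, message) VALUES (ent.id, SQLERRM);
        END;

        IF failed_counter > 10 THEN
            EXIT;  -- "cancel work": in this sketch, simply stop iterating
        END IF;
    END LOOP;
END;
$$;

SELECT nightly_job();  -- the whole call runs inside a single transaction
```

Since a function call always executes within one transaction, none of the job's changes become visible to other sessions until that call finishes and the transaction commits.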

This has caused the following race condition to happen a few times (let's assume that X is the last version of entity A):

  1. Nightly job starts
  2. Nightly job modifies entity A, creating version X+1
  3. The application is also used to modify entity A, also creating version X+1 (because the nightly job transaction has not COMMITed, version X+1 is not visible to the application)
  4. Nightly job ends, causing COMMIT
  5. There are now two versions with version number X+1, which causes the application to break.

I thought that I could just solve the problem by using an UNIQUE CONSTRAINT over fields (entity_ID, version). I thought that it would cause the application to receive an error (due to violating the UNIQUE CONSTRAINT) at race condition step 3. But I am not sure how the unique constraint works in this situation. In race condition step 3, when the application adds a version, does the database check the UNIQUE CONSTRAINT? I suppose not, since the transaction of the nightly process has not been completed. If I am correct and the UNIQUE CONSTRAINT is checked only at race condition step 4, when the COMMIT is made, then this causes the whole nightly procedure to fail, which is not the desired result.

So, the question is the following.

  • When is the UNIQUE CONSTRAINT checked: at race condition step 3 or race condition step 4?
  • If the answer to the last question is "race condition step 4", then how could I change the design of the system to avoid the above-mentioned problems?

By default, unique constraints in PostgreSQL are checked when the statement runs, not at COMMIT (a constraint that is not declared DEFERRABLE is checked as each row is inserted or updated). It's easy to test the behavior using psql.
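One way to run that test is with two psql sessions against a scratch database. The table and constraint names below are made up for the experiment. Session 1 plays the nightly job and session 2 plays the application.

```sql
-- Setup, run once in either session. Names are illustrative.
CREATE TABLE versions_demo (
    entity_id integer NOT NULL,
    version   integer NOT NULL,
    CONSTRAINT versions_demo_entity_version_key UNIQUE (entity_id, version)
);
INSERT INTO versions_demo VALUES (1, 1);

-- Session 1 (nightly job): insert version 2, but do not commit yet.
BEGIN;
INSERT INTO versions_demo VALUES (1, 2);

-- Session 2 (application): try to insert the same key.
-- The statement does not return; it waits for session 1 to finish.
INSERT INTO versions_demo VALUES (1, 2);

-- Session 1: commit.
COMMIT;

-- Session 2 now fails with:
--   ERROR:  duplicate key value violates unique constraint "versions_demo_entity_version_key"
-- If session 1 had issued ROLLBACK instead, the insert in session 2 would have succeeded.
```

In terms of the question, the conflict is detected at step 3, but the application's statement blocks until step 4 decides the outcome. The transaction that inserted first is not the one that fails. If the order is reversed and the nightly job is the second inserter, it waits for the application's transaction and, if that commits, gets the error inside its BEGIN .. EXCEPTION .. END block.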

Some big red flags...

> As this procedure is run in a function, then it is one big transaction.

It's not one big transaction because you're running a function. It's one big transaction because you haven't run the function several times over smaller subsets of the data. Whether you can run the function over subsets is application-dependent.

> Iterate over entities that need to be modified

Rough rule of thumb for SQL databases: iteration is always a mistake.

SQL is a set-oriented language. Dealing with sets is usually faster than iteration, often by several orders of magnitude.
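As an illustration of the set-oriented style, the following single statement writes the next version for every entity at once, instead of looping and looking up the last version per entity. It is only a sketch: the `entities` table and its `id` column are assumptions, and the real job presumably does more than this.

```sql
-- Assumes hypothetical tables entities(id) and versions(entity_id, version).
INSERT INTO versions (entity_id, version)
SELECT e.id, COALESCE(max(v.version), 0) + 1
FROM entities AS e
LEFT JOIN versions AS v ON v.entity_id = e.id
GROUP BY e.id;
```

One trade-off to keep in mind: a single statement succeeds or fails as a whole, so the per-entity error counting described in the question would need a different approach.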

> If "failed_counter" > 10, cancel work.

This looks suspicious. Why are nine failures OK? Why are any failures OK?

> I thought that I could just solve the problem by using an UNIQUE CONSTRAINT over fields (entity_ID, version).

That you don't already have a unique constraint on those two columns is a big, waving red flag. Fix this first.

The fact that an application should apparently be waiting for a batch job to finish, but isn't waiting, might or might not be a system design issue. (It smells like a system design issue.)

> There is a nightly job that is run in PostgreSQL every 4:00 ...

Did you think of starting at 3:00?

Test this, but not on your production server.

  • Drop the trigger.
  • Add a column of type timestamp with time zone.
  • Set that column's default value. Most applications will use current_timestamp, but you might want clock_timestamp() instead (see the docs).
  • Add a unique constraint on {entity_id, new timestamp column}.
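On a test copy of the database, those steps might look like the sketch below. The trigger name `entities_add_version`, the table `entities`, the column name `created_at`, and the constraint name are assumptions; substitute the real names.

```sql
-- Hypothetical trigger and table names.
DROP TRIGGER IF EXISTS entities_add_version ON entities;

-- Add the column without a default first, so existing rows keep NULL.
-- A unique constraint treats NULLs as distinct, so old rows do not conflict.
ALTER TABLE versions
    ADD COLUMN created_at timestamp with time zone;

-- New rows get the wall-clock time of the insert.
ALTER TABLE versions
    ALTER COLUMN created_at SET DEFAULT clock_timestamp();

ALTER TABLE versions
    ADD CONSTRAINT versions_entity_created_at_key UNIQUE (entity_id, created_at);
```

The choice of default matters here: current_timestamp returns the start time of the current transaction, so two rows inserted for the same entity within one long transaction would get the same value and violate the constraint, whereas clock_timestamp() advances during the transaction.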

Eliminating the trigger might speed things up enough for you.
