Why does PostgreSQL think there is a conflict between the two serializable transactions?

I'm trying to figure out how the serializable isolation level in PostgreSQL works. In theory, and according to PostgreSQL's own documentation, PostgreSQL should be able to detect serialization conflicts and automatically roll back the offending transactions. Yet when I experimented with the serializable isolation level myself, I ran into a lot of false positives and started to doubt either my understanding of serializability or PostgreSQL's implementation of it. Below is one of the simplest examples of such a false positive:

create table mytab(
    class integer,
    value integer not null
);

create index mytab_class_idx on mytab (class);

insert into mytab (class, value) values (1, 10);
insert into mytab (class, value) values (1, 20);
insert into mytab (class, value) values (2, 100);
insert into mytab (class, value) values (2, 200);

The table contains the following data:

 class | value
-------+-------
     1 |    10
     1 |    20
     2 |   100
     2 |   200

Then I run two concurrent transactions. The step n comments in the code show the order in which I execute the statements. Following the advice from https://stackoverflow.com/a/42303225/3249257, I explicitly disabled sequential scans to force PostgreSQL to use the index:

SET enable_seqscan=off;
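
The transaction snippets that follow open with a plain begin, which on its own starts a transaction at the session's default isolation level (read committed unless configured otherwise). The question does not show how serializable was selected, so the following is only a sketch of the two standard ways to do it, not a record of what was actually run:

```sql
-- Option 1: make serializable the default for every transaction in this session.
set default_transaction_isolation = 'serializable';

-- Option 2: request it for a single transaction when opening it.
begin isolation level serializable;
show transaction_isolation;  -- reports the isolation level of the current transaction
rollback;
```

Either option has to be in effect in both sessions for the behaviour described here to be reproducible.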

Transaction A:

begin; -- step 1
select sum(value) from mytab where class = 1; -- step 2
insert into mytab(class, value) values (3, 30); -- step 5
commit; -- step 7

Transaction B:

begin; -- step 3
select sum(value) from mytab where class = 2; -- step 4
insert into mytab(class, value) values (4, 300); -- step 6
commit; -- step 8

As I understand it, there shouldn't be any conflict between these two transactions. They don't touch the same rows. However, when I commit the second transaction, it fails with this error:

[40001] ERROR: could not serialize access due to read/write dependencies among transactions
Detail: Reason code: Canceled on identification as a pivot, during commit attempt.
Hint: The transaction might succeed if retried.

What's going on here? Is my understanding of the serializable isolation level flawed? Is it a failure of the PostgreSQL heuristics mentioned in this answer: https://stackoverflow.com/a/50809788/3249257?

I'm using PostgreSQL 11.5 on x86_64-apple-darwin18.6.0, compiled by Apple LLVM version 10.0.1 (clang-1001.0.46.4), 64-bit.

The problem here is with the predicate locks (SIReadLock) that PostgreSQL uses to determine whether there is a conflict between concurrent transactions. If you run the query below while the transactions are in progress, you will see these locks:

select relation::regclass, locktype, page, tuple, pid from pg_locks
where mode = 'SIReadLock';

In this case, the issue is with page locks on the mytab_class_idx index. If the concurrent transactions happen to acquire a lock on the same page of the mytab_class_idx relation, a serialization conflict occurs. If they acquire locks on different pages, they both commit successfully.
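
One way to check whether the two transactions really overlap on an index page is to look for pages on which more than one session holds an SIReadLock. The query below is a sketch: it assumes the index name mytab_class_idx from the question and that it is run from a third session while both transactions are still open.

```sql
-- List index pages that carry page-level predicate locks from more than one backend.
select page,
       count(distinct pid) as sessions,
       array_agg(distinct pid) as pids
from pg_locks
where mode = 'SIReadLock'
  and locktype = 'page'
  and relation = 'mytab_class_idx'::regclass
group by page
having count(distinct pid) > 1;
```

An empty result would suggest that the sessions hold their page locks on different pages of the index; any row returned identifies a shared page and the backends involved.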

If the table holds as little data as in the question above, the index entries for all rows fall on the same page, and a serialization conflict is inevitable. For large enough tables, serialization conflicts happen rarely, though not as rarely as they could.
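
To observe the effect of table size, the table can be filled with enough rows that the index spans many pages. The snippet below is only an illustration and not part of the original answer: the row count and the class range 1000 to 1999 are arbitrary assumptions chosen to spread the entries out.

```sql
-- Add 100,000 rows spread evenly over classes 1000..1999 (arbitrary illustrative values).
insert into mytab (class, value)
select 1000 + (i % 1000), i
from generate_series(1, 100000) as i;

-- Refresh planner statistics after the bulk load.
analyze mytab;

-- Rough number of pages the index now occupies.
select pg_relation_size('mytab_class_idx') / current_setting('block_size')::int as index_pages;
```

Classes 1 to 4 from the question would still sit next to each other at the start of the index, so a rerun of the experiment would need reads and inserts on classes that lie far apart. Whether a given pair of transactions still collides depends on which pages those reads and inserts actually touch.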
