PostgreSQL: generate a sequence with no gaps
I have to create unique IDs for invoices. I have a table id and another column for this unique number. I use the serializable isolation level. Using
var seq = @"SELECT invoice_serial + 1 FROM invoice WHERE ""type""=@type ORDER BY invoice_serial DESC LIMIT 1";
doesn't help, because even with FOR UPDATE it won't read the correct value at the serializable isolation level. The only solution seems to be adding some retry code.
Sequences do not generate gap-free sets of numbers, and there's really no way of making them do that because a rollback or error will "use" the sequence number.
I wrote up an article on this a while ago. It's directed at Oracle but is really about the fundamental principles of gap-free numbers, and I think the same applies here.
Well, it's happened again. Someone has asked how to implement a requirement to generate a gap-free series of numbers, and a swarm of nay-sayers have descended on them to say (and here I paraphrase slightly) that this will kill system performance, that it's rarely a valid requirement, that whoever wrote the requirement is an idiot, blah blah blah.
As I point out on the thread, it is sometimes a genuine legal requirement to generate gap-free series of numbers. Invoice numbers for the 2,000,000+ organisations in the UK that are VAT (sales tax) registered have such a requirement, and the reason for this is rather obvious: it makes it more difficult to hide the generation of revenue from tax authorities. I've seen comments that it is a requirement in Spain and Portugal, and I'd not be surprised if it was a requirement in many other countries.
So, if we accept that it is a valid requirement, under what circumstances are gap-free series of numbers a problem? Group-think would often have you believe that it always is, but in fact it is only a potential problem under very particular circumstances.
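Those circumstances are that the series must have no gaps, that multiple processes create the entities being numbered, and that each number must be generated at the moment its entity is created. If all of these requirements must be met then you have a point of serialisation in your application, and we'll discuss that in a moment.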
First let's talk about methods of implementing a series-of-numbers requirement if you can drop any one of those requirements.
If your series of numbers can have gaps (and you have multiple processes requiring instant generation of the number) then use an Oracle Sequence object. They are very high performance, and the situations in which gaps can be expected have been very well discussed. If it matters, it is not too challenging to minimise the number of values skipped: design the process to minimise the chance of a failure between generating the number and committing the transaction.
If you do not have multiple processes creating the entities (and you need a gap-free series of numbers that must be instantly generated), as might be the case with the batch generation of invoices, then you already have a point of serialisation. That in itself may not be a problem, and may be an efficient way of performing the required operation. Generating the gap-free numbers is rather trivial in this case. You can read the current maximum value and apply an incrementing value to every entity with a number of techniques. For example, if you are inserting a new batch of invoices into your invoice table from a temporary working table you might:
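insert into
  invoices
  (
  invoice#,
  ...)
with curr as (
  -- Coalesce needs a fallback value so the first batch starts from 0
  select Coalesce(Max(invoice#), 0) max_invoice#
  from invoices)
select
  -- rownum (Oracle) numbers the selected rows 1, 2, 3, ...
  curr.max_invoice#+rownum,
  ...
from
  -- curr must appear in the from clause to be referenced above
  curr,
  tmp_invoice
  ...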
Of course you would protect your process so that only one instance can run at a time (probably with DBMS_Lock if you're using Oracle), and protect the invoice# with a unique key constraint, and probably check for missing values with separate code if you really, really care.
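The "check for missing values with separate code" step could look something like the following. This is only a minimal sketch, not the article's own code: it uses Python's built-in sqlite3 module as a stand-in database, and the table name `invoices`, the column `invoice_no`, and the assumption that numbering starts at 1 are all hypothetical.

```python
import sqlite3


def find_missing_invoice_numbers(conn):
    """Return the invoice numbers absent between 1 and the current maximum.

    Assumes (hypothetically) a table invoices(invoice_no) numbered from 1.
    """
    rows = conn.execute(
        "SELECT invoice_no FROM invoices ORDER BY invoice_no"
    ).fetchall()
    present = {row[0] for row in rows}
    if not present:
        return []
    return [n for n in range(1, max(present) + 1) if n not in present]


# Demo data: 3 and 5 are deliberately left out.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (invoice_no INTEGER NOT NULL UNIQUE)")
conn.executemany(
    "INSERT INTO invoices (invoice_no) VALUES (?)", [(1,), (2,), (4,), (6,)]
)
missing = find_missing_invoice_numbers(conn)
print(missing)  # [3, 5]
```

For a large table you would probably push the comparison into SQL rather than load every number into memory; the version above just keeps the idea visible.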
If you do not need instant generation of the numbers (but you need them gap-free and multiple processes generate the entities) then you can allow the entities to be generated and the transaction committed, and then leave generation of the number to a single batch job. That job can be an update on the entity table, or an insert into a separate table.
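A rough sketch of that deferred-numbering idea, assuming the "update on the entity table" variant. It uses Python's built-in sqlite3 module purely so the example runs anywhere; the schema and the function names `create_invoice` and `assign_invoice_numbers` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE invoices (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           customer TEXT NOT NULL,
           invoice_no INTEGER UNIQUE  -- stays NULL until the batch job runs
       )"""
)


def create_invoice(conn, customer):
    """Called by any number of writers; never touches the numbering."""
    with conn:
        conn.execute("INSERT INTO invoices (customer) VALUES (?)", (customer,))


def assign_invoice_numbers(conn):
    """Single batch job: number all unnumbered invoices in creation order."""
    with conn:  # one transaction: the whole batch is numbered, or none of it
        (current_max,) = conn.execute(
            "SELECT COALESCE(MAX(invoice_no), 0) FROM invoices"
        ).fetchone()
        pending = conn.execute(
            "SELECT id FROM invoices WHERE invoice_no IS NULL ORDER BY id"
        ).fetchall()
        for offset, (invoice_id,) in enumerate(pending, start=1):
            conn.execute(
                "UPDATE invoices SET invoice_no = ? WHERE id = ?",
                (current_max + offset, invoice_id),
            )
    return len(pending)


for name in ("alice", "bob", "carol"):
    create_invoice(conn, name)
first_run = assign_invoice_numbers(conn)   # numbers 1..3
create_invoice(conn, "dave")
second_run = assign_invoice_numbers(conn)  # continues with 4
numbers = [r[0] for r in conn.execute("SELECT invoice_no FROM invoices ORDER BY id")]
print(numbers)  # [1, 2, 3, 4]
```

The point of the design is that only the batch job ever reads the maximum, so as long as a single instance of it runs at a time, the writers never contend over the number.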
So what if we need the trifecta of instant generation of a gap-free series of numbers by multiple processes? All we can do is to try to minimise the period of serialisation in the process, and I offer the following advice, and welcome any additional advice (or counter-advice of course).
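One commonly used way to keep the serialised section short is a counter row that is incremented in the same transaction as the insert, so a rollback gives the number back. The sketch below is not taken from the article; it is an illustration under assumptions, using Python's built-in sqlite3 module as a stand-in, with a hypothetical `invoice_counter` table and a `fail` flag that exists only to simulate an error. In PostgreSQL the UPDATE would take a row lock held until commit, and `UPDATE ... RETURNING` could fetch the new value in one statement.

```python
import sqlite3

# isolation_level=None: transactions are controlled explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript(
    """
    CREATE TABLE invoice_counter (
        type TEXT PRIMARY KEY,
        last_serial INTEGER NOT NULL
    );
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY,
        type TEXT NOT NULL,
        invoice_serial INTEGER NOT NULL,
        UNIQUE (type, invoice_serial)
    );
    INSERT INTO invoice_counter VALUES ('sale', 0);
    """
)


def create_invoice(conn, invoice_type, fail=False):
    """Take the next serial and insert the invoice in one transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        conn.execute(
            "UPDATE invoice_counter SET last_serial = last_serial + 1 "
            "WHERE type = ?",
            (invoice_type,),
        )
        (serial,) = conn.execute(
            "SELECT last_serial FROM invoice_counter WHERE type = ?",
            (invoice_type,),
        ).fetchone()
        if fail:  # hypothetical switch, only to simulate a mid-transaction error
            raise RuntimeError("simulated failure after taking a number")
        conn.execute(
            "INSERT INTO invoice (type, invoice_serial) VALUES (?, ?)",
            (invoice_type, serial),
        )
        conn.execute("COMMIT")
        return serial
    except Exception:
        conn.execute("ROLLBACK")  # the counter increment is undone as well
        raise


first = create_invoice(conn, "sale")
try:
    create_invoice(conn, "sale", fail=True)
except RuntimeError:
    pass
second = create_invoice(conn, "sale")
print(first, second)  # 1 2
```

Unlike a sequence, the failed attempt does not consume a number, at the cost that concurrent writers of the same type wait on each other until commit. Keeping the work between taking the number and committing as small as possible is what limits that wait.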
You either lock the table against inserts, and/or need to have retry code. There's no other option available. If you stop to think about what can happen, you'll see why.
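As a minimal sketch of what such retry code might look like: read max + 1, try the insert, and start over if a unique constraint reports that someone else got there first. This uses Python's built-in sqlite3 module as a stand-in; the helper name `insert_with_retry` and the `max_attempts` parameter are assumptions for illustration. Against PostgreSQL you would catch a unique violation (SQLSTATE 23505) or, at the serializable level, a serialization failure (SQLSTATE 40001) instead of `sqlite3.IntegrityError`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE invoice (
           id INTEGER PRIMARY KEY,
           type TEXT NOT NULL,
           invoice_serial INTEGER NOT NULL,
           UNIQUE (type, invoice_serial)  -- makes a duplicate number fail loudly
       )"""
)


def insert_with_retry(conn, invoice_type, max_attempts=5):
    """Insert an invoice with serial max + 1, retrying on a collision."""
    for attempt in range(1, max_attempts + 1):
        try:
            with conn:  # commits on success, rolls back on an exception
                (serial,) = conn.execute(
                    "SELECT COALESCE(MAX(invoice_serial), 0) + 1 "
                    "FROM invoice WHERE type = ?",
                    (invoice_type,),
                ).fetchone()
                conn.execute(
                    "INSERT INTO invoice (type, invoice_serial) VALUES (?, ?)",
                    (invoice_type, serial),
                )
            return serial
        except sqlite3.IntegrityError:
            # Another writer took this serial first; re-read and try again.
            if attempt == max_attempts:
                raise


sale_serials = [insert_with_retry(conn, "sale") for _ in range(3)]
refund_serial = insert_with_retry(conn, "refund")
print(sale_serials, refund_serial)  # [1, 2, 3] 1
```

With a single connection, as here, the retry branch is never exercised; it only matters once several sessions insert concurrently. The `COALESCE(MAX(...), 0)` form also covers the empty-table case, where the `ORDER BY ... LIMIT 1` query from the question returns no row at all.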
In 2006, someone posted a gapless-sequence solution to the PostgreSQL mailing list: http://www.postgresql.org/message-id/44E376F6.7010802@seaworthysys.com