
PostgreSQL: generate a sequence with no gaps

I have to create a unique ID for invoices. I have a table id and another column for this unique number. I use the serializable isolation level. Using

  var seq = @"SELECT invoice_serial + 1 FROM  invoice WHERE ""type""=@type ORDER BY invoice_serial DESC LIMIT 1";

doesn't help, because even with FOR UPDATE it won't read the correct value under the serializable isolation level.

The only solution seems to be to add some retry code.

Sequences do not generate gap-free sets of numbers, and there's really no way of making them do that because a rollback or error will "use" the sequence number.
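To see the effect for yourself, a minimal sketch along these lines can be run in psql. The sequence name `demo_seq` is made up for the illustration, and the returned values assume a freshly created sequence with default settings.

```sql
-- demo_seq is a hypothetical, freshly created sequence (starts at 1).
CREATE SEQUENCE demo_seq;

BEGIN;
SELECT nextval('demo_seq');   -- returns 1
ROLLBACK;                     -- the transaction is undone, the sequence is not

SELECT nextval('demo_seq');   -- returns 2: the value 1 is gone for good
```

The rollback undoes any rows the transaction wrote, but not the call to `nextval`, so the number 1 is never handed out again.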

I wrote up an article on this a while ago. It's directed at Oracle but is really about the fundamental principles of gap-free numbers, and I think the same applies here.

Well, it's happened again. Someone has asked how to implement a requirement to generate a gap-free series of numbers, and a swarm of nay-sayers has descended on them to say (and here I paraphrase slightly) that this will kill system performance, that it's rarely a valid requirement, that whoever wrote the requirement is an idiot, blah blah blah.

As I point out on the thread, it is sometimes a genuine legal requirement to generate gap-free series of numbers. Invoice numbers for the 2,000,000+ organisations in the UK that are VAT (sales tax) registered have such a requirement, and the reason for this is rather obvious: it makes it more difficult to hide the generation of revenue from tax authorities. I've seen comments that it is a requirement in Spain and Portugal, and I'd not be surprised if it were a requirement in many other countries.

So, if we accept that it is a valid requirement, under what circumstances are gap-free series* of numbers a problem? Group-think would often have you believe that it always is, but in fact it is only a potential problem under very particular circumstances.

  1. The series of numbers must have no gaps.
  2. Multiple processes create the entities with which the number is associated (e.g. invoices).
  3. The numbers must be generated at the time that the entity is created.

If all of these requirements must be met then you have a point of serialisation in your application, and we'll discuss that in a moment.

First let's talk about methods of implementing a series-of-numbers requirement if you can drop any one of those requirements.

If your series of numbers can have gaps (and you have multiple processes requiring instant generation of the number) then use an Oracle Sequence object. They are very high performance, and the situations in which gaps can be expected have been very well discussed. If it matters, it is not too challenging to minimise the number of values skipped by designing to minimise the chance of a process failure between generating the number and committing the transaction.

If you do not have multiple processes creating the entities (and you need a gap-free series of numbers that must be instantly generated), as might be the case with the batch generation of invoices, then you already have a point of serialisation. That in itself may not be a problem, and may be an efficient way of performing the required operation. Generating the gap-free numbers is rather trivial in this case. You can read the current maximum value and apply an incrementing value to every entity with a number of techniques. For example, if you are inserting a new batch of invoices into your invoice table from a temporary working table you might:

insert into
  invoices
    (
    invoice#,
    ...)
with curr as (
  select Coalesce(Max(invoice#), 0) max_invoice#
  from   invoices)
select
  curr.max_invoice#+rownum,
  ...
from
  tmp_invoice
  ...

Of course you would protect your process so that only one instance can run at a time (probably with DBMS_Lock if you're using Oracle), and protect the invoice# with a unique key constraint, and probably check for missing values with separate code if you really, really care.
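Since the question is about PostgreSQL, which has no rownum or DBMS_Lock, here is a rough sketch of how the same batch idea might look there. The table layouts and column names (`invoice_no`, `customer_id`, `amount`, `id`) are assumptions made up for the example, and the table lock is just one way of keeping a second batch out.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE invoices (
    invoice_no  bigint  PRIMARY KEY,     -- unique key guards against duplicates
    customer_id bigint  NOT NULL,
    amount      numeric NOT NULL
);
CREATE TEMPORARY TABLE tmp_invoice (
    id          bigserial PRIMARY KEY,   -- only used to give the batch a stable order
    customer_id bigint    NOT NULL,
    amount      numeric   NOT NULL
);

BEGIN;
-- SHARE ROW EXCLUSIVE conflicts with itself and with ordinary writes,
-- so only one batch can number invoices at a time.
LOCK TABLE invoices IN SHARE ROW EXCLUSIVE MODE;

INSERT INTO invoices (invoice_no, customer_id, amount)
SELECT curr.max_invoice_no + row_number() OVER (ORDER BY t.id),
       t.customer_id,
       t.amount
FROM   tmp_invoice AS t
CROSS JOIN (SELECT coalesce(max(invoice_no), 0) AS max_invoice_no
            FROM   invoices) AS curr;
COMMIT;

-- Separate check for missing values: lists each gap between existing numbers.
SELECT invoice_no + 1 AS gap_from,
       next_no - 1    AS gap_to
FROM  (SELECT invoice_no,
              lead(invoice_no) OVER (ORDER BY invoice_no) AS next_no
       FROM   invoices) AS s
WHERE  next_no > invoice_no + 1;
```

The gap check only looks between existing numbers, so a missing first number would need its own test against whatever starting value you expect.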

If you do not need instant generation of the numbers (but you need them gap-free and multiple processes generate the entities) then you can allow the entities to be generated and the transaction committed, and then leave generation of the number to a single batch job. That job can either update the entity table or insert into a separate table.
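In PostgreSQL the batch job for this deferred approach could be sketched roughly as follows. The table is a guess at the asker's `invoice` table (only `invoice_serial` comes from the question; `created_at` and the nullable serial are assumptions), and the advisory lock key 42 is an arbitrary number chosen for the example.

```sql
-- Hypothetical layout: rows are inserted with invoice_serial left NULL.
CREATE TABLE invoice (
    id             bigserial   PRIMARY KEY,
    created_at     timestamptz NOT NULL DEFAULT now(),
    invoice_serial bigint      UNIQUE    -- NULL until the batch job runs
);

-- The numbering job: run by a single process, e.g. from a scheduler.
BEGIN;
-- Transaction-level advisory lock so two copies of the job cannot overlap;
-- it is released automatically at COMMIT or ROLLBACK.
SELECT pg_advisory_xact_lock(42);

WITH curr AS (
    SELECT coalesce(max(invoice_serial), 0) AS max_serial
    FROM   invoice
), numbered AS (
    SELECT id,
           row_number() OVER (ORDER BY created_at, id) AS rn
    FROM   invoice
    WHERE  invoice_serial IS NULL
)
UPDATE invoice AS i
SET    invoice_serial = curr.max_serial + numbered.rn
FROM   numbered, curr
WHERE  i.id = numbered.id;
COMMIT;
```

If the job fails, the whole update rolls back and the next run simply numbers the same rows again, so no number is lost.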

So what if we need the trifecta of instant generation of a gap-free series of numbers by multiple processes? All we can do is try to minimise the period of serialisation in the process. I offer the following advice, and welcome any additional advice (or counter-advice, of course).

  1. Store your current values in a dedicated table. DO NOT use a sequence.
  2. Ensure that all processes use the same code to generate new numbers by encapsulating it in a function or procedure.
  3. Serialise access to the number generator with DBMS_Lock, making sure that each series has its own dedicated lock.
  4. Hold the lock in the series generator until your entity creation transaction is complete by releasing the lock on commit.
  5. Delay the generation of the number until the last possible moment.
  6. Consider the impact of an unexpected error after generating the number and before the commit is completed: will the application roll back gracefully and release the lock, or will it hold the lock on the series generator until the session disconnects later? Whatever method is used, if the transaction fails then the series number(s) must be "returned to the pool".
  7. Can you encapsulate the whole thing in a trigger on the entity's table? Can you encapsulate it in a table or other API call that inserts the row and commits the insert automatically?
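Translated to PostgreSQL, the advice above might be sketched like this. It is not taken from the original article: the `invoice_counter` table, the `next_invoice_serial` function and the `'sales'` type value are all invented for the illustration, and the row lock taken by the UPDATE stands in for DBMS_Lock.

```sql
-- Hypothetical entity table, loosely based on the question.
CREATE TABLE invoice (
    id             bigserial PRIMARY KEY,
    type           text      NOT NULL,
    invoice_serial bigint    NOT NULL,
    UNIQUE (type, invoice_serial)
);

-- Dedicated counter table: one row (and so one row lock) per series.
CREATE TABLE invoice_counter (
    type       text   PRIMARY KEY,
    last_value bigint NOT NULL
);
INSERT INTO invoice_counter (type, last_value) VALUES ('sales', 0);

-- Single entry point for generating numbers. The UPDATE locks the counter
-- row until the calling transaction commits or rolls back.
CREATE FUNCTION next_invoice_serial(p_type text) RETURNS bigint
LANGUAGE sql AS $$
    UPDATE invoice_counter
    SET    last_value = last_value + 1
    WHERE  type = p_type
    RETURNING last_value;
$$;

BEGIN;
-- ... do all the other work of the transaction first ...
INSERT INTO invoice (type, invoice_serial)
VALUES ('sales', next_invoice_serial('sales'));   -- take the number as late as possible
COMMIT;   -- releases the row lock; a ROLLBACK would also undo the increment
```

Because the counter is ordinary table data, a rollback restores the old value, which is how the number gets "returned to the pool". Under the default READ COMMITTED level a concurrent caller just waits for the row lock. Under the serializable level the asker uses, the waiting transaction will instead fail with a serialization error once the first one commits, so the client still needs to retry the whole transaction.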

Original article

You could create a sequence with no cache, then get the next value from the sequence and use that as your counter.

CREATE SEQUENCE invoice_serial_seq START 101 CACHE 1;
SELECT nextval('invoice_serial_seq');

More info here

You either lock the table against inserts, or you need retry code, or both. There's no other option available. If you stop to think about what can happen with:

  1. parallel processes rolling back
  2. locks timing out

you'll see why.
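For the retry option, one possible shape is a PL/pgSQL function that keeps trying max + 1 until the unique constraint accepts it. This is only a sketch: the table layout and the function name `insert_invoice` are assumptions, and it presumes the default READ COMMITTED isolation level, where each retry sees rows committed in the meantime.

```sql
-- Hypothetical entity table, loosely based on the question.
CREATE TABLE invoice (
    id             bigserial PRIMARY KEY,
    type           text      NOT NULL,
    invoice_serial bigint    NOT NULL,
    UNIQUE (type, invoice_serial)       -- makes a clashing number fail loudly
);

CREATE FUNCTION insert_invoice(p_type text) RETURNS bigint
LANGUAGE plpgsql AS $$
DECLARE
    v_serial bigint;
BEGIN
    LOOP
        BEGIN
            -- Try the next number after the current maximum for this type.
            INSERT INTO invoice (type, invoice_serial)
            SELECT p_type, coalesce(max(invoice_serial), 0) + 1
            FROM   invoice
            WHERE  type = p_type
            RETURNING invoice_serial INTO v_serial;
            RETURN v_serial;
        EXCEPTION WHEN unique_violation THEN
            NULL;   -- another session took that number; loop and try again
        END;
    END LOOP;
END;
$$;

SELECT insert_invoice('sales');
```

A session that loses the race waits on the unique index until the other transaction finishes, then either retries with the next number or, if the other one rolled back, keeps the number it asked for. Under the serializable level this in-function loop is not enough, because the snapshot does not move within a transaction; there the retry has to restart the whole transaction from the client.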

In 2006, someone posted a gapless-sequence solution to the PostgreSQL mailing list: http://www.postgresql.org/message-id/44E376F6.7010802@seaworthysys.com
