
How to create a unique ID in a Python class across multiple processes

I want to write an object that, upon instantiation, generates a new ID for each instance. This ID, however, must be:

  • Generated in a thread- and process-safe manner
  • Unique even across processes (spawned with multiprocessing)

A few non-concerns:

  • This particular object creation is not performance critical, so the synchronization overhead it incurs is acceptable.
  • IDs need not be serial, though a clean solution usually comes with that.
  • We are ignorant enough to not care about Python 2 at all.

There are already some solutions which work only in one process, the most elegant being the use of an itertools.count() object. Using id() is not an option, as it is not guaranteed to be unique. The ideal solution would probably be an object similar to itertools.count() that holds some static global value across processes.
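For reference, the single-process approach mentioned above might look like the following minimal sketch. The class name `Tagged` and the attribute names are hypothetical and not taken from the project. It illustrates the limitation: every process started with multiprocessing ends up with its own copy of the counter, so the IDs are unique only within one process.

```python
import itertools


class Tagged:
    """Hypothetical class: gives each instance an ID unique within ONE process."""

    # Class-level counter shared by all instances in this process only.
    # A child process gets its own independent copy of this counter,
    # so two processes can hand out the same number.
    _ids = itertools.count()

    def __init__(self):
        self.id = next(Tagged._ids)


first = Tagged()
second = Tagged()
print(first.id, second.id)
```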

Related discussion on our project: https://github.com/coala-analyzer/coala/issues/981

As suggested by @VPfB, use a UUID. UUID is an acronym for Universally Unique Identifier. Technically, the IDs can only be as unique as the available bit space used to store them. Traditionally, UUIDs are 128 bits. The Wikipedia article on the topic discusses their uniqueness:

To put these numbers into perspective, the annual risk of a given person being hit by a meteorite is estimated to be one chance in 17 billion, which means the probability is about 0.00000000006 (6 × 10⁻¹¹), equivalent to the odds of creating a few tens of trillions of UUIDs in a year and having one duplicate. In other words, only after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%.
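A minimal sketch of the UUID approach, using the standard library's uuid module. The class name `Identified` is hypothetical. `uuid.uuid4()` builds a random (version 4) UUID from the operating system's random source, so no coordination between threads or processes is needed; the price is that the IDs are not serial and uniqueness is probabilistic rather than guaranteed.

```python
import uuid


class Identified:
    """Hypothetical class: every instance gets a random 128-bit UUID."""

    def __init__(self):
        # uuid4() needs no shared state, so it can be called safely
        # from any thread or process without locking.
        self.id = uuid.uuid4()


a = Identified()
b = Identified()
print(a.id)
print(b.id)
```

If a plain integer or string is more convenient than a `uuid.UUID` object, `a.id.int` and `str(a.id)` give those forms.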

An alternate solution is to use a dedicated system to generate a sequence (similar to a database generating a primary key). The system would essentially be a bulletproof counter. When something needs an ID, it queries the system for the next available ID. When the system receives a query for a new ID, it increments the counter and supplies the new value. It would be arranged such that the act of updating the counter, getting the new value, and storing the current state (against issues such as power failure) is atomic.
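On a single machine, a much weaker version of such a counter could be sketched with `multiprocessing.Value`, which places an integer in shared memory and guards it with a lock. The class name `SharedCounter` and the method `next_id` are hypothetical. This sketch assumes the counter is created in the parent process and handed to (or inherited by) the child processes, and it does not persist its state, so it gives none of the power-failure protection described above.

```python
import ctypes
import multiprocessing


class SharedCounter:
    """Hypothetical counter living in shared memory, guarded by a lock."""

    def __init__(self, start=0):
        # Value() allocates a ctypes integer in shared memory and,
        # by default, attaches a lock that works across processes.
        self._value = multiprocessing.Value(ctypes.c_ulonglong, start)

    def next_id(self):
        # Read and increment under the lock so that no two callers
        # (threads or processes) can receive the same number.
        with self._value.get_lock():
            current = self._value.value
            self._value.value = current + 1
            return current


counter = SharedCounter()
print(counter.next_id(), counter.next_id(), counter.next_id())
```

The same object must be shared by all workers, for example by passing it as an argument to `multiprocessing.Process`; creating a fresh `SharedCounter` in each process would start every process at the same value again.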

A counter system may not be practical, for example in poorly connected distributed systems. This is the main case that calls for a UUID: the ability to generate IDs in multiple distinct, unconnected systems with an extraordinarily high probability that no collision will occur.


 