
How to create a unique ID in a Python Class across multiple processes

I want to write an object that, upon instantiation, generates a new ID for each instance. This ID, however, must be:

  • Generated in a thread-safe and process-safe manner
  • Unique even across processes (spawned with multiprocessing)

A few non-concerns:

  • This particular object creation is not performance-critical, so the synchronization overhead it incurs is acceptable.
  • IDs need not be sequential, though a clean solution usually comes with that.
  • We are ignorant enough to not care about Python 2 at all.

Some solutions already exist that work within a single process only, the most elegant being an itertools.count() object. Using id() is not an option, as it is not guaranteed to be unique: two objects with non-overlapping lifetimes, or objects in different processes, may share the same value. The ideal solution would probably be an object similar to itertools.count() that holds a single global value shared across processes.
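For reference, the single-process approach mentioned above can be sketched roughly as follows. The class name `Numbered` is made up for illustration. The counter is an ordinary class attribute, so every process ends up with its own independent copy, which is why the IDs stop being unique once multiprocessing is involved.

```python
import itertools


class Numbered:
    """Hypothetical class that numbers its instances within ONE process."""

    # One counter shared by all instances in this process only.
    # A child process gets its own copy and would hand out the same numbers.
    _ids = itertools.count()

    def __init__(self):
        self.id = next(Numbered._ids)


first, second = Numbered(), Numbered()
print(first.id, second.id)
```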

Related discussion on our project: https://github.com/coala-analyzer/coala/issues/981

As suggested by @VPfB, use a UUID. UUID stands for Universally Unique Identifier. Technically, the IDs can only be as unique as the bit space used to store them allows. UUIDs are traditionally 128 bits long. The Wikipedia article on the topic discusses their uniqueness:

To put these numbers into perspective, the annual risk of a given person being hit by a meteorite is estimated to be one chance in 17 billion, which means the probability is about 0.00000000006 (6 × 10⁻¹¹), equivalent to the odds of creating a few tens of trillions of UUIDs in a year and having one duplicate. In other words, only after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%.
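A minimal sketch of the UUID approach, using the standard library's uuid module. The class name `Identified` is hypothetical. uuid.uuid4() builds a random (version 4) UUID from the operating system's random source, so nothing has to be shared or synchronized between threads or processes.

```python
import uuid


class Identified:
    """Hypothetical class whose instances each receive a random UUID."""

    def __init__(self):
        # Version 4 UUIDs are random; no counter or lock is shared,
        # so this works unchanged in threads and child processes.
        self.id = uuid.uuid4()


a, b = Identified(), Identified()
print(a.id, b.id)
```

The resulting IDs are not sequential, which the question lists as acceptable. If a string or integer is more convenient, str(a.id) and a.id.int give those forms.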

An alternative solution is to use a dedicated system to generate a sequence (similar to a database generating a primary key). The system would essentially be a bulletproof counter. When something needs an ID, it queries the system for the next available one; the system then increments the counter and supplies the new value. The act of updating the counter, getting the new value, and storing the current state (to guard against issues such as power failure) must be atomic.
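On a single machine, a counter of this kind might be approximated with multiprocessing.Value, which pairs a shared-memory integer with a lock. The sketch below is only an illustration under that assumption: `SharedCounter`, `_worker`, and `demo` are made-up names, and nothing is persisted, so it does not survive a restart or power failure the way the dedicated system described above would.

```python
import ctypes
import multiprocessing


class SharedCounter:
    """Hypothetical counter whose state lives in shared memory."""

    def __init__(self, start=0):
        # Value wraps a shared C integer together with a lock.
        self._value = multiprocessing.Value(ctypes.c_ulonglong, start)

    def next_id(self):
        # Hold the lock for the whole read-increment-write step so that
        # no two threads or processes can receive the same number.
        with self._value.get_lock():
            current = self._value.value
            self._value.value = current + 1
            return current


def _worker(counter, results, n):
    # Runs in a child process: draw n IDs and report them to the parent.
    for _ in range(n):
        results.put(counter.next_id())


def demo(num_procs=4, per_proc=100):
    """Draw IDs from several processes and return all of them."""
    counter = SharedCounter()  # created in the parent, before the children
    results = multiprocessing.Queue()
    procs = [
        multiprocessing.Process(target=_worker, args=(counter, results, per_proc))
        for _ in range(num_procs)
    ]
    for p in procs:
        p.start()
    # Drain the queue before joining so the children can exit cleanly.
    ids = [results.get() for _ in range(num_procs * per_proc)]
    for p in procs:
        p.join()
    return ids
```

The counter has to be created in the parent and handed to each child as a Process argument; a counter created independently inside a child would start again from zero. Calling demo() should be done under an `if __name__ == "__main__":` guard, as usual with multiprocessing.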

A counter system may not be practical in some settings, such as poorly connected distributed systems. This is the main case that calls for a UUID: the ability to generate IDs in multiple distinct, unconnected systems with an extraordinarily high probability that no collision will occur.
