
12 digit unique distributed random number generator

I ported sonyflake to Java and it worked fine. However, instead of generating 8-digit numbers, I am looking to generate unique 12-digit numbers. The original port uses a 16-bit machineId. Because we have at least two data centers (possibly more), I added 8 bits for the data center, taken from the second octet of the IP address. I tweaked all the bit-length settings but couldn't manage to generate 12-digit numbers. Is there an algorithm inspired by sonyflake or Twitter's Snowflake that generates unique 12-digit numbers using a 16-bit machineId and an 8-bit dataCenterId?

Note: Due to company policy, I cannot post my original Java port here.

EDIT: This is what I came up with. However, instead of generating 12-digit decimal numbers, it generates 10- or 11-digit numbers. What changes can I make so that it always returns a 12-digit decimal number? I understand I need to change the sequence and recalculate the time. However, I currently want to focus on generating a 12-digit decimal number.

public class Demo {

    /// 17 time + 4 dc + 10 machine + 8 sequence

    private static final int BIT_LEN_TIME = 17;
    private static final long BIT_WORKER_ID = 10L;
    private static final long BIT_LEN_DC = 4L;
    private static final long BIT_LEN_SEQUENCE = 8L;

    private static final int MAX_WORKER_ID = (int) (Math.pow(2, BIT_WORKER_ID) - 1);
    private static final long MAX_SEQUENCE = (int) (Math.pow(2, BIT_LEN_SEQUENCE) - 1);

    private static final double FLAKE_TIME_UNIT = 1e7; // nsec, i.e. 10 msec
    private static final double LEN_LIMIT = 1e11;
    private static final int START_SEQ = 0;

    private final ReentrantLock mutex = new ReentrantLock();

    private final Instant startInstant;
    private final long startTime;
    private final long dc;
    private long sequence;
    private long lastElapsedTime;
    private long worker;

    public Demo(Instant startInstant) {
        Objects.requireNonNull(startInstant, "startInstant cannot be null");
        if (startInstant.isBefore(Instant.EPOCH) || startInstant.isAfter(Instant.now())) {
            throw new EverestFlakeException("Base time should be after UNIX EPOCH, or before current time.");
        }

        this.startInstant = startInstant;
        this.startTime = this.toEverestFlakeTime(startInstant);
        this.sequence = START_SEQ;
        this.dc = this.msb(this.getDcId()); // 4 bits at most
        this.worker = this.workerId() & ((1 << BIT_WORKER_ID) - 1); // 10 bits at most
    }

    public long next() {
        long currentElapsedTime = this.currentElapsedTime(this.startTime);

        mutex.lock();
        long time = currentElapsedTime & ((1 << BIT_LEN_TIME) - 1); // 17 bits at most
        if (this.sequence == MAX_SEQUENCE) {
            this.sequence = START_SEQ;
            System.out.println("time = " + time);
            sleepMicro(currentElapsedTime - this.lastElapsedTime);
            time = this.currentElapsedTime(this.startTime) & ((1 << BIT_LEN_TIME) - 1);
            System.out.println("time = " + time);
        } else {
            // less than 15000
            if((currentElapsedTime - this.lastElapsedTime) < 0x3a98) {
                sleepMicro(currentElapsedTime - this.lastElapsedTime);
                time = this.currentElapsedTime(this.startTime) & ((1 << BIT_LEN_TIME) - 1);
            }
            this.sequence += (START_SEQ + 1) & MAX_SEQUENCE;
        }

        long id = (time << BIT_LEN_TIME) |
                (worker << BIT_WORKER_ID) |
                (dc << BIT_LEN_DC) |
                (sequence << BIT_LEN_SEQUENCE);
        id += LEN_LIMIT;
        this.lastElapsedTime = currentElapsedTime;
        mutex.unlock();

        return id;
    }

    private void sleepNano(long sleepTime) {
        try {
            System.out.println("nano sleeping for: " + sleepTime);
            TimeUnit.NANOSECONDS.sleep(sleepTime);
        } catch (Exception e) {
            //
        }
    }

    private void sleepMicro(long sleepTime) {
        try {
            System.out.println("micro sleeping for: " + sleepTime);
            TimeUnit.MICROSECONDS.sleep(sleepTime/100);
        } catch (Exception e) {
            //
        }
    }

    private long toEverestFlakeTime(Instant startInstant) {
        return unixNano(startInstant);
    }

    private long unixNano(Instant startInstant) {
        return NanoClock.systemUTC().nanos(startInstant);
    }

    private long currentElapsedTime(long startTime) {
        return this.toEverestFlakeTime(NanoClock.systemUTC().instant()) - startTime;
    }

    private long msb(long n) {
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        n >>>= 1;
        n += 1;
        return n;
    }

    private int workerId() {
        return new SecureRandom().nextInt(MAX_WORKER_ID);
    }

    private int getDcId() {
        try {
            Socket socket = new Socket();
            socket.connect(new InetSocketAddress("google.com", 80));
            byte[] a = socket.getLocalAddress().getAddress();
            socket.close();
            return Byte.toUnsignedInt(a[1]);
        } catch (Exception e) {
            String message = "Failed to process machine id.";
            throw new EverestFlakeException(message, e);
        }
    }
}

If you mean 12 decimal digits, then you can use a number of up to 39 bits (a 40-bit number can have 13 decimal digits as well as 12).

If you take 16 bits for the machine ID and 8 bits for the data center ID, that leaves only 15 bits for the unique portion of the ID on each machine (so only 32,768 unique numbers per machine). With so few numbers, you can choose to assign them sequentially rather than randomly.
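The bit budget above can be checked with a few lines of arithmetic. This is only a sketch: the class and method names (BitBudget, usableBits, idsPerMachine) are made up for illustration and do not come from sonyflake or from the question's code.

```java
final class BitBudget {

    /**
     * Largest bit width k such that every k-bit value has at most
     * `decimalDigits` decimal digits, i.e. the largest k with 2^k <= 10^digits.
     * Only meant for decimalDigits <= 18 (10^19 does not fit in a long).
     */
    static int usableBits(int decimalDigits) {
        long limit = 1;
        for (int i = 0; i < decimalDigits; i++) {
            limit *= 10; // limit becomes 10^decimalDigits
        }
        // floor(log2(limit))
        return 63 - Long.numberOfLeadingZeros(limit);
    }

    /** How many distinct values remain per machine after reserving the ID fields. */
    static long idsPerMachine(int decimalDigits, int machineBits, int dataCenterBits) {
        int remaining = usableBits(decimalDigits) - machineBits - dataCenterBits;
        if (remaining < 0) {
            throw new IllegalArgumentException("ID fields do not fit in the available bits");
        }
        return 1L << remaining;
    }

    public static void main(String[] args) {
        System.out.println(usableBits(12));           // 39
        System.out.println(idsPerMachine(12, 16, 8)); // 32768
    }
}
```

Shrinking either ID field is the only way to free up more bits for the per-machine portion while staying within 12 decimal digits.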

If you mean 12 hexadecimal (base-16) digits, then the situation improves considerably: 16 bits make up 4 digits and 8 bits make up another two, leaving 6 base-16 digits for the unique portion of the ID, or 16,777,216 different numbers (24 bits). With this many numbers, you have several choices for how each machine assigns them. You can do so sequentially, or at random (using java.security.SecureRandom, not java.util.Random), or using a timestamp with 10 ms resolution, as in Sonyflake.
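As a rough illustration of the hexadecimal layout, the sketch below packs the three fields into 48 bits and prints them as 12 hex digits. The field order (data center, then machine, then unique portion) and the names HexId, pack and format are assumptions made for this example; the answer does not prescribe any particular order.

```java
import java.security.SecureRandom;

final class HexId {

    /** Packs an 8-bit data center ID, a 16-bit machine ID and a 24-bit unique part into 48 bits. */
    static long pack(int machineId, int dataCenterId, int unique) {
        return ((long) (dataCenterId & 0xFF) << 40)   // 2 hex digits
             | ((long) (machineId & 0xFFFF) << 24)    // 4 hex digits
             | (unique & 0xFFFFFFL);                  // 6 hex digits
    }

    /** Renders the 48-bit value as exactly 12 uppercase hexadecimal digits. */
    static String format(long id) {
        return String.format("%012X", id);
    }

    public static void main(String[] args) {
        // Unique portion chosen at random here; uniqueness still has to be checked separately.
        int unique = new SecureRandom().nextInt(1 << 24);
        System.out.println(format(pack(0xBEEF, 0x0A, unique)));
    }
}
```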


It appears your question is less about how to generate a 12-digit unique ID than about how to format a number so that it has exactly 12 digits. In that case you have two options. Assume you have a 39-bit integer x (less than 2^39 and so less than 10^12).

  • If you can accept leading zeros in the number, then format x as a 12-digit number with String.format("%012d", x).

  • If you can't accept leading zeros in the number, then add 100000000000 (10^11) to x. Since x is less than 2^39, which is less than 900000000000, the result is always a 12-digit number.
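Both options can be sketched in a few lines. The class name TwelveDigits and its method names are hypothetical; the only assumption carried over from the text is that x is a non-negative 39-bit value.

```java
final class TwelveDigits {

    private static final long LIMIT = 1L << 39;            // x must be below 2^39
    private static final long OFFSET = 100_000_000_000L;   // 10^11

    private static void check(long x) {
        if (x < 0 || x >= LIMIT) {
            throw new IllegalArgumentException("x must be a non-negative 39-bit value");
        }
    }

    /** Option 1: pad with leading zeros to exactly 12 characters. */
    static String withLeadingZeros(long x) {
        check(x);
        return String.format("%012d", x);
    }

    /** Option 2: shift into the range [10^11, 10^11 + 2^39), which is always 12 digits. */
    static long withoutLeadingZeros(long x) {
        check(x);
        return x + OFFSET;
    }

    public static void main(String[] args) {
        System.out.println(withLeadingZeros(42));     // 000000000042
        System.out.println(withoutLeadingZeros(42));  // 100000000042
    }
}
```

The second option stays reversible: subtracting 10^11 recovers the original x.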


You are generating worker IDs at random. In general, random numbers are not enough by themselves to ensure uniqueness. You need some way to check each worker ID you generate for uniqueness. Once you do so, each worker/datacenter pair will be unique. Each machine is then required only to generate a machine-unique number, which will be 25 bits long in your case. There are several ways to do so:

  • The simplest way is to generate a random number or time-based number (using all 25 bits in your example) and check the number for uniqueness using a hash set (e.g., java.util.HashSet). If the number was already generated, try again with a new number. (Instead of a hash table, it may be more memory-efficient to use a bit set (e.g., java.util.BitSet) or a compressed bitmap (e.g., a "roaring bitmap").)

  • Another way is to use a hash table with random/time-based numbers as keys and sequence IDs as values. When a unique number needs to be generated, do the following:

    1. Generate X, a random number or time-based number.
    2. Check whether a key equal to X is found in the hash table.
    3. If the key does not exist, create a new entry in the table, where the key is X and the value is 0. The unique number is X combined with a sequence number of 0. Stop.
    4. If the key exists, and the value associated with that key is less than 255, add 1 to the value. The unique number is X combined with that value as the sequence number. Stop.
    5. If the key exists, and the value associated with that key is 255 or greater, go to step 1.
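The two approaches in the list above might look roughly like this in Java. Everything here is a sketch under stated assumptions: the class names are invented, the first class uses a java.util.BitSet over all 25 bits, and the second assumes that X is 17 bits wide so that X plus an 8-bit sequence fills the same 25 bits (the answer does not state the width of X). The source of X is injectable so that a time-based value could be used instead of a random one.

```java
import java.security.SecureRandom;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntSupplier;

/** First approach: draw a random 25-bit number and retry until it is unused. */
final class UniqueRandom25 {
    private static final int RANGE = 1 << 25;
    private final BitSet used = new BitSet(RANGE);   // one bit per possible value
    private final SecureRandom random = new SecureRandom();
    private int issued = 0;

    synchronized int next() {
        if (issued == RANGE) {
            throw new IllegalStateException("all 25-bit values have been issued");
        }
        int candidate;
        do {
            candidate = random.nextInt(RANGE);
        } while (used.get(candidate));               // already generated: try again
        used.set(candidate);
        issued++;
        return candidate;
    }
}

/** Second approach: key X maps to the last sequence number issued for it (0..255). */
final class SequencedIds {
    private static final int SEQUENCE_BITS = 8;
    private static final int MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1; // 255
    private final Map<Integer, Integer> sequences = new HashMap<>();
    private final IntSupplier source;                // produces X (assumed 17 bits)

    SequencedIds(IntSupplier source) {
        this.source = source;
    }

    static SequencedIds random17Bit() {
        SecureRandom random = new SecureRandom();
        return new SequencedIds(() -> random.nextInt(1 << 17));
    }

    /** Never returns if every key has used up its sequence numbers. */
    synchronized int next() {
        while (true) {
            int x = source.getAsInt();               // step 1
            Integer seq = sequences.get(x);          // step 2
            if (seq == null) {                       // step 3: new key, sequence 0
                sequences.put(x, 0);
                return x << SEQUENCE_BITS;
            }
            if (seq < MAX_SEQUENCE) {                // step 4: bump the sequence
                sequences.put(x, seq + 1);
                return (x << SEQUENCE_BITS) | (seq + 1);
            }
            // step 5: sequence exhausted for this X, generate a new X
        }
    }
}
```

Either result is the 25-bit machine-unique portion; it still has to be combined with the worker and data center IDs to form the full ID. Both tables grow with the number of IDs issued, so a long-running service would need to persist or bound them.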
