Generate random numbers from 1-100 from a generator of 1-50

In a recent interview I was asked the following question:

Print random numbers from 1-100 using the given getrnd50() method, which generates random numbers from 1-50. Each random number should be printed only once and in random order. No other random number generator may be used, and I was not allowed to change the definition of getrnd50().

I came up with the following code, which gives the correct output.

import java.util.Random;

public class Test {

public static void main(String[] args) {
    int[] rs = new int[100];
    int count = 0;
    int k;
    while (count != 100) {

        // I decided to simply multiply the result of `getrnd50()` by 2. 
        // But since that would generate only even numbers,

        k = getrnd50() * 2;

        // I decided to randomly subtract 1 from the numbers,
        // which I accomplished as follows.

        if (getrnd50() <= 25) // 25 is to half the possibilities.
            k--;

        // Every number is to be stored in its own unique location 
        // in the array `rs`, the location is `(number-1)`. 
        // Every time a number is generated it is checked whether it has 
        // already been generated. If not, it is saved in its position, printed and 
        // `count` is incremented.

        if (rs[k-1] == 0) {
            rs[k-1] = k;
            count++;
            System.out.print(k + " ");
        }
    }
}
// This is the method and I am not supposed to touch it.
static int getrnd50() {
    Random rand = new Random();
    return (1 + rand.nextInt(50));
}

}

While it was accepted in that round, in the next round the interviewer told me that getrnd50() is a costly method and that even in the best case I have to call it twice for every number generated, i.e. 200 times for 1-100. In the worst case it would be infinite, and in the average case tens of thousands. He asked me to optimize the code so as to significantly improve the average case.

He gave me a hint when I expressed my inability to do it. He said:

Consider the number of numbers already generated when generating a new number. For example, if count becomes 99, I don't have to call getrnd50(); I can simply find the remaining number and print it.

While I understood his drift, I had no idea how it would help me, so obviously I got rejected. Now I am curious to know the answer. Help me! Thanks in advance!

Note: if anyone is feeling too lazy to write lengthy code, just point out the number generation part; the rest is easy. Also, we are not bound to follow the hint.

The key is not to check whether you have generated the number before, which gets very expensive when only 1 number remains, but to generate the numbers 1-100 in order and then shuffle them.

In your code, once you have generated 99 of the 100 numbers, you keep looping and generating random numbers until you hit the 1 remaining number. That's why the average case in your version is so bad.
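To put a rough number on that average case: the question's loop produces each value in 1-100 with equal probability, so "draw until every value has been seen" is an instance of the coupon collector problem. The sketch below (class and method names are illustrative, not from the question) computes the expected number of draws, n * (1 + 1/2 + ... + 1/n), and doubles it because the question's code calls getrnd50() twice per draw.

```java
// Illustrative sketch: expected cost of "draw until every value has been seen".
class CouponCollector {
    // Expected number of uniform draws needed to see all n distinct values:
    // n * (1/1 + 1/2 + ... + 1/n).
    static double expectedDraws(int n) {
        double harmonic = 0.0;
        for (int i = 1; i <= n; i++) {
            harmonic += 1.0 / i;
        }
        return n * harmonic;
    }

    public static void main(String[] args) {
        double draws = expectedDraws(100);
        // The question's loop calls getrnd50() twice per draw.
        System.out.printf("expected draws: %.1f, expected getrnd50() calls: %.1f%n",
                draws, 2 * draws);
    }
}
```

Under these assumptions the expectation comes to roughly 519 draws, or a little over a thousand getrnd50() calls, against 200 in the best case.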

If instead you just shuffle an array, you only need as many random numbers as you have shuffle operations, and only as many shuffle operations as you need numbers output.

(For full details on shuffling, look up the Fisher-Yates shuffle, specifically the inside-out variant, which can generate a shuffled array in place.)
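As a minimal sketch of the inside-out variant mentioned here: the array is filled and shuffled in a single pass, using one bounded random draw per element. The `nextIndex` parameter is a hypothetical stand-in for whatever variable-range generator ends up being built on top of getrnd50(); in `main` it is wired to `java.util.Random` purely for demonstration.

```java
import java.util.function.IntUnaryOperator;

// Illustrative sketch of the inside-out Fisher-Yates shuffle.
class InsideOutShuffle {
    // Builds a shuffled array of 1..n in a single pass.
    // nextIndex.applyAsInt(bound) is assumed to return a uniform value in [0, bound).
    static int[] shuffled(int n, IntUnaryOperator nextIndex) {
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            int j = nextIndex.applyAsInt(i + 1); // uniform in [0, i]
            out[i] = out[j];  // move the element at j into the new slot (no-op when j == i)
            out[j] = i + 1;   // place the new value at the randomly chosen position
        }
        return out;
    }

    public static void main(String[] args) {
        // Demonstration only: java.util.Random stands in for a getrnd50()-based source.
        java.util.Random rng = new java.util.Random();
        int[] result = shuffled(100, rng::nextInt);
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

The number of random draws is fixed by the array length, so the cost no longer depends on how long it takes to stumble on the last missing value.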

To generate the random numbers, you need a variable-range generator rather than a fixed 1-50 one. You can approach this in a variety of ways, but be very careful about introducing skew into the results if you really want the output to have a good distribution across the possible states.

For example, I would recommend using an integral number of bits, with shifting, rather than attempting to use a modulo. This does involve a certain amount of looping when values fall outside the desired range, but without being able to modify the original random number generation, your hands are somewhat tied.

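static int bits = 0;   // number of unused random bits currently held in r100
static int r100 = 0;   // pool of unused random bits

static int randomInt(int range)
{
    int ret;

    // number of bits needed to represent the values 0..range-1
    int bitsneeded = 32 - Integer.numberOfLeadingZeros(range - 1);

    do {
        while (bits < bitsneeded)
        {
            // two calls give a uniform value in 0..2499
            int r = (getrnd50() - 1) * 50 + getrnd50() - 1;
            // keep only 0..2047, i.e. exactly 11 unbiased bits
            if (r < 2048)
            {
                r100 <<= 11;
                r100 |= r;
                bits += 11;
            }
        }
        ret = r100 & ((1 << bitsneeded) - 1);
        bits -= bitsneeded;
        r100 >>= bitsneeded;
    } while (ret >= range); // reject values outside 0..range-1

    return ret + 1;
}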

This implementation will use somewhere in the region of 150 random numbers for your 100-value shuffled array. This is worse than the modulo version, but better than 2x the input range, which was the best case of the original version. If the random generation were truly random, there would still be a worst-case scenario of infinity, but random generation doesn't typically work like that. If it did, I'm not sure unskewed results would be realistic given the constraints.

For illustration, as the results are subtle, here's a graph of my suggested random routine versus a modulo version:

[Image: random generator graph]
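The graph itself is not reproduced here, so a small deterministic check can stand in for the kind of skew a plain modulo introduces. This is not the data behind the answer's graph; it is only a sketch that enumerates the 50 equally likely outputs of getrnd50() and tallies `value % range` for a hypothetical range (30 is chosen in `main` because it does not divide 50).

```java
// Illustrative sketch: how unevenly "getrnd50() % range" covers its residues.
class ModuloSkew {
    // For each residue in [0, range), count how many of the 50 equally likely
    // generator outputs (1..50) map onto it under "value % range".
    static int[] residueCounts(int range) {
        int[] counts = new int[range];
        for (int value = 1; value <= 50; value++) {
            counts[value % range]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // With range 30, residues 1..20 are hit by two outputs each,
        // while residue 0 and residues 21..29 are hit by only one.
        System.out.println(java.util.Arrays.toString(residueCounts(30)));
    }
}
```

Residues hit twice are twice as likely as those hit once, which is the bias that rejecting out-of-range values avoids.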

So in summary, I think that while your random generation is a bit inefficient and could be improved, the really big win the interviewer was looking for is in not needing so many random numbers in the first place, by doing a shuffle rather than repeatedly searching with an ever-decreasing probability.

Since 100 / 50 is an integer, this is quite easy. Since 50 / (100 / 50) is an integer, it's even easier.

If you didn't quite get that, here is some sample code:

int rnd1 = getrnd50();
int rnd2 = getrnd50();
if (rnd1 % 2 == 0)
{
    rnd2 += 50;
}
return rnd2;

Here is an outline:

  • Take two numbers, chosen randomly between 1 and 50, called a and b.
  • If a is even, add 50 to b.
  • Return b.

You can make this a one-liner if you want:

return getrnd50() + getrnd50() % 2 * 50;

That's a little too obfuscated though.
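One way to convince yourself that this mapping is uniform is to enumerate all 50 x 50 equally likely pairs of draws and count how often each result occurs. The sketch below does that; the class and method names are illustrative, and `combine` simply restates the sample code above with the two draws passed in explicitly.

```java
// Illustrative sketch: exhaustive uniformity check for the 1-50 -> 1-100 mapping.
class Rnd100Check {
    // Same mapping as the sample code: if the first draw is even, add 50 to the second.
    static int combine(int rnd1, int rnd2) {
        return (rnd1 % 2 == 0) ? rnd2 + 50 : rnd2;
    }

    // Counts how many of the 2500 equally likely (rnd1, rnd2) pairs produce each value.
    static int[] tally() {
        int[] counts = new int[101]; // index 0 is unused
        for (int a = 1; a <= 50; a++) {
            for (int b = 1; b <= 50; b++) {
                counts[combine(a, b)]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = tally();
        for (int v = 1; v <= 100; v++) {
            System.out.println(v + " -> " + counts[v]);
        }
    }
}
```

Every value from 1 to 100 is produced by exactly 25 of the 2500 pairs, so the result is uniform as long as getrnd50() itself is.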

Edit: I see the question was really asking for a shuffled list, not a sequence of random integers.

This can be done by creating a list from 1 to 100 and doing 100 random swaps, like a Fisher-Yates shuffle. I imagine that with a Fisher-Yates shuffle, the absolute minimum number of calls is 93 (given by the formula ceil(log50(100!))), but with a much simpler algorithm you can use 200.
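The figure of 93 can be checked numerically: k calls to getrnd50() can distinguish at most 50^k outcomes, and a uniform shuffle has to be able to reach all 100! orderings. A minimal sketch (names are illustrative) that evaluates the formula with logarithms, since 100! overflows any primitive type:

```java
// Illustrative sketch: smallest k such that 50^k >= n!.
class ShuffleLowerBound {
    static int minCalls(int n) {
        double logFactorial = 0.0;
        for (int i = 2; i <= n; i++) {
            logFactorial += Math.log(i); // ln(n!) = ln 2 + ln 3 + ... + ln n
        }
        // log base 50 of n!, rounded up to a whole number of calls
        return (int) Math.ceil(logFactorial / Math.log(50));
    }

    public static void main(String[] args) {
        System.out.println(minCalls(100)); // prints 93
    }
}
```

This is an information-theoretic bound only; it says nothing about whether a practical algorithm can reach it.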

The simple algorithm swaps each of the 100 elements with a random element from the 100. The index to swap with is generated from 1-100 with the above generator. Note that, unlike a true Fisher-Yates shuffle, swapping with a position drawn from the full range every time does not make all orderings exactly equally likely.

For example:

for (int i = 0; i < 100; i++)
{
    swap(i, getrnd100() - 1); // - 1 for zero base!
}

Here is some complete code:

int[] result = new int[100];
for (int i = 0; i < 100; i++)
{
    result[i] = i + 1;
}
for (int i = 0; i < 100; i++)
{
    int j = (getrnd50() + getrnd50() % 2 * 50) - 1;
    int tmp = result[i];
    result[i] = result[j];
    result[j] = tmp;
}
return result;

(Disclaimer: I don't know Java, and I haven't tested it.)

Best case 200, worst case 200, average case 200.

Here is how you could answer it. It exploits the facts that

  • assuming you are using a shuffle to get an O(n) swapping of "cards", the modulus decreases as the shuffle proceeds, i.e. start with an int[] of every value and shuffle it like Collections.shuffle() does.
  • you have more randomness than you need if you call getrnd50() twice, especially when you have fewer than 50 values left to swap with.

EDIT: For those not familiar with how shuffle works, I have added the code for shuffling.

import java.util.*;
import java.lang.*;

class Main {
    public static void main(String... args) {
        int samples = 100;

        // all the numbers [1, 100]
        int[] nums = new int[samples];
        for (int i = 0; i < samples; i++) nums[i] = i + 1;

        for (int i = samples - 1; i > 0; i--) {
            int swapWith = nextInt(i + 1) - 1; // nextInt returns [1, n]; convert to a 0-based index

            // swap nums[i] and nums[swapWith]
            if (swapWith == i) continue;
            int tmp = nums[swapWith];
            nums[swapWith] = nums[i];
            nums[i] = tmp;
        }
        System.out.println("calls/sample " + (double) calls / samples);
        System.out.println(Arrays.toString(nums));

        int[] count49 = new int[49];
        for (int i = 0; i < 49 * 10000; i++)
            count49[nextInt(49) - 1]++;
        int[] count54 = new int[54];
        for (int i = 0; i < 54 * 10000; i++)
            count54[nextInt(54) - 1]++;
        System.out.println("Histogram check (49): " + Arrays.toString(count49));
        System.out.println("Histogram check (54): " + Arrays.toString(count54));

    }

    // keep track of the range of values.
    static int maxRandom = 1;
    // some random value [0, maxRandom)
    static int rand100 = 0;

    static int nextInt(int n) {
        while (maxRandom < 10 * n * n) {
            maxRandom *= 50;
            rand100 = rand100 * 50 + getrnd50() - 1;
        }
        int ret = rand100 % n;
        maxRandom = (maxRandom + n - 1) / n;
        rand100 /= n;
        return ret + 1;
    }

    static final Random rand = new Random();
    static int calls = 0;

    static int getrnd50() {
        calls++;
        return (1 + rand.nextInt(50));
    }
}

prints

calls/sample 0.94

[1, 37, 4, 98, 76, 53, 26, 55, 9, 78, 57, 58, 47, 12, 44, 25, 82, 2, 42, 30, 88, 81, 64, 99, 16, 28, 34, 29, 51, 36, 13, 94, 80, 66, 19, 38, 20, 8, 40, 89, 72, 56, 75, 96, 35, 100, 95, 17, 74, 69, 11, 31, 86, 92, 6, 27, 22, 70, 63, 32, 93, 84, 71, 15, 23, 5, 14, 62, 49, 43, 87, 65, 83, 33, 45, 52, 39, 91, 60, 73, 68, 24, 97, 46, 50, 18, 79, 48, 77, 67, 59, 10, 7, 54, 90, 85, 21, 61, 41, 3]

Histogram check (49): [10117, 10158, 10059, 10188, 10338, 9959, 10313, 10278, 10166, 9828, 10105, 10159, 10250, 10152, 9949, 9855, 10026, 10040, 9982, 10112, 10021, 10082, 10029, 10052, 9996, 10057, 9849, 9990, 9914, 9835, 10029, 9738, 9953, 9828, 9896, 9931, 9995, 10034, 10067, 9745, 9873, 9903, 9913, 9841, 9823, 9859, 9941, 10007, 9765]

Histogram check (54): [10124, 10251, 10071, 10020, 10196, 10170, 10123, 10096, 9966, 10225, 10262, 10036, 10029, 9862, 9994, 9960, 10070, 10127, 10021, 10166, 10077, 9983, 10118, 10163, 9986, 9988, 10008, 9965, 9967, 9950, 9965, 9870, 10172, 9952, 9972, 9828, 9754, 10152, 9943, 9996, 9779, 10014, 9937, 9931, 9794, 9708, 9978, 9894, 9803, 9904, 9915, 9927, 10000, 9838]

In this case, 100 numbers need fewer than 100 calls to getrnd50().

If you have 1000 values to shuffle:

calls/sample 1.509

The performance penalty of your code is in this line:

if (getrnd50() <= 25)

You need to find a way to get more information out of that single generated random number; otherwise you are wasting those costly generated resources. Here is my proposal:

First, imagine that we had a random number generator for the numbers 0-15. Every number can be represented as a path in a binary tree whose leaves represent the numbers. So we can say that the if condition evaluates to true every time we walk left in the tree, starting at the root.

The problem is that the random number generator generates numbers in an interval that doesn't end at a power of two, so we need to expand that tree. This is done like so:
If the random number is in the range 0-31, we are fine with a tree for those numbers. If it is in the range 32-47, we use the tree for 0-15, and in the case 48-49 we use a tree for the numbers 0-1.

So in the worst case we aren't getting much more information out of that random number, but in most cases we are. This should significantly improve the average case.
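The range split described above can be sketched as a pair of helpers that say how many unbiased bits a single draw yields and what those bits are. The names are illustrative, and `v` is assumed to be `getrnd50() - 1`, i.e. uniform in 0-49.

```java
// Illustrative sketch: harvesting several unbiased bits from one draw in 0..49.
class BitHarvest {
    // Number of unbiased bits obtainable from v, assumed uniform in [0, 50).
    static int bitCount(int v) {
        if (v < 32) return 5; // 0-31: 32 values, a full 5-level tree
        if (v < 48) return 4; // 32-47: 16 values, reuse the 0-15 tree
        return 1;             // 48-49: 2 values, a single left/right decision
    }

    // The harvested bits, as an integer in [0, 2^bitCount(v)).
    static int bitValue(int v) {
        if (v < 32) return v;
        if (v < 48) return v - 32;
        return v - 48;
    }

    // Average number of bits gained per call, over all 50 equally likely values.
    static double averageBitsPerCall() {
        int total = 0;
        for (int v = 0; v < 50; v++) {
            total += bitCount(v);
        }
        return total / 50.0; // (32*5 + 16*4 + 2*1) / 50 = 4.52
    }

    public static void main(String[] args) {
        System.out.println(averageBitsPerCall());
    }
}
```

On average that is about 4.5 usable bits per call, compared with the single yes/no decision that `getrnd50() <= 25` extracts.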

OK, so you are allowed to print the last missing number of any set of n numbers without it being generated by the random number generator?

If so, you could use recursion and decrease the size of the set with each call until you only have n=2, and then call getrnd50() once. When you go back up the recursion, simply print the missing number of each set.

import java.util.LinkedList;
import java.util.List;
import java.util.Random;

public class Rnd100 { // class name is illustrative
    List<Integer> lint;
    Random random;

    public void init() {
        random = new Random();
        lint = new LinkedList<>();
        for (int i = 1; i < 101; ++i) {
            lint.add(i); // for truly random results, this needs to be randomized.
        }
    }

    public int getRnd50() {
        return random.nextInt(50) + 1;
    }

    public int getRnd100() {
        int value = 0;
        if (lint.size() > 1) {
            int index = getRnd50() % lint.size();
            value = lint.remove(index);
        } else if (lint.size() == 1) {
            value = lint.remove(0);
        }
        return value;
    }
}

This calls getRnd50() exactly 99 times. It's not truly random though, since the numbers stored in the List of 100 integers are in sequence.
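To make the non-uniformity concrete, the sketch below (names are illustrative) enumerates which list indices `getRnd50() % size` can actually produce for a given list size. With the full list of 100 elements, only indices 1 through 50 are reachable, so with the list in ascending order the first number returned can only be one of 2-51.

```java
import java.util.TreeSet;

// Illustrative sketch: which indices "getRnd50() % size" can reach.
class ReachableIndex {
    // Collects every index produced by r % size for r in 1..50.
    static TreeSet<Integer> reachable(int size) {
        TreeSet<Integer> indices = new TreeSet<>();
        for (int r = 1; r <= 50; r++) {
            indices.add(r % size);
        }
        return indices;
    }

    public static void main(String[] args) {
        // For size 100 this prints the indices 1..50; index 0 and 51..99 never appear.
        System.out.println(reachable(100));
    }
}
```

Once the list has shrunk to 50 or fewer elements every index becomes reachable, although not necessarily with equal probability.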

(1) Create an array A initialized with {1,...,100}. Keep a variable 'length' holding the current length of this array.

(2) Create a random method that generates a number from 1 to length. Each call of this method calls getrnd50() no more than twice. Call the returned value 'index'.

(3) Output A[index], move A[length] into A[index], and decrement length.

(4) Repeat (2)-(3) until the array is empty.
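The steps above can be sketched as follows. Two things here are assumptions rather than part of the answer: `getrnd50()` is a stand-in for the given method, and `uniform(length)` uses rejection sampling to stay unbiased, which means it occasionally needs more than the one or two calls per number that step (2) describes.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative sketch of the select-and-swap-with-last steps.
class SelectAndSwap {
    static final Random RNG = new Random();

    // Stand-in for the given generator: uniform in 1..50.
    static int getrnd50() {
        return 1 + RNG.nextInt(50);
    }

    // Uniform value in [1, length] for 1 <= length <= 100, by rejection sampling.
    static int uniform(int length) {
        if (length == 1) return 1; // nothing to choose, no call needed
        while (true) {
            int r, span;
            if (length <= 50) {
                r = getrnd50();                          // uniform in 1..50
                span = 50;
            } else {
                r = getrnd50() + (getrnd50() % 2) * 50;  // uniform in 1..100
                span = 100;
            }
            int limit = span - span % length; // largest multiple of length <= span
            if (r <= limit) return (r - 1) % length + 1;
            // otherwise r would bias the result: discard it and draw again
        }
    }

    static int[] shuffled() {
        int[] a = new int[100];
        for (int i = 0; i < 100; i++) a[i] = i + 1;
        int[] out = new int[100];
        int length = 100;
        for (int k = 0; k < 100; k++) {
            int index = uniform(length) - 1; // 0-based position of the chosen element
            out[k] = a[index];               // output the chosen element
            a[index] = a[length - 1];        // fill the gap with the last live element
            length--;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(shuffled()));
    }
}
```

The last remaining number costs no call at all, which matches the interviewer's hint about count reaching 99.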
