
How does this shuffling with Math rand work?

I saw this code to shuffle a list:

public static void shuffle(List<Integer> numbers) {
    if (numbers == null || numbers.isEmpty()) return;
    for (int i = 0; i < numbers.size(); ++i) {
        int index = (int) (i + Math.random() * (numbers.size() - i));
        swap(numbers, i, index);
    }
}

The code seems to work, but I don't understand this snippet:

            int index = (int) (i + Math.random()*(numbers.size() - i));

Basically it is i + R*(n - i), but how does this ensure that: i) we won't get an out-of-bounds index, or ii) I won't be swapping an element with itself (i.e. index == i), so that the shuffle would not be that random?

Math.random() returns a uniform random number in the interval [0, 1), and multiplying by numbers.size() - i ideally scales that number to the interval [0, numbers.size() - i). For example, if i is 2 and the size of the list is 5, a random number in the interval [0, 3) is chosen this way, in the ideal case. Finally, i is added to the number and the (int) cast discards the number's fractional part. Thus, in this example, a random integer in [2, 5) (that is, 2, 3, or 4) is generated, so that at each iteration the element at index i swaps with itself or with an element that follows it.

However, there is an important subtlety here. Due to the nature of floating-point numbers and rounding error when scaling the number, in extremely rare cases the output of Math.random()*(numbers.size() - i) might be equal to numbers.size() - i, even though Math.random() never returns 1. In addition, rounding error can cause the idiom Math.random()*(numbers.size() - i) to favor some results over others. For example, this happens whenever 2^53 is not divisible by numbers.size() - i, since Math.random() uses java.util.Random under the hood, and its algorithm generates numbers with 53 bits of precision. Because of this, Math.random() is not the best way to write this code, and the code could have used a method specially made for generating random integers instead (such as the nextInt method of java.util.Random). See also this question and this question.

EDIT: As it turns out, the Math.random() * integer idiom cannot return integer itself, at least when integer is any positive int and the round-to-nearest rounding mode is used, as in Java. See this question.
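The suggestion above to use nextInt can be sketched as follows. This is only an illustration, not the original poster's code: the class name NextIntShuffle and the rng parameter are made up for the example, and Collections.swap stands in for the swap helper that the question does not show.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical class name, used only for this sketch.
class NextIntShuffle {
    // Same loop as in the question, but the index comes from
    // Random.nextInt(bound), which returns a uniform int in [0, bound).
    static void shuffle(List<Integer> numbers, Random rng) {
        if (numbers == null || numbers.isEmpty()) return;
        for (int i = 0; i < numbers.size(); ++i) {
            // i + [0, size - i) gives an index in [i, size - 1].
            int index = i + rng.nextInt(numbers.size() - i);
            // Collections.swap replaces the question's unshown swap helper.
            Collections.swap(numbers, i, index);
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        shuffle(numbers, new Random());
        System.out.println(numbers); // some permutation of 1..5
    }
}
```

Because nextInt works on integers directly, there is no floating-point scaling step and no cast to reason about.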

  1. You have a list of 50 ints, say 1 to 50.

  2. Get a random value from 0 to 49 inclusive to index it. Say it is 30.

  3. Get item at index 30.

  4. Now replace item at index 30 with item at index 49.

  5. Next time generate a number between 0 and 48 inclusive. 49 will never be reached and the number that was there occupies the slot of the last number used.

  6. Continue this process until you've exhausted the list.

Note that the expression (int)(Math.random() * n) generates a random integer between 0 and n - 1 inclusive, because Math.random() generates a number from 0 (inclusive) to 1 (exclusive).
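The steps above could look roughly like this in code. It is a sketch under assumptions: the class name DrawWithoutReplacement and the method drawAll are invented for the example, and the input list is overwritten while items are drawn.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class DrawWithoutReplacement {
    // Returns the items in random order by repeatedly picking one of the
    // remaining items. The input list is modified while drawing.
    static List<Integer> drawAll(List<Integer> items) {
        List<Integer> drawn = new ArrayList<>();
        for (int remaining = items.size(); remaining > 0; --remaining) {
            // Random index from 0 to remaining - 1 inclusive.
            int pick = (int) (Math.random() * remaining);
            drawn.add(items.get(pick));
            // Move the last remaining item into the slot just used, so the
            // next round only has to consider indexes 0 .. remaining - 2.
            items.set(pick, items.get(remaining - 1));
        }
        return drawn;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int value = 1; value <= 50; ++value) {
            items.add(value);
        }
        System.out.println(drawAll(items));
    }
}
```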

Math.random() always returns a floating-point number between 0 (inclusive) and 1 (exclusive). So when you do Math.random()*(numbers.size() - i), the result will always be between 0 (inclusive) and n - i (exclusive), where n is numbers.size().

Then you add i to it in i + Math.random()*(numbers.size() - i) .

Now the result, as you can see, will be between i (inclusive) and n (exclusive).

After that, you are casting it to an int. When you cast a double to an int, you truncate it, so now the value of index will be somewhere from i to n - 1 (inclusive for both).

Therefore, you will not get an IndexOutOfBoundsException, since index will always be at least 1 less than the size of the list.

However, the value of index could be equal to i, so yes, you are right in that a number could be swapped with itself and stay right there. That's perfectly fine.
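One way to convince yourself of the range is to compute the index many times and record the smallest and largest values seen. This is an empirical check rather than a proof, and the names IndexRangeCheck and randomIndex are made up for the sketch.

```java
// Hypothetical class name, used only for this sketch.
class IndexRangeCheck {
    // Computes the index exactly as in the question, for position i in a
    // list of size n.
    static int randomIndex(int i, int n) {
        return (int) (i + Math.random() * (n - i));
    }

    public static void main(String[] args) {
        int n = 5;
        for (int i = 0; i < n; ++i) {
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            for (int trial = 0; trial < 100_000; ++trial) {
                int index = randomIndex(i, n);
                min = Math.min(min, index);
                max = Math.max(max, index);
            }
            // Every observed index should lie between i and n - 1.
            System.out.println("i=" + i + " observed index range: [" + min + ", " + max + "]");
        }
    }
}
```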

Instead of using such a custom method, I recommend you use OOTB Collections.shuffle . Check this to understand the logic implemented for Collections.shuffle .
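A minimal usage sketch of Collections.shuffle follows. The class name BuiltInShuffle and the helper shuffledCopy are invented for the example; the overload that takes a Random is used here so that a seed can be supplied when repeatable results are wanted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical class name, used only for this sketch.
class BuiltInShuffle {
    // Returns a shuffled copy and leaves the original list untouched.
    static List<Integer> shuffledCopy(List<Integer> numbers, Random rng) {
        List<Integer> copy = new ArrayList<>(numbers);
        Collections.shuffle(copy, rng);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(shuffledCopy(numbers, new Random()));

        // Collections.shuffle(list) without a Random also works and
        // shuffles the list in place.
        List<Integer> inPlace = new ArrayList<>(numbers);
        Collections.shuffle(inPlace);
        System.out.println(inPlace);
    }
}
```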

Analysis of your code:

Math.random() returns a double value with a positive sign, greater than or equal to 0.0 and less than 1.0 .

Now, let's assume numbers.size() = 5 and dry run the for loop, taking the largest value Math.random() can produce (just under 1.0) in each iteration:

When i = 0, index = (int) (0 + Math.random()*(5 - 0)) is at most (int) (0 + 4.x) = 4
When i = 1, index = (int) (1 + Math.random()*(5 - 1)) is at most (int) (1 + 3.x) = 4
When i = 2, index = (int) (2 + Math.random()*(5 - 2)) is at most (int) (2 + 2.x) = 4
When i = 3, index = (int) (3 + Math.random()*(5 - 3)) is at most (int) (3 + 1.x) = 4
When i = 4, index = (int) (4 + Math.random()*(5 - 4)) is at most (int) (4 + 0.x) = 4

As you can see, the value of index never exceeds 4 in any iteration when numbers.size() = 5, and it is never smaller than i.

Your queries:

how does this ensure that: i) we won't get an out of bounds index

As already explained above using the dry run, it will never go out of bounds.

or ii) I won't be changing the same element's ie index == i and the shuffle would not be that random?

swap(numbers, i, index); swaps the element at index i with the element at the randomly chosen index, which can be i itself. Taking the extreme case where index comes out as 4 every time when numbers.size() = 5, this is illustrated with the following example:

Let's say numbers = [1, 2, 3, 4, 5]

When i = 0, numbers will become [5, 2, 3, 4, 1]
When i = 1, numbers will become [5, 1, 3, 4, 2]
When i = 2, numbers will become [5, 1, 2, 4, 3]
When i = 3, numbers will become [5, 1, 2, 3, 4]
When i = 4, numbers will become [5, 1, 2, 3, 4]

  1. int index = (int) (i + Math.random()*(numbers.size() - i)); - it is important to note that Math.random() generates a number in [0, 1). So index never goes out of bounds, because the exclusive upper limit is i + 1*(numbers.size() - i) = numbers.size().
  2. This point is valid, it can happen.


 