Why does random.shuffle fail on numpy arrays?

Question

I have an array of row vectors, on which I run random.shuffle:

#!/usr/bin/env python

import random
import numpy as np

zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                [0.6, 0.7, 0.8, 0.9, 1. ]])

iterations = 100000
f = 0
for _ in range(iterations):
    random.shuffle(zzz)
    if np.array_equal(zzz[0], zzz[1]):
        print(zzz)
        f += 1

print(float(f)/float(iterations))

Between 99.6% and 100% of the time, calling random.shuffle on zzz leaves it with two identical rows, e.g.:

$ ./test.py
...
[[ 0.1  0.2  0.3  0.4  0.5]
 [ 0.1  0.2  0.3  0.4  0.5]]
0.996

Using numpy.random.shuffle appears to pass this test and shuffle the row vectors correctly. I'm curious to know why random.shuffle fails.

Answer 1

If you look at the code of random.shuffle, it performs swaps in the following way:

x[i], x[j] = x[j], x[i]

which fails for a multi-dimensional numpy.array without raising any error. Example:

>>> zzz[1], zzz[0] = zzz[0], zzz[1]
>>> zzz
array([[0.1, 0.2, 0.3, 0.4, 0.5],
       [0.1, 0.2, 0.3, 0.4, 0.5]])

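The reason is that Python first evaluates the right-hand side completely and then performs the assignments (this is why a one-line swap works for lists). For a 2-D numpy array, however, zzz[0] and zzz[1] on the right-hand side are views into the array's memory, not independent copies. The first assignment overwrites the data that the other view still refers to, so the second assignment writes the same values back and both rows end up identical.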

numpy

>>> arr = np.array([[1],[1]])
>>> arr[0], arr[1] = arr[0]+1, arr[0]
>>> arr
array([[2],
       [2]])

Python

>>> l = [1,1]
>>> l[0], l[1] = l[0]+1, l[0]
>>> l
[2, 1]
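To make the view behaviour concrete, here is a minimal sketch. The names `row0`, `shares`, and the `swapped_*` variables are illustrative and not from the question. It checks that indexing a row of a 2-D array gives a view sharing memory with the parent, and shows that the one-line swap works once the right-hand side is copied first.

```python
import numpy as np

zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                [0.6, 0.7, 0.8, 0.9, 1.0]])

# Indexing a row of a 2-D array gives a view, not an independent copy.
row0 = zzz[0]
shares = np.shares_memory(row0, zzz)

# Copying the right-hand side first makes the one-line swap safe,
# because the copies no longer change when zzz is written to.
zzz[0], zzz[1] = zzz[1].copy(), zzz[0].copy()

swapped_first_row = zzz[0].tolist()
swapped_second_row = zzz[1].tolist()
print(shares, swapped_first_row, swapped_second_row)
```

random.shuffle does not make such copies, which is why it silently duplicates rows.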

Answer 2

Try it like this:

#!/usr/bin/env python

import random
import numpy as np

zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                [0.6, 0.7, 0.8, 0.9, 1. ]])

iterations = 100000
f = 0
for _ in range(iterations):
    random.shuffle(zzz[0])
    random.shuffle(zzz[1])
    if np.array_equal(zzz[0], zzz[1]):
        print(zzz)
        f += 1

print(float(f)/float(iterations))

Answer 3

In [200]: zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
     ...:                 [0.6, 0.7, 0.8, 0.9, 1. ]])
     ...:
In [201]: zl = zzz.tolist()
In [202]: zl
Out[202]: [[0.1, 0.2, 0.3, 0.4, 0.5], [0.6, 0.7, 0.8, 0.9, 1.0]]

random.shuffle is probably using an in-place assignment like:

In [203]: zzz[0],zzz[1]=zzz[1],zzz[0]                                                          
In [204]: zzz                                                                                  
Out[204]: 
array([[0.6, 0.7, 0.8, 0.9, 1. ],
       [0.6, 0.7, 0.8, 0.9, 1. ]])

Note the replication.

But applied to a list of lists:

In [205]: zl[0],zl[1]=zl[1],zl[0]                                                              
In [206]: zl                                                                                   
Out[206]: [[0.6, 0.7, 0.8, 0.9, 1.0], [0.1, 0.2, 0.3, 0.4, 0.5]]
In [207]: zl[0],zl[1]=zl[1],zl[0]                                                              
In [208]: zl                                                                                   
Out[208]: [[0.1, 0.2, 0.3, 0.4, 0.5], [0.6, 0.7, 0.8, 0.9, 1.0]]

I tested zl = list(zzz) and still got the array behavior. This zl is a list of views of zzz. tolist makes a list of lists that's totally independent of zzz.
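The difference between list(zzz) and zzz.tolist() can be checked directly. This is a small sketch; the names `views`, `nested`, and the result variables are only illustrative.

```python
import numpy as np

zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                [0.6, 0.7, 0.8, 0.9, 1.0]])

views = list(zzz)      # list of 1-D arrays that are views into zzz
nested = zzz.tolist()  # plain Python lists of floats, fully copied

views_share = np.shares_memory(views[0], zzz)

# A write through zzz is visible in the views but not in the nested lists.
zzz[0, 0] = 99.0
view_value = float(views[0][0])
nested_value = nested[0][0]
print(views_share, view_value, nested_value)
```

Because the elements of list(zzz) still point at zzz's buffer, swapping them element by element behaves like swapping rows of the array itself.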

In short, random.shuffle cannot handle in-place modification of an ndarray correctly. np.random.shuffle is designed to work along the first dimension of an array, so it gets it right.
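As a quick check of that claim, the following sketch repeats the test from the question with np.random.shuffle. The iteration count and the `original_rows` / `rows_after` names are arbitrary choices for illustration.

```python
import numpy as np

zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                [0.6, 0.7, 0.8, 0.9, 1.0]])
original_rows = {tuple(row) for row in zzz.tolist()}

duplicates = 0
for _ in range(1000):
    np.random.shuffle(zzz)  # in place, permutes rows along the first axis
    if np.array_equal(zzz[0], zzz[1]):
        duplicates += 1

# The same two rows should still be present, in some order.
rows_after = {tuple(row) for row in zzz.tolist()}
print(duplicates, rows_after == original_rows)
```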

The correct assignment for an ndarray is:

In [211]: zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
     ...:                 [0.6, 0.7, 0.8, 0.9, 1. ]])
     ...:
In [212]: zzz[[0,1]] = zzz[[1,0]]
In [213]: zzz
Out[213]:
array([[0.6, 0.7, 0.8, 0.9, 1. ],
       [0.1, 0.2, 0.3, 0.4, 0.5]])
In [214]: zzz[[0,1]] = zzz[[1,0]]
In [215]: zzz
Out[215]:
array([[0.1, 0.2, 0.3, 0.4, 0.5],
       [0.6, 0.7, 0.8, 0.9, 1. ]])
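Building on that assignment, a row shuffle can be written by hand. The sketch below is a Fisher-Yates loop with the swap done through index lists; `shuffle_rows` is a hypothetical helper written for this answer, not part of random or numpy.

```python
import random
import numpy as np

def shuffle_rows(arr):
    """Shuffle the rows of arr in place (hypothetical helper, Fisher-Yates)."""
    for i in reversed(range(1, len(arr))):
        j = random.randrange(i + 1)  # pick j with 0 <= j <= i
        # Indexing with a list copies the right-hand side before assignment,
        # so the two rows are exchanged without overwriting each other.
        arr[[i, j]] = arr[[j, i]]

zzz = np.arange(20, dtype=float).reshape(4, 5)
expected_rows = sorted(map(tuple, zzz.tolist()))

shuffle_rows(zzz)
actual_rows = sorted(map(tuple, zzz.tolist()))
print(actual_rows == expected_rows)
```

In practice np.random.shuffle already does this; the loop is only meant to show where the copy has to happen.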

Answer 1
score 4, accepted, 2020-02-10 00:54:28

Answer 2
score 0, 2020-02-10 00:40:01

Answer 3
score 0, 2020-02-10 00:59:56
