
Why does random.shuffle fail on numpy lists?

I have an array of row vectors, on which I run random.shuffle:

#!/usr/bin/env python

import random
import numpy as np

zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                [0.6, 0.7, 0.8, 0.9, 1. ]])

iterations = 100000
f = 0
for _ in range(iterations):
    random.shuffle(zzz)
    if np.array_equal(zzz[0], zzz[1]):
        print(zzz)
        f += 1

print(float(f)/float(iterations))

Between 99.6% and 100% of the time, calling random.shuffle on zzz leaves it with two identical rows, e.g.:

$ ./test.py
...
[[ 0.1  0.2  0.3  0.4  0.5]
 [ 0.1  0.2  0.3  0.4  0.5]]
0.996

Using numpy.random.shuffle appears to pass this test and shuffle the row vectors correctly. I'm curious to know why random.shuffle fails.
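For comparison, here is a minimal sketch of the same loop using numpy.random.shuffle, which shuffles along the first axis in place. The `original` copy and the `rows_preserved` flag are names added here for illustration only; they are not part of the question.

```python
import numpy as np

zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                [0.6, 0.7, 0.8, 0.9, 1.0]])
original = zzz.copy()  # kept only so the result can be checked afterwards

for _ in range(1000):
    np.random.shuffle(zzz)  # reorders the rows of zzz in place
    # The two rows should never collapse into duplicates of each other.
    assert not np.array_equal(zzz[0], zzz[1])

# After shuffling, zzz should still hold exactly the original rows, in some order.
rows_preserved = sorted(map(tuple, zzz)) == sorted(map(tuple, original))
print(rows_preserved)
```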

If you look at the code of random.shuffle, it performs swaps in the following way:

x[i], x[j] = x[j], x[i]

which for a numpy.array fails silently, without raising any error. Example:

>>> zzz[1], zzz[0] = zzz[0], zzz[1]
>>> zzz
array([[0.1, 0.2, 0.3, 0.4, 0.5],
       [0.1, 0.2, 0.3, 0.4, 0.5]])

The reason is that Python first evaluates the right-hand side completely and then makes the assignments (this is why the single-line swap works for lists). With a numpy array, however, the right-hand side holds views into the array's data rather than independent copies, so the first assignment overwrites the data that the second value still refers to.
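The swap can be replayed step by step to see where the data is lost. This is only a sketch: `rhs`, `shares`, `view_changed`, and the helper `swap_rows` are hypothetical names introduced for illustration, and `swap_rows` shows one possible workaround (an explicit temporary copy), not what random.shuffle does.

```python
import numpy as np

zzz = np.array([[0.1, 0.2, 0.3],
                [0.6, 0.7, 0.8]])

# Step 1: the right-hand side of "zzz[0], zzz[1] = zzz[1], zzz[0]".
rhs = (zzz[1], zzz[0])
shares = np.shares_memory(rhs[1], zzz)  # the row objects are views of zzz

# Step 2: the first assignment copies row 1's data into row 0.
zzz[0] = rhs[0]
# rhs[1] is a view of row 0, so it already shows the overwritten data.
view_changed = np.array_equal(rhs[1], [0.6, 0.7, 0.8])

# Step 3: the second assignment copies the overwritten row 0 into row 1.
zzz[1] = rhs[1]
print(zzz)  # both rows now hold the same values


def swap_rows(a, i, j):
    """Swap rows i and j of a in place, using an explicit temporary copy."""
    tmp = a[i].copy()  # an independent copy, not a view
    a[i] = a[j]
    a[j] = tmp


fixed = np.array([[0.1, 0.2, 0.3],
                  [0.6, 0.7, 0.8]])
swap_rows(fixed, 0, 1)
print(fixed)
```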

numpy array:

>>> arr = np.array([[1],[1]])
>>> arr[0], arr[1] = arr[0]+1, arr[0]
>>> arr
array([[2],
       [2]])

Python list:

>>> l = [1,1]
>>> l[0], l[1] = l[0]+1, l[0]
>>> l
[2, 1]

Try it like this:

#!/usr/bin/env python

import random
import numpy as np

zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                [0.6, 0.7, 0.8, 0.9, 1. ]])

iterations = 100000
f = 0
for _ in range(iterations):
    random.shuffle(zzz[0])
    random.shuffle(zzz[1])
    if np.array_equal(zzz[0], zzz[1]):
        print(zzz)
        f += 1

print(float(f)/float(iterations))

In [200]: zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
     ...:                 [0.6, 0.7, 0.8, 0.9, 1. ]])
     ...:
In [201]: zl = zzz.tolist()
In [202]: zl
Out[202]: [[0.1, 0.2, 0.3, 0.4, 0.5], [0.6, 0.7, 0.8, 0.9, 1.0]]

random.shuffle is probably using an in-place assignment like:

In [203]: zzz[0],zzz[1]=zzz[1],zzz[0]                                                          
In [204]: zzz                                                                                  
Out[204]: 
array([[0.6, 0.7, 0.8, 0.9, 1. ],
       [0.6, 0.7, 0.8, 0.9, 1. ]])

Note the replication.

But applied to a list of lists:

In [205]: zl[0],zl[1]=zl[1],zl[0]                                                              
In [206]: zl                                                                                   
Out[206]: [[0.6, 0.7, 0.8, 0.9, 1.0], [0.1, 0.2, 0.3, 0.4, 0.5]]
In [207]: zl[0],zl[1]=zl[1],zl[0]                                                              
In [208]: zl                                                                                   
Out[208]: [[0.1, 0.2, 0.3, 0.4, 0.5], [0.6, 0.7, 0.8, 0.9, 1.0]]

I tested zl = list(zzz) and still got the array behavior. This zl is a list of views of zzz. tolist makes a list of lists that is totally independent of zzz.
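The difference between the two conversions can be checked directly. In this sketch, `views`, `plain`, and the three flags are illustrative names that do not appear in the session above; it only verifies the memory-sharing claim.

```python
import numpy as np

zzz = np.array([[0.1, 0.2],
                [0.6, 0.7]])

views = list(zzz)     # a list of 1-D arrays, each a view of one row of zzz
plain = zzz.tolist()  # nested Python lists of floats, detached from zzz

views_share = all(np.shares_memory(v, zzz) for v in views)

views[0][0] = 9.9                  # writes through to zzz
wrote_through = zzz[0, 0] == 9.9

plain[0][0] = -1.0                 # changes only the Python list
independent = zzz[0, 0] == 9.9     # zzz is unaffected

print(views_share, wrote_through, independent)
```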

In short, random.shuffle cannot handle in-place modification of an ndarray correctly. np.random.shuffle is designed to work along the first dimension of an array, so it gets it right.

The correct assignment for an ndarray is:

In [211]: zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5], 
     ...:                 [0.6, 0.7, 0.8, 0.9, 1. ]]) 
     ...:                                                                                      
In [212]: zzz[[0,1]] = zzz[[1,0]]                                                              
In [213]: zzz                                                                                  
Out[213]: 
array([[0.6, 0.7, 0.8, 0.9, 1. ],
       [0.1, 0.2, 0.3, 0.4, 0.5]])
In [214]: zzz[[0,1]] = zzz[[1,0]]                                                              
In [215]: zzz                                                                                  
Out[215]: 
array([[0.1, 0.2, 0.3, 0.4, 0.5],
       [0.6, 0.7, 0.8, 0.9, 1. ]])
