Why does random.shuffle fail on numpy lists?
I have an array of row vectors, upon which I run random.shuffle:
#!/usr/bin/env python
import random
import numpy as np

zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                [0.6, 0.7, 0.8, 0.9, 1. ]])
iterations = 100000
f = 0
for _ in range(iterations):
    random.shuffle(zzz)
    if np.array_equal(zzz[0], zzz[1]):
        print(zzz)
        f += 1
print(float(f)/float(iterations))
Between 99.6% and 100% of the time, calling random.shuffle on zzz leaves the array with two identical rows, e.g.:
$ ./test.py
...
[[ 0.1 0.2 0.3 0.4 0.5]
[ 0.1 0.2 0.3 0.4 0.5]]
0.996
Using numpy.random.shuffle appears to pass this test and shuffle the row vectors correctly. I'm curious to know why random.shuffle fails.
If you look at the code of random.shuffle, it performs swaps in the following way:
x[i], x[j] = x[j], x[i]
which for a numpy.array fails silently, without raising any error. Example:
>>> zzz[1], zzz[0] = zzz[0], zzz[1]
>>> zzz
array([[0.1, 0.2, 0.3, 0.4, 0.5],
[0.1, 0.2, 0.3, 0.4, 0.5]])
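The reason is that Python first evaluates the right-hand side completely and then makes the assignments one after another (this is why a single-line swap works in Python). With a numpy array the right-hand side is still evaluated first, but indexing a row returns a view onto the array's data rather than an independent object, so the first assignment overwrites the data that the second right-hand-side value still refers to.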
numpy:
>>> arr = np.array([[1],[1]])
>>> arr[0], arr[1] = arr[0]+1, arr[0]
>>> arr
array([[2],
[2]])
Python:
>>> l = [1,1]
>>> l[0], l[1] = l[0]+1, l[0]
>>> l
[2, 1]
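The aliasing can be checked directly. Here is a minimal sketch (the names `a`, `top`, `row_is_view` and `swapped` are only illustrative, they are not from the post above) that uses np.shares_memory to confirm that a row index gives a view, and shows that copying the rows on the right-hand side makes the tuple swap behave the way it does for a list:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# Indexing a row returns a view that shares memory with the parent array.
top = a[0]
row_is_view = np.shares_memory(top, a)

# Copying the rows on the right-hand side decouples them from `a`,
# so the tuple assignment exchanges the rows as intended.
a[0], a[1] = a[1].copy(), a[0].copy()
swapped = a.tolist()
print(row_is_view, swapped)
```

The explicit copies cost extra memory, so this is meant as a demonstration of the cause rather than a recommended way to swap rows.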
Try it like this:
#!/usr/bin/env python
import random
import numpy as np

zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                [0.6, 0.7, 0.8, 0.9, 1. ]])
iterations = 100000
f = 0
for _ in range(iterations):
    random.shuffle(zzz[0])
    random.shuffle(zzz[1])
    if np.array_equal(zzz[0], zzz[1]):
        print(zzz)
        f += 1
print(float(f)/float(iterations))
In [200]: zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
...: [0.6, 0.7, 0.8, 0.9, 1. ]])
...:
In [201]: zl = zzz.tolist()
In [202]: zl
Out[202]: [[0.1, 0.2, 0.3, 0.4, 0.5], [0.6, 0.7, 0.8, 0.9, 1.0]]
random.shuffle is probably using an in-place assignment like:
In [203]: zzz[0],zzz[1]=zzz[1],zzz[0]
In [204]: zzz
Out[204]:
array([[0.6, 0.7, 0.8, 0.9, 1. ],
[0.6, 0.7, 0.8, 0.9, 1. ]])
Note the replication.
But applied to a list of lists:
In [205]: zl[0],zl[1]=zl[1],zl[0]
In [206]: zl
Out[206]: [[0.6, 0.7, 0.8, 0.9, 1.0], [0.1, 0.2, 0.3, 0.4, 0.5]]
In [207]: zl[0],zl[1]=zl[1],zl[0]
In [208]: zl
Out[208]: [[0.1, 0.2, 0.3, 0.4, 0.5], [0.6, 0.7, 0.8, 0.9, 1.0]]
I tested zl = list(zzz) and still got the array behavior. This zl is a list with views of zzz. tolist makes a list of lists that's totally independent of zzz.
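To make the difference between the two conversions concrete, here is a small sketch (the variable names `as_views`, `as_lists` and `views_share` are mine, and the array is shortened for brevity). It checks memory sharing with np.shares_memory and then writes through each container:

```python
import numpy as np

zzz = np.array([[0.1, 0.2], [0.6, 0.7]])

as_views = list(zzz)     # list of 1-D arrays that are views into zzz
as_lists = zzz.tolist()  # nested Python lists holding independent floats

views_share = all(np.shares_memory(row, zzz) for row in as_views)

# Writing through a view changes the original array...
as_views[0][0] = -1.0
# ...while writing to the tolist() result leaves it untouched.
as_lists[1][0] = 99.0
print(views_share, zzz)
```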
In short, random.shuffle cannot handle in-place modifications of an ndarray correctly. np.random.shuffle is designed to work along the first dimension of an array, so it gets it right.
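As a rough check of that claim, the test from the question can be rerun with NumPy's own shuffle. This sketch assumes NumPy 1.17 or later for np.random.default_rng. The seed, the reduced iteration count and the names `duplicates` and `rows_preserved` are arbitrary choices of mine:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
                [0.6, 0.7, 0.8, 0.9, 1.0]])
original_rows = sorted(zzz.tolist())

duplicates = 0
for _ in range(1000):
    rng.shuffle(zzz)  # in-place shuffle along the first axis
    if np.array_equal(zzz[0], zzz[1]):
        duplicates += 1

# The same two rows should still be present, in some order.
rows_preserved = sorted(zzz.tolist()) == original_rows
print(duplicates, rows_preserved)
```

The legacy np.random.shuffle(zzz) call used in the question should behave the same way for this purpose.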
The correct assignment for an ndarray is:
In [211]: zzz = np.array([[0.1, 0.2, 0.3, 0.4, 0.5],
...: [0.6, 0.7, 0.8, 0.9, 1. ]])
...:
In [212]: zzz[[0,1]] = zzz[[1,0]]
In [213]: zzz
Out[213]:
array([[0.6, 0.7, 0.8, 0.9, 1. ],
[0.1, 0.2, 0.3, 0.4, 0.5]])
In [214]: zzz[[0,1]] = zzz[[1,0]]
In [215]: zzz
Out[215]:
array([[0.1, 0.2, 0.3, 0.4, 0.5],
[0.6, 0.7, 0.8, 0.9, 1. ]])
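If you want to stay with the random module, the same fancy-index swap can be dropped into a hand-written shuffle loop. This is only a sketch modeled on the swap line quoted earlier in the thread, not the actual CPython source, and `shuffle_rows` is a hypothetical helper name:

```python
import random
import numpy as np

def shuffle_rows(a):
    """Shuffle the rows of `a` in place with a Fisher-Yates pass."""
    for i in range(len(a) - 1, 0, -1):
        j = random.randint(0, i)  # randint is inclusive on both ends
        # Fancy indexing on the right-hand side produces a copy,
        # so the two rows are exchanged without aliasing.
        a[[i, j]] = a[[j, i]]

data = np.arange(12).reshape(4, 3)
expected_rows = sorted(data.tolist())
shuffle_rows(data)
print(data)
```

In practice np.random.shuffle (or indexing with a random permutation) is likely simpler, but the loop shows that the only thing that has to change is the swap itself.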