
Shuffle columns of an array with NumPy

Let's say I have an array r of dimension (n, m). I would like to shuffle the columns of that array.

If I use numpy.random.shuffle(r) it shuffles the rows. How can I shuffle only the columns, so that the first column becomes the second one, the third becomes the first, and so on, randomly?

Example:

input:

array([[  1,  20, 100],
       [  2,  31, 401],
       [  8,  11, 108]])

output:

array([[ 20,   1, 100],
       [ 31,   2, 401],
       [ 11,   8, 108]])

One approach is to shuffle the transposed array:

 np.random.shuffle(np.transpose(r))

Another approach (see YXD's answer https://stackoverflow.com/a/20546567/1787973 ) is to generate a random permutation of the column indices and retrieve the columns in that order:

 r = r[:, np.random.permutation(r.shape[1])]

Performance-wise, the second approach is faster.
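As a minimal sketch, the two approaches can be run side by side on the example array from the question. The seed and the helper name column_set are assumptions made for illustration only. The point to notice is that the first approach changes r in place, while the second builds a new array.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only to make the sketch repeatable

r = np.array([[1, 20, 100],
              [2, 31, 401],
              [8, 11, 108]])
original = r.copy()

# Approach 1: shuffle the transposed view; this modifies r in place.
np.random.shuffle(np.transpose(r))

# Approach 2: index the columns with a random permutation; this returns a new array.
r2 = original[:, np.random.permutation(original.shape[1])]

def column_set(a):
    """Hypothetical helper: the columns of a as a sorted list of tuples."""
    return sorted(map(tuple, a.T.tolist()))

# Both results contain exactly the original columns, possibly in a new order.
same_columns_1 = column_set(r) == column_set(original)
same_columns_2 = column_set(r2) == column_set(original)
```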

For a general axis you could follow this pattern:

>>> import numpy as np
>>> 
>>> a = np.array([[  1,  20, 100, 4],
...               [  2,  31, 401, 5],
...               [  8,  11, 108, 6]])
>>> 
>>> print(a[:, np.random.permutation(a.shape[1])])
[[  4   1  20 100]
 [  5   2  31 401]
 [  6   8  11 108]]
>>> 
>>> print(a[np.random.permutation(a.shape[0]), :])
[[  1  20 100   4]
 [  2  31 401   5]
 [  8  11 108   6]]
>>> 
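The same indexing pattern could be wrapped so that it works for any axis of an array with any number of dimensions. The sketch below is one possible way to do that; the function name shuffle_along_axis and its rng parameter are hypothetical, and it assumes a NumPy version that provides np.random.default_rng.

```python
import numpy as np

def shuffle_along_axis(a, axis=0, rng=None):
    """Return a copy of `a` with its slices along `axis` in random order."""
    rng = np.random.default_rng() if rng is None else rng
    # One permutation of the indices along the chosen axis...
    order = rng.permutation(a.shape[axis])
    # ...applied to every slice at once, so whole slices move together.
    return np.take(a, order, axis=axis)

a = np.arange(24).reshape(2, 3, 4)
b = shuffle_along_axis(a, axis=2, rng=np.random.default_rng(0))
```

Because np.take returns a new array here, the input a is left unchanged.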

So, one step further from your answer:

Edit: I could easily be mistaken about how this works, so I'm inserting my understanding of the state of the matrix at each step.

r == 1 2 3
     4 5 6
     6 7 8

r = np.transpose(r)  

r == 1 4 6
     2 5 7
     3 6 8           # Columns are now rows

np.random.shuffle(r)

r == 2 5 7
     3 6 8 
     1 4 6           # Columns-as-rows are shuffled

r = np.transpose(r)  

r == 2 3 1
     5 6 4
     7 8 6           # Columns are columns again, shuffled.

which would then be back in the proper shape, with the columns rearranged.

The transpose of the transpose of a matrix is that matrix: [A^T]^T == A. So you'd need to do a second transpose after the shuffle (because a transpose is not a shuffle) for it to be in its proper shape again.

Edit: The OP's answer skips storing the transpositions and instead lets the shuffle operate on r as if it were transposed.
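The shorter form works because np.transpose returns a view onto the same data rather than a copy, so shuffling the view reorders r itself. A small sketch to check this, reusing the 3x3 matrix from the walkthrough (the variable names are illustrative only):

```python
import numpy as np

r = np.array([[1, 2, 3],
              [4, 5, 6],
              [6, 7, 8]])
before = r.copy()

view = np.transpose(r)               # no data is copied here
shares = np.shares_memory(view, r)   # the view and r use the same buffer

np.random.shuffle(view)              # reorders rows of the view, i.e. columns of r

# r keeps its shape, and each row holds the same values in a new common order.
rows_preserved = np.array_equal(np.sort(r, axis=1), np.sort(before, axis=1))
```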

In general, if you want to shuffle a numpy array along axis i:

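import numpy as np

def shuffle(x, axis=0):
    # Build an axis order that swaps axis 0 with the requested axis.
    t = np.arange(x.ndim)
    t[0] = axis
    t[axis] = 0
    # Work on a copy so the caller's array is left untouched.
    xt = np.transpose(x.copy(), t)
    np.random.shuffle(xt)  # shuffles along the first axis, in place
    # The swap is its own inverse, so the same order restores the layout.
    return np.transpose(xt, t)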

shuffle(array, axis=i)

>>> print(s0)
[[0. 1. 0. 1.]
 [0. 1. 0. 0.]
 [0. 1. 0. 1.]
 [0. 0. 0. 1.]]
>>> print(np.random.permutation(s0.T).T)
[[1. 0. 1. 0.]
 [0. 0. 1. 0.]
 [1. 0. 1. 0.]
 [1. 0. 0. 0.]]

np.random.permutation() permutes the rows (the first axis), so transposing before and after the call permutes the columns.
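One practical difference from np.random.shuffle is that np.random.permutation returns a shuffled copy and leaves its argument alone. A short sketch using the same s0 values (the names snapshot, shuffled and unchanged are illustrative):

```python
import numpy as np

s0 = np.array([[0., 1., 0., 1.],
               [0., 1., 0., 0.],
               [0., 1., 0., 1.],
               [0., 0., 0., 1.]])
snapshot = s0.copy()

# permutation() shuffles a copy of s0.T along its first axis;
# transposing the result back gives s0 with its columns reordered.
shuffled = np.random.permutation(s0.T).T

unchanged = np.array_equal(s0, snapshot)  # s0 itself is not modified
```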

There is another way, which does not use transposition and is apparently faster:

np.take(r, np.random.permutation(r.shape[1]), axis=1, out=r)

CPU times: user 1.14 ms, sys: 1.03 ms, total: 2.17 ms
Wall time: 3.89 ms

The approach in other answers: np.random.shuffle(r.T)

CPU times: user 2.24 ms, sys: 0 ns, total: 2.24 ms
Wall time: 5.08 ms

I used r = np.arange(64*1000).reshape(64, 1000) as an input.
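Single-run timings like these can vary a lot between runs and machines, so it may be worth repeating the measurement. Below is a rough sketch with the standard timeit module on the same input; the function names and the repeat count are arbitrary choices, and the numbers it prints are not expected to match the ones above.

```python
import timeit
import numpy as np

r = np.arange(64 * 1000).reshape(64, 1000)

def via_take(a):
    # Column shuffle without transposition, written back into a.
    np.take(a, np.random.permutation(a.shape[1]), axis=1, out=a)

def via_transposed_shuffle(a):
    # Column shuffle by shuffling the transposed view in place.
    np.random.shuffle(a.T)

repeats = 20  # arbitrary; raise it for steadier averages
t_take = timeit.timeit(lambda: via_take(r), number=repeats) / repeats
t_shuffle = timeit.timeit(lambda: via_transposed_shuffle(r), number=repeats) / repeats

print(f"np.take:      {t_take * 1e3:.3f} ms per call")
print(f"shuffle(r.T): {t_shuffle * 1e3:.3f} ms per call")
```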
