numpy array to permutation matrix

np.array([1,2,3])

I've got the numpy array above. I would like to turn it into a numpy array with a tuple for each 1:1 permutation. Like this:

np.array([
    [(1,1),(1,2),(1,3)],
    [(2,1),(2,2),(2,3)],
    [(3,1),(3,2),(3,3)],
])

Any thoughts on how to do this efficiently? I need to do this operation a few million times.

You can do something like this:

>>> a = np.array([1, 2, 3])
>>> n = a.size
>>> np.vstack((np.repeat(a, n), np.tile(a, n))).T.reshape(n, n, 2)
array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])
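
To see why the one-liner above works, it may help to look at the two intermediate arrays separately. This is only an illustrative sketch; the names `first`, `second`, `pairs` and `grid` are not from the original answer.

```python
import numpy as np

a = np.array([1, 2, 3])
n = a.size

# np.repeat repeats each element n times in place: 1 1 1 2 2 2 3 3 3
first = np.repeat(a, n)
# np.tile repeats the whole array n times:          1 2 3 1 2 3 1 2 3
second = np.tile(a, n)

# Stack as two rows, transpose to get one (first, second) pair per row,
# then fold the n*n pairs into an n-by-n grid of pairs.
pairs = np.vstack((first, second)).T
grid = pairs.reshape(n, n, 2)
```

Here `grid[i, j]` holds the pair `(a[i], a[j])`, which is the layout asked for in the question.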

Or, as suggested by @Jaime, you can get around a 10x speedup by taking advantage of broadcasting:

>>> a = np.array([1, 2, 3])
>>> n = a.size
>>> perm = np.empty((n, n, 2), dtype=a.dtype)
>>> perm[..., 0] = a[:, None]
>>> perm[..., 1] = a
>>> perm
array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])
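
Since the question mentions running this a few million times, the broadcasting version could be wrapped in a small helper. The following is a minimal sketch, assuming a 1-D input; the function name `pair_grid` is hypothetical and not part of numpy.

```python
import numpy as np

def pair_grid(a):
    """Return an (n, n, 2) array whose [i, j] entry is (a[i], a[j]).

    Assumes `a` is one-dimensional.
    """
    a = np.asarray(a)
    n = a.size
    out = np.empty((n, n, 2), dtype=a.dtype)
    # a[:, None] has shape (n, 1); broadcasting copies a[i] across row i.
    out[..., 0] = a[:, None]
    # a has shape (n,); broadcasting copies the whole array into every row.
    out[..., 1] = a
    return out

grid = pair_grid([1, 2, 3])
```

The two assignments write straight into the preallocated output, so no intermediate repeated or tiled copies are built, which is presumably where the speedup comes from.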

Timing comparisons:

>>> a = np.array([1, 2, 3]*100)
>>> n = a.size  # recompute n for the larger array
>>> %%timeit
... np.vstack((np.repeat(a, n), np.tile(a, n))).T.reshape(n, n, 2)
...
1000 loops, best of 3: 934 µs per loop
>>> %%timeit
... perm = np.empty((n, n, 2), dtype=a.dtype)
... perm[..., 0] = a[:, None]
... perm[..., 1] = a
...
10000 loops, best of 3: 111 µs per loop
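
The comparison above uses IPython's `%%timeit` magic. Outside IPython, a rough equivalent can be sketched with the standard `timeit` module. The function names below are illustrative, and the numbers will differ by machine and numpy version, so no expected timings are shown.

```python
import timeit
import numpy as np

a = np.array([1, 2, 3] * 100)
n = a.size

def with_repeat_tile():
    return np.vstack((np.repeat(a, n), np.tile(a, n))).T.reshape(n, n, 2)

def with_broadcasting():
    perm = np.empty((n, n, 2), dtype=a.dtype)
    perm[..., 0] = a[:, None]
    perm[..., 1] = a
    return perm

# Both approaches should build the same array before we compare speed.
assert np.array_equal(with_repeat_tile(), with_broadcasting())

for func in (with_repeat_tile, with_broadcasting):
    # Best of 3 repeats, 100 calls each, reported per call.
    best = min(timeit.repeat(func, number=100, repeat=3)) / 100
    print(f"{func.__name__}: {best * 1e6:.0f} µs per call")
```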

If you're working with numpy, don't work with tuples. Use its power and add another dimension of size two. My recommendation is:

x = np.array([1,2,3])
np.vstack(([np.vstack((x, x, x))], [np.vstack((x, x, x)).T])).T

or:

im = np.vstack((x, x, x))
np.vstack(([im], [im.T])).T

And for a general array:

ix = np.vstack([x for _ in range(x.shape[0])])
np.vstack(([ix], [ix.T])).T

This will produce what you want:

array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])

But as a 3D array, as you can see when looking at its shape:

Out[25]: (3L, 3L, 2L)

This is more efficient than the solution with permutations as the array size gets bigger. Timing my solution against @Kasra's yields 1 ms for mine vs. 46 ms for the one with permutations, for an array of size 100. @AshwiniChaudhary's solution is more efficient, though.

Yet another way, using numpy.meshgrid:

>>> x = np.array([1, 2, 3])
>>> perms = np.stack(np.meshgrid(x, x))
>>> perms
array([[[1, 2, 3],
        [1, 2, 3],
        [1, 2, 3]],

       [[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]]])
>>> perms.transpose().reshape(9, 2)
array([[1, 1],
       [1, 2],
       [1, 3],
       [2, 1],
       [2, 2],
       [2, 3],
       [3, 1],
       [3, 2],
       [3, 3]])
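
If the (n, n, 2) layout from the question is wanted directly, one possible variation (not from the original answer) is to pass `indexing="ij"` to `np.meshgrid` and stack along a new last axis, which avoids the transpose step:

```python
import numpy as np

x = np.array([1, 2, 3])

# indexing="ij" makes the first grid vary along rows, the second along columns.
first, second = np.meshgrid(x, x, indexing="ij")
# Stacking on a new last axis gives grid[i, j] == (x[i], x[j]).
grid = np.stack((first, second), axis=-1)
# Flatten to one pair per row, matching the (9, 2) output shown above.
flat = grid.reshape(-1, 2)
```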

I was looking into how to do this better in general, not just for 2-tuples. It can actually be done pretty elegantly using np.indices, which produces a set of indices that can be used to index the original array:

>>> x = np.array([1, 2, 3])
>>> i = np.indices((3, 3)).reshape(2, -1)
>>> x[i].T
array([[1, 1],
       [1, 2],
       [1, 3],
       [2, 1],
       [2, 2],
       [2, 3],
       [3, 1],
       [3, 2],
       [3, 3]])

The general case is done as follows: let n be the number of items in each permutation.

n = 5
x = np.arange(10)

i = np.indices([x.size for _ in range(n)]).reshape(n, -1)
a = x[i].T

Then you can reshape the result to the n-dimensional array form if needed, but often having the permutations is enough. I didn't test the performance of this method, but native numpy calls and indexing ought to be pretty quick. At least this is more elegant than the other solutions, in my opinion. It is also pretty similar to the meshgrid solution provided by @Bill.
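
As a sanity check on the general case, here is a minimal sketch that wraps the np.indices approach in a function and reshapes the result into the n-dimensional form. The name `tuples_of` is hypothetical, and a small input is used to keep the output manageable.

```python
from itertools import product
import numpy as np

def tuples_of(x, n):
    """All length-n tuples drawn from the 1-D array x, one per row."""
    x = np.asarray(x)
    # np.indices gives n index grids, each of shape (x.size,) * n.
    i = np.indices((x.size,) * n).reshape(n, -1)
    return x[i].T

x = np.array([1, 2, 3])
rows = tuples_of(x, 3)           # shape (27, 3)
cube = rows.reshape(3, 3, 3, 3)  # cube[i, j, k] == (x[i], x[j], x[k])

# The rows come out in the same order as itertools.product.
assert rows.tolist() == [list(t) for t in product(x.tolist(), repeat=3)]
```

Note that the result has `x.size ** n` rows, so memory grows quickly with n.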

You can use itertools.product to get the permutations, then convert the result to a numpy array.

>>> import numpy as np
>>> from itertools import product
>>> a = np.array([1, 2, 3])
>>> p = list(product(a, repeat=2))
>>> np.array([p[i:i+3] for i in range(0, len(p), 3)])
array([[[1, 1],
        [1, 2],
        [1, 3]],

       [[2, 1],
        [2, 2],
        [2, 3]],

       [[3, 1],
        [3, 2],
        [3, 3]]])
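
The slicing step above hard-codes the chunk size 3. A possible variation, sketched here under the assumption that the input is 1-D, is to convert the flat list of pairs once and let `reshape` do the grouping for any size:

```python
from itertools import product
import numpy as np

a = np.array([1, 2, 3])
n = a.size

# product yields the n*n pairs in row-major order, so a plain reshape
# is enough to fold them into the n-by-n grid of pairs.
grid = np.array(list(product(a, repeat=2))).reshape(n, n, 2)
```

This still builds every pair in Python before numpy sees it, so it is likely to remain slower than the pure numpy approaches for large arrays.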
