Set of all combinations of vectors in Python

Question

I'm having trouble creating what amounts to the Cartesian product of an array of vectors in Python. I have code that gives all possible partitions of a number n over r variables and returns them as a numpy array. I would like to call that code an arbitrary number of times and then produce the set of all possible combinations of the resulting arrays.

To give an example, successive calls to the partition code (each with a different parameter set) might return:

array([[2,0],[1,1],[2,0]])
array([[1,0],[0,1]])
array([[0,0]])

What I'm looking for is a way to return the set

array([[2,0],[1,0],[0,0]])
array([[2,0],[0,1],[0,0]])
array([[1,1],[1,0],[0,0]])
.....

either as one overall array or one combination at a time (because of the obvious memory issues as the number being partitioned grows).

Previously I solved this problem using itertools.product and ran the code under PyPy. However, I have had to switch from PyPy to standard Python because other parts of the project need Numpy, and I'm trying to match the speed of the PyPy code by using Numpy. I've managed to get a rough version working, but the code spends so much time converting between data types to bootstrap a solution together that it's impractical to use.

I was wondering if anybody could give me a little guidance on how I should proceed with this in Python.

Thanks

Answer 1

This should get you started:

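import numpy as np
import itertools as it

def row_product(*arrays):
    # Row count of each input and the running offsets into the stacked array.
    lengths = np.array([x.shape[0] for x in arrays])
    positions = np.cumsum(lengths)

    # Split 0..N-1 into one block of row indices per input array.
    ranges = np.arange(positions[-1])
    ranges = np.split(ranges, positions[:-1])

    # Stack all inputs so a single index can select a row from any of them.
    total = np.concatenate(arrays, axis=0)

    # Cartesian product of the index blocks, one row of indices per combination.
    # np.int was removed in NumPy 1.24; np.intp is the native index type.
    inds = np.fromiter(it.chain.from_iterable(it.product(*ranges)), dtype=np.intp)
    inds = inds.reshape(-1, len(arrays))

    return np.take(total, inds, axis=0)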

The trailing dimension(s) of all the arrays must be the same.
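If it helps, that constraint can be checked up front. This is only a minimal sketch: check_trailing_shapes is a hypothetical helper name, not part of the function above or of NumPy, and it assumes "trailing" means every axis after the first.

```python
import numpy as np

def check_trailing_shapes(*arrays):
    """Return the shared trailing shape, or raise ValueError if the inputs disagree.

    Hypothetical helper: "trailing" means every axis after the first.
    """
    trailing = {a.shape[1:] for a in arrays}
    if len(trailing) != 1:
        raise ValueError(f"trailing shapes differ: {sorted(trailing)}")
    return trailing.pop()

a = np.array([[2, 0], [1, 1], [2, 0]])  # shape (3, 2)
b = np.array([[1, 0], [0, 1]])          # shape (2, 2)
c = np.array([[0, 0]])                  # shape (1, 2)

# All three inputs share the trailing shape (2,), so they can be combined.
trailing = check_trailing_shapes(a, b, c)
```

The number of rows may differ between inputs; only the shape of each row has to agree, because the rows are all stacked into one array before indexing.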

Showing the results:

a=np.array([[2,0],[1,1],[2,0]])
b=np.array([[1,0],[0,1]])
c=np.array([[0,0]])

print(row_product(a, b, c))

[[[2 0]
  [1 0]
  [0 0]]

 [[2 0]
  [0 1]
  [0 0]]

 [[1 1]
  [1 0]
  [0 0]]

 [[1 1]
  [0 1]
  [0 0]]

 [[2 0]
  [1 0]
  [0 0]]

 [[2 0]
  [0 1]
  [0 0]]]

This is a 3D array where the unique combinations are in the last two axes. It seems to be reasonably fast: 1M unique combinations take about 1/6 of a second.
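If the full 3D array gets too large for memory, the same product can be produced lazily. The following is a rough sketch rather than a tuned solution: iter_row_product and row_product_chunks are names made up for this illustration, and the default chunk size is an arbitrary assumption. The first yields one combination at a time, the second yields blocks of combinations so that most of the work stays inside NumPy.

```python
import itertools as it
import numpy as np

def iter_row_product(*arrays):
    """Yield one combination at a time as an array of shape (len(arrays), ...)."""
    for rows in it.product(*arrays):
        yield np.stack(rows)

def row_product_chunks(*arrays, chunk=100_000):
    """Yield the product in blocks of at most `chunk` combinations.

    Combinations come out in the same order as itertools.product
    (the last input varies fastest).
    """
    lengths = [len(a) for a in arrays]
    total = int(np.prod(lengths))
    for start in range(0, total, chunk):
        flat = np.arange(start, min(start + chunk, total))
        # Convert flat combination numbers into one row index per input array.
        idx = np.unravel_index(flat, lengths)
        yield np.stack([a[i] for a, i in zip(arrays, idx)], axis=1)

a = np.array([[2, 0], [1, 1], [2, 0]])
b = np.array([[1, 0], [0, 1]])
c = np.array([[0, 0]])

for block in row_product_chunks(a, b, c, chunk=4):
    print(block.shape)
```

Each block has shape (k, number of inputs, trailing dims) with k no larger than chunk, so peak memory is bounded by the chunk size rather than by the total number of combinations.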
