Cartesian product to get sets of indices pointing to unique elements in a NumPy array

Question

What's a good way to get the combinations of indices that point to unique elements in an array? For example, for a = [1,1,3,2] the possible sets of pointers would be {0,2,3} and {1,2,3}.

I can use argsort, split the sorted indices by element frequency, and then use something like itertools.product to get all the index sets I want.

This is what I tried:

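from itertools import product
import numpy as np

a = np.array([1, 1, 3, 2])
# Count each distinct value (originally scipy.stats.itemfreq, which recent SciPy releases no longer provide)
_, counts = np.unique(a, return_counts=True)
# Cumulative counts mark where each group of equal values ends in sorted order
bounds = np.cumsum(counts)
# Splitting at the final bound leaves an empty trailing group, hence the len() filter
groups = np.split(a.argsort(kind="stable"), bounds)
print(list(product(*(ptrs.tolist() for ptrs in groups if len(ptrs)))))
#> [(0, 3, 2), (1, 3, 2)]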

How can I do this better?

Answer 1

This does get you the indices, but possibly not in the format you want:

[np.where(a==x) for x in np.unique(a)]

[(array([0, 1]),), (array([3]),), (array([2]),)]

I imagine there is a better way that avoids the for loop.
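As a minimal sketch, the per-value index arrays above can be turned into the index sets the question asks for by feeding them to itertools.product. The names groups and index_sets are illustrative, and the [0] simply unwraps the one-element tuple that np.where returns.

```python
from itertools import product
import numpy as np

a = np.array([1, 1, 3, 2])

# One index array per distinct value, as in the list comprehension above.
groups = [np.where(a == x)[0] for x in np.unique(a)]

# Pick one index from every group; each tuple points at one copy of each value.
index_sets = [tuple(int(i) for i in combo) for combo in product(*groups)]
print(index_sets)  # [(0, 3, 2), (1, 3, 2)]
```

This still loops over the distinct values in Python, so it only changes the output format, not the cost of building the groups.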

Answer 2

@atomh33ls's answer can be vectorized as follows.

First, extract the inverse indices and the counts of each unique item. If you are using numpy >= 1.9:

_, idx, cnt = np.unique(a, return_inverse=True, return_counts=True)

In older versions, this does the same:

_, idx = np.unique(a, return_inverse=True)
cnt = np.bincount(idx)
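A quick check, using the array from the question, that the two variants agree. This is only a sketch: cnt_old is an illustrative name, and the ravel() call is a precaution so that idx is one-dimensional before it reaches np.bincount.

```python
import numpy as np

a = np.array([1, 1, 3, 2])

# Newer NumPy: inverse indices and counts in one call.
_, idx, cnt = np.unique(a, return_inverse=True, return_counts=True)
idx = idx.ravel()

# Older NumPy: derive the counts from the inverse indices.
cnt_old = np.bincount(idx)

print(idx.tolist())                    # [0, 0, 2, 1]
print(cnt.tolist(), cnt_old.tolist())  # [2, 1, 1] [2, 1, 1]
```

Here idx[i] is the position of a[i] among the sorted unique values, so counting how often each position occurs gives the frequency of each value.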

And now, a little bit of magic and, voila:

>>> np.split(np.arange(len(a))[np.argsort(idx)], np.cumsum(cnt)[:-1])
[array([0, 1]), array([3]), array([2])]
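To get from these groups to the index sets in the question, one option is to pass them to itertools.product. The sketch below wraps the steps in a hypothetical helper, unique_index_sets, which is not part of NumPy. It asks argsort for a stable sort so that equal values keep their original order, which is an assumption added here rather than something the answer specifies.

```python
from itertools import product
import numpy as np


def unique_index_sets(a):
    """Yield tuples holding one index for each distinct value in `a`."""
    a = np.asarray(a).ravel()
    _, idx, cnt = np.unique(a, return_inverse=True, return_counts=True)
    # A stable sort keeps the indices of equal values in ascending order.
    order = np.argsort(idx.ravel(), kind="stable")
    # Cut the sorted positions at the cumulative counts: one group per value.
    groups = np.split(order, np.cumsum(cnt)[:-1])
    # Choose one index from every group; plain lists give plain Python ints.
    for combo in product(*(g.tolist() for g in groups)):
        yield combo


print(list(unique_index_sets([1, 1, 3, 2])))  # [(0, 3, 2), (1, 3, 2)]
```

The number of tuples is the product of the counts, so for arrays with many repeated values it is usually better to iterate over the generator than to build the full list.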


Answer 1
Score 3, accepted, 2014-09-29 12:56:01

Answer 2
Score 1, 2014-09-29 15:06:40
