Efficiently computing ordered permutations in a numpy array

Question

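I have a numpy array. What is the fastest way to compute all of its ordered permutations?

What I mean is: given the first element of my array, I want a list of all the elements that sequentially follow it. Then, given the second element, a list of all the elements that follow it.

So for my list: b, c & d follow a; c & d follow b; and d follows c.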

x = np.array(["a", "b", "c", "d"])

So a potential output would look like:

[
    ["a","b"],
    ["a","c"],
    ["a","d"],

    ["b","c"],
    ["b","d"],

    ["c","d"],
]

I need to do this several million times, so I am looking for an efficient solution.

I tried something like:

im = np.vstack([x]*len(x))
a = np.vstack(([im], [im.T])).T
results = a[np.triu_indices(len(x),1)]

But it is actually slower than looping...
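For reference, the loop being compared against is not shown in the question. A minimal sketch of what such a baseline might look like, assuming a plain nested index loop (the name `pairs_loop` is hypothetical):

```python
import numpy as np

def pairs_loop(x):
    # Hypothetical baseline: pair each element with every element after it.
    n = len(x)
    return [[x[i], x[j]] for i in range(n) for j in range(i + 1, n)]

x = np.array(["a", "b", "c", "d"])
pairs = pairs_loop(x)
```

For an array of length n this produces n * (n - 1) / 2 pairs, so any approach has to write at least that many rows.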

Answer 1

You can use functions from itertools, such as chain.from_iterable and combinations, together with np.fromiter. This does not involve a loop in Python, but it is still not a pure NumPy solution:

>>> from itertools import combinations, chain
>>> arr = np.fromiter(chain.from_iterable(combinations(x, 2)), dtype=x.dtype)
>>> arr.reshape(arr.size // 2, 2)  # integer division, so this also works on Python 3
array([['a', 'b'],
       ['a', 'c'],
       ['a', 'd'],
       ..., 
       ['b', 'c'],
       ['b', 'd'],
       ['c', 'd']], 
      dtype='|S1')

Timing comparison:

>>> x = np.array(["a", "b", "c", "d"]*100)
>>> %%timeit
    im = np.vstack([x]*len(x))
    a = np.vstack(([im], [im.T])).T
    results = a[np.triu_indices(len(x),1)]
... 
10 loops, best of 3: 29.2 ms per loop
>>> %%timeit
    arr = np.fromiter(chain.from_iterable(combinations(x, 2)), dtype=x.dtype)
    arr.reshape(arr.size // 2, 2)
... 
100 loops, best of 3: 6.63 ms per loop
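As a script-style variant of the same idea, here is a sketch that also passes `count` to `np.fromiter` so the output can be preallocated. It uses an integer array purely for illustration, and the function name `pairs_fromiter` is an assumption, not part of the answer above:

```python
from itertools import chain, combinations

import numpy as np

def pairs_fromiter(x):
    n = len(x)
    # n * (n - 1) / 2 pairs with 2 entries each gives n * (n - 1) flat entries.
    flat = np.fromiter(
        chain.from_iterable(combinations(x, 2)),
        dtype=x.dtype,
        count=n * (n - 1),
    )
    # -1 lets NumPy infer the number of rows.
    return flat.reshape(-1, 2)

x = np.arange(4)
result = pairs_fromiter(x)
```

Whether `count` makes a measurable difference depends on the array size, so it is worth timing on your own data.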

Answer 2

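I've been browsing the source, and it looks like the tri functions have had some very substantial improvements recently. The file is all Python, so you can just copy it into your directory if that helps.

With that in mind, my timings seem to be completely different from Ashwini Chaudhary's.

It is important to know the size of the arrays you want to do this on; if it is small, you should cache intermediate results such as the output of triu_indices.

The fastest code for me was: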

import numpy

def triangalize_1(x):
    # Indices of the strict upper triangle (k=1 skips the diagonal).
    xs, ys = numpy.triu_indices(len(x), 1)
    return numpy.array([x[xs], x[ys]]).T

unless the x array is small.
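To make the indexing trick concrete, here is a small sketch of what `triu_indices` returns for the four-element example from the question, and how those indices select the pairs. `np.column_stack` is used here as an alternative way to build the two-column result; it is not what the answer's code uses.

```python
import numpy as np

x = np.array(["a", "b", "c", "d"])

# Row and column indices of the strict upper triangle of a 4x4 matrix.
rows, cols = np.triu_indices(len(x), 1)
# rows -> [0 0 0 1 1 2]
# cols -> [1 2 3 2 3 3]

# Fancy indexing picks the first and second element of every pair.
pairs = np.column_stack((x[rows], x[cols]))
```

Each (row, col) pair with row < col corresponds to one "element followed by a later element" combination.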

If x is small, caching is best:

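import numpy

# Maps array length -> (xs, ys) index arrays from triu_indices.
triu_cache = {}
def triangalize_1(x):
    if len(x) in triu_cache:
        # Reuse the indices computed earlier for this length.
        xs, ys = triu_cache[len(x)]
    else:
        xs, ys = numpy.triu_indices(len(x), 1)
        triu_cache[len(x)] = xs, ys

    return numpy.array([x[xs], x[ys]]).T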

Because of the memory requirements, I wouldn't do this for large x.
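One way to keep the cache from growing without limit is to bound it. The following is a sketch using `functools.lru_cache`; the names `cached_triu_indices` and `triangalize_cached` and the `maxsize` value are illustrative assumptions, not part of the answer above:

```python
from functools import lru_cache

import numpy as np

@lru_cache(maxsize=32)  # arbitrary illustrative bound on distinct lengths kept
def cached_triu_indices(n):
    # Each cached entry holds two index arrays of n * (n - 1) / 2 integers.
    return np.triu_indices(n, 1)

def triangalize_cached(x):
    xs, ys = cached_triu_indices(len(x))
    return np.array([x[xs], x[ys]]).T

x = np.array(["a", "b", "c", "d"])
result = triangalize_cached(x)

# Rough memory held by one cache entry, in bytes.
xs, ys = cached_triu_indices(len(x))
entry_bytes = xs.nbytes + ys.nbytes
```

The cached index arrays are shared between calls, so they should be treated as read-only. Because the entry size grows quadratically with the array length, the concern about large x still applies.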

Answer 1: score 4, accepted, 2014-12-06 19:00:58
Answer 2: score 2, 2014-12-07 14:30:55
