Finding identical rows and columns in a numpy array

Question

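I have a boolean array of n x n elements, and I want to check whether any row is identical to another row. If there are identical rows, I want to check whether the corresponding columns are also identical.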

Here is an example:

import numpy as np

A = np.array([[0, 1, 0, 0, 0, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 0, 1],
              [1, 0, 1, 0, 1, 1],
              [1, 1, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 1]])

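I would like the program to find that the first and third rows are identical, and then check whether the first and third columns are also identical; in this case, they are.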

Answer 1

You can use np.array_equal():

for i in range(len(A)):  # generate pairs
    for j in range(i + 1, len(A)):
        if np.array_equal(A[i], A[j]):  # compare rows
            if np.array_equal(A[:, i], A[:, j]):  # compare columns
                print(i, j)

Or use itertools.combinations():

import itertools

for i, j in itertools.combinations(range(len(A)), 2):
    # compare rows first, then the corresponding columns
    if np.array_equal(A[i], A[j]) and np.array_equal(A[:, i], A[:, j]):
        print((i, j))
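
Either loop can be wrapped in a small helper that returns the matching pairs instead of printing them. This is only a minimal sketch, assuming a square array; the name `identical_row_col_pairs` is hypothetical and not part of the original answer:

```python
import itertools

import numpy as np


def identical_row_col_pairs(A):
    """Return (i, j) pairs with i < j where rows i, j and columns i, j are both identical.

    Hypothetical helper; assumes A is square so that row and column indices line up.
    """
    A = np.asarray(A)
    return [
        (i, j)
        for i, j in itertools.combinations(range(len(A)), 2)
        if np.array_equal(A[i], A[j]) and np.array_equal(A[:, i], A[:, j])
    ]


A = np.array([[0, 1, 0, 0, 0, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 0, 1],
              [1, 0, 1, 0, 1, 1],
              [1, 1, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 1]])
print(identical_row_col_pairs(A))  # [(0, 2)]
```

Returning a list makes the result easy to reuse, for example to drop the duplicated rows and columns afterwards.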

Answer 2

Start with the typical way of applying np.unique to a 2D array and have it return the unique pairs:

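def unique_pairs(arr):
    # View each row as one opaque void item so np.unique compares whole rows
    uview = np.ascontiguousarray(arr).view(np.dtype((np.void, arr.dtype.itemsize * arr.shape[1])))
    # ravel() keeps the input and the inverse index 1-D, which np.bincount requires
    uvals, uidx = np.unique(uview.ravel(), return_inverse=True)
    # Only rows that occur exactly twice are reported as pairs
    pos = np.where(np.bincount(uidx) == 2)[0]

    pairs = []
    for p in pos:
        pairs.append(np.where(uidx == p)[0])

    return np.array(pairs)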

Then we can do the following:

row_pairs = unique_pairs(A)
col_pairs = unique_pairs(A.T)

for pair in row_pairs:
    # keep a row pair only if the same index pair also appears among the column pairs
    if np.any(np.all(pair == col_pairs, axis=1)):
        print(pair)

>>> [0 2]

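Of course, there is still a lot of optimization to do, but the main point is to use np.unique. How efficient this method is compared with the others depends heavily on how you define a "small" array.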
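
On NumPy 1.13 or later, np.unique accepts an `axis` argument, so the void-view trick can probably be avoided. The following is a sketch under that assumption, not the answer's original method: each index gets one label for its row and one for its column, and indices that share both labels form a group. The names `row_labels` and `identical_row_col_groups` are hypothetical, and A is assumed to be square:

```python
import numpy as np


def row_labels(arr):
    """Label each row so that identical rows receive the same integer."""
    _, inverse = np.unique(arr, axis=0, return_inverse=True)
    return np.ravel(inverse)  # ravel() keeps the labels 1-D across NumPy versions


def identical_row_col_groups(A):
    """Return lists of indices whose rows AND columns are all identical (square A assumed)."""
    A = np.asarray(A)
    # Column k of A is row k of A.T, so the same labelling function covers both
    keys = np.stack([row_labels(A), row_labels(A.T)], axis=1)  # shape (n, 2)
    _, inverse, counts = np.unique(keys, axis=0, return_inverse=True, return_counts=True)
    inverse = np.ravel(inverse)
    return [np.flatnonzero(inverse == g).tolist() for g in np.flatnonzero(counts > 1)]


A = np.array([[0, 1, 0, 0, 0, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 0, 1],
              [1, 0, 1, 0, 1, 1],
              [1, 1, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 1]])
print(identical_row_col_groups(A))  # [[0, 2]]
```

Unlike the `bincount(...) == 2` filter, this version also reports groups of three or more matching indices.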

Answer 3

Since you say that performance is not critical, here is a brute-force solution that is not very numpythonic:

>>> n = len(A)
>>> for i1, row1 in enumerate(A):
...     offset = i1 + 1  # skip rows already compared
...     for i2, row2 in enumerate(A[offset:], start=offset):
...         if (row1 == row2).all() and (A.T[i1] == A.T[i2]).all():
...             print(i1, i2)
...
0 2

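This should be O(n^2) in the number of row pairs compared. I use the transposed array A.T to check that the columns are equal as well.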

Answer 4

For small arrays, another approach that does not rely on Python loops is NumPy broadcasting.

bool_array = np.logical_not(np.logical_xor(A[:,np.newaxis,:], A[np.newaxis,:,:])) # XNOR for comparison
matches_array = np.sum(bool_array, axis=2)  # count total matches for all elements in a row
row1, row2 = np.where(matches_array == A.shape[1]) # identical row = all elements in a row match
row1, row2 = row1[row2 > row1], row2[row2 > row1]  # filter self & duplicated comparisons
column_match = np.all(A[:,row1] == A[:,row2], axis=0)  # check if the corresponding columns are identical
for r1, r2, c in zip(row1, row2, column_match):
    print("Row %d and row %d : Column identical: %s" % (r1, r2, c))

As mentioned, this approach does not scale when A becomes large, because it needs O(n^3) memory during the computation (due to bool_array).
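
If the cubic intermediate is the main concern, one possible alternative for 0/1 data is to count agreements with matrix products, which keeps every intermediate at n x n. This is a different technique from the broadcasting above, sketched under the assumption that A is square and holds only 0/1 (or boolean) values; `equal_row_mask` is a hypothetical name:

```python
import numpy as np


def equal_row_mask(A):
    """Boolean mask where entry (i, j) is True when rows i and j are identical.

    Assumes the entries of A are 0/1 or booleans.
    """
    B = np.asarray(A).astype(np.int64)
    # B @ B.T counts positions where both rows hold 1;
    # (1 - B) @ (1 - B).T counts positions where both rows hold 0.
    agreements = B @ B.T + (1 - B) @ (1 - B).T
    return agreements == B.shape[1]


A = np.array([[0, 1, 0, 0, 0, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 0, 1],
              [1, 0, 1, 0, 1, 1],
              [1, 1, 1, 0, 0, 0],
              [0, 1, 0, 1, 0, 1]])

rows_equal = equal_row_mask(A)
cols_equal = equal_row_mask(A.T)  # columns of A are the rows of A.T
# Keep i < j only, and require both the row test and the column test to pass
i, j = np.nonzero(np.triu(rows_equal & cols_equal, k=1))
pairs = list(zip(i.tolist(), j.tolist()))
print(pairs)  # [(0, 2)]
```

The trade-off is that the matrix products still take roughly cubic time; only the memory footprint is reduced.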
