
Python: Store indices of non-zero unique rows after comparing each row with every other row in a matrix

For this matrix K=

 [[-1.  1.  0.]
 [ 0.  0.  0.]
 [ 0. -1.  1.]
 [ 0.  0.  0.]
 [ 0. -1.  1.]
 [ 0.  0.  0.]]

the task is to store the indices of the non-zero unique rows in an array (here the answer would be {0, 2}), so that

K[[0, 2], :]

is accessible for linear algebra operations. My attempt is:

import numpy as np

# K is assumed to be the 6x3 NumPy array shown above
myList = []
for i in range(len(K)):  # generate pairs (i, j) with i < j
    for j in range(i + 1, len(K)):  # travel down the remaining rows
        # K[i] and K[j] are 1-D rows, so np.any takes no axis argument
        both_nonzero = np.any(K[i] != 0) and np.any(K[j] != 0)
        if np.array_equal(K[i], K[j]) and both_nonzero:
            myList.append(K[i])
            print('indices of similar non-zero rows are\n', (i, j))
        elif not np.array_equal(K[i], K[j]) and both_nonzero:
            myList.append(K[i])
            print('indices of non-similar non-zero rows are\n', (i, j))
        else:
            continue

new_K = np.asarray(myList)
new_new_K = np.unique(new_K, axis=0)
print('Now K is \n', new_new_K)

The answer is:

    new_new_K = [[-1.  1.  0.]
                 [ 0. -1.  1.]]

Question 1: How can this be done in a Pythonic way? The approach above is a workaround that stores the rows themselves in a new matrix; storing their indices in an array would be preferable.

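You can use a simple for loop with enumerate for this. Seed a set of seen rows with the all-zero tuple, so that zero rows are skipped along with duplicates, and record the index of each row the first time it appears.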

import numpy as np

A = np.array([[-1,  1,  0],
              [ 0,  0,  0],
              [ 0, -1,  1],
              [ 0,  0,  0],
              [ 0, -1,  1],
              [ 0,  0,  0]])

seen = {(0, 0, 0)}
res = []

for idx, row in enumerate(map(tuple, A)):
    if row not in seen:
        res.append(idx)
        seen.add(row)

Result

print(A[res])

[[-1  1  0]
 [ 0 -1  1]]

Example #2

import numpy as np

A=np.array([[0, 1, 0, 0, 0, 1],
            [0, 0, 0, 1, 0, 1],
            [0, 1, 0, 0, 0, 1],
            [1, 0, 1, 0, 1, 1],
            [1, 1, 1, 0, 0, 0],
            [0, 1, 0, 1, 0, 1],
            [0, 0, 0, 0, 0, 0]])

seen={(0, )*6}

res = []

for idx, row in enumerate(map(tuple, A)):
    if row not in seen:
        res.append(idx)
        seen.add(row)

print(A[res])

# [[0 1 0 0 0 1]
#  [0 0 0 1 0 1]
#  [1 0 1 0 1 1]
#  [1 1 1 0 0 0]
#  [0 1 0 1 0 1]]
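The two examples differ only in the width of the hard-coded zero tuple. A minimal sketch of a helper that derives that tuple from the array's shape might look like this; the name first_nonzero_unique_rows is made up for illustration and is not part of NumPy.

```python
import numpy as np

def first_nonzero_unique_rows(a):
    """Return indices of the first occurrence of each distinct non-zero row."""
    a = np.asarray(a)
    # Pre-seed the set with the all-zero row so zero rows are never recorded
    seen = {(0,) * a.shape[1]}
    res = []
    for idx, row in enumerate(map(tuple, a)):
        if row not in seen:
            res.append(idx)
            seen.add(row)
    return res

# The matrix K from the question
K = np.array([[-1.,  1.,  0.],
              [ 0.,  0.,  0.],
              [ 0., -1.,  1.],
              [ 0.,  0.,  0.],
              [ 0., -1.,  1.],
              [ 0.,  0.,  0.]])

rows = first_nonzero_unique_rows(K)
print(rows)        # [0, 2]
print(K[rows, :])  # the two non-zero unique rows, in original order
```

Because the indices are collected in iteration order, they come back already sorted, which keeps K[rows, :] in the same row order as K.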

You can use np.unique with its axis parameter to get the index of the first occurrence of each unique row, and then filter out the one index whose corresponding row is all zeros, like so -

import numpy as np

def unq_row_indices_wozeros(a):
    # Get unique rows (sorted) and the indices of their first occurrences
    unq, idx = np.unique(a, axis=0, return_index=True)

    # Filter out the index whose corresponding row is all zeros
    return idx[(unq != 0).any(axis=1)]

Sample run -

In [53]: # Setup input array with few all zero rows and duplicates
    ...: np.random.seed(0)
    ...: a = np.random.randint(0,9,(10,3))
    ...: a[[2,5,7]] = 0
    ...: a[4] = a[1]
    ...: a[8] = a[3]

In [54]: a
Out[54]: 
array([[5, 0, 3],
       [3, 7, 3],
       [0, 0, 0],
       [7, 6, 8],
       [3, 7, 3],
       [0, 0, 0],
       [1, 5, 8],
       [0, 0, 0],
       [7, 6, 8],
       [2, 3, 8]])

In [55]: unq_row_indices_wozeros(a)
Out[55]: array([6, 9, 1, 0, 3])

# Sort those indices if needed
In [56]: np.sort(unq_row_indices_wozeros(a))
Out[56]: array([0, 1, 3, 6, 9])
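As a cross-check, the same steps can be applied to the matrix K from the question. This is only a sketch, and it assumes NumPy 1.13 or later, where np.unique accepts the axis argument; the variable name keep is chosen here for illustration.

```python
import numpy as np

# The matrix K from the question
K = np.array([[-1.,  1.,  0.],
              [ 0.,  0.,  0.],
              [ 0., -1.,  1.],
              [ 0.,  0.,  0.],
              [ 0., -1.,  1.],
              [ 0.,  0.,  0.]])

# Unique rows come back in sorted order, with first-occurrence indices
unq, idx = np.unique(K, axis=0, return_index=True)

# Drop the all-zero row, then sort to restore the original row order
keep = np.sort(idx[(unq != 0).any(axis=1)])

print(keep)        # [0 2]
print(K[keep, :])  # rows 0 and 2 of K
```

The sorted indices agree with what the loop-based answer produces, since both approaches record the first occurrence of each distinct row.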
