Find unique pairs of array with Python

I'm looking for a Pythonic way to do this operation faster:

import numpy as np
von_knoten = np.array([0, 0, 1, 1, 1, 2, 2, 2, 3, 4])
zu_knoten =  np.array([1, 2, 0, 2, 3, 0, 1, 4, 1, 2])
try:
    for i in range(0,len(von_knoten)-1):
        for j in range(0,len(von_knoten)-1):
            if (i != j) & ([von_knoten[i],zu_knoten[i]] == [zu_knoten[j],von_knoten[j]]):
                    print(str(i)+".column equal " +str(j)+".column")
                    von_knoten = sp.delete(von_knoten , j)
                    zu_knoten = sp.delete(zu_knoten , j)
                    print(von_knoten)
                    print(zu_knoten)
except:
    print('end')

So I need the fastest way to get

[0 0 1 1 4]
[1 2 2 3 2]

from

[0 0 1 1 1 2 2 2 3 4]
[1 2 0 2 3 0 1 4 1 2]

Thanks ;)

Some comments about your code: as written, it does not do what you want. It should at least print something, so did you try running it? Could you show us the output you get?

  • First, simply use range(len(von_knoten)); range starts at 0 by default and stops one step before its argument, so the extra -1 makes both loops skip the last element.

  • If you delete items from the input arrays while still looping over their original length, you will eventually index past the end and get an IndexError before the inputs have been fully analysed.

  • You call sp.delete, but neither we nor the code know what sp is (only numpy is imported, as np), so with the code as shown this raises a NameError.

  • Lastly, please do not use a bare except: . It catches exceptions you never anticipated, which may explain why you can't see what is going wrong.
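Putting those points together, a corrected version of the double loop might look like the sketch below. It assumes the intent is to keep the first occurrence of each unordered pair and to treat an exact repeat as a duplicate too; the function name drop_reversed_pairs and the keep mask are illustrative names, not something from the question. Duplicates are marked rather than deleted during iteration, so indices stay valid and no try/except is needed.

```python
import numpy as np

def drop_reversed_pairs(von_knoten, zu_knoten):
    """Keep the first occurrence of each unordered pair (still O(n^2))."""
    n = len(von_knoten)
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        if not keep[i]:
            continue  # i was already marked as a duplicate of an earlier pair
        for j in range(i + 1, n):
            # j duplicates i if it is the same pair or the reversed pair
            same = von_knoten[i] == von_knoten[j] and zu_knoten[i] == zu_knoten[j]
            flipped = von_knoten[i] == zu_knoten[j] and zu_knoten[i] == von_knoten[j]
            if same or flipped:
                keep[j] = False
    return von_knoten[keep], zu_knoten[keep]

von_knoten = np.array([0, 0, 1, 1, 1, 2, 2, 2, 3, 4])
zu_knoten = np.array([1, 2, 0, 2, 3, 0, 1, 4, 1, 2])
von_u, zu_u = drop_reversed_pairs(von_knoten, zu_knoten)
print(von_u)  # [0 0 1 1 2]
print(zu_u)   # [1 2 2 3 4]
```

This is still quadratic, so it only fixes correctness, not speed. It also keeps (2, 4) where the question's expected output lists (4, 2); both stand for the same unordered pair.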


Then, what about using the built-in zip function to build sorted 2-tuples and remove the duplicates? Something like:

>>> von_knoten = [0, 0, 1, 1, 1, 2, 2, 2, 3, 4]
>>> zu_knoten =  [1, 2, 0, 2, 3, 0, 1, 4, 1, 2]
>>> set(tuple(sorted([m, n])) for m, n in zip(von_knoten, zu_knoten))
{(0, 1), (0, 2), (1, 2), (1, 3), (2, 4)}

I'll let you work from this to obtain the exact output you're looking for.
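One possible way to finish that step, as a sketch: sort the set (sets have no guaranteed order) and unzip it into two lists. The names pairs, von_unique and zu_unique are made up for this example.

```python
von_knoten = [0, 0, 1, 1, 1, 2, 2, 2, 3, 4]
zu_knoten = [1, 2, 0, 2, 3, 0, 1, 4, 1, 2]

# Normalise each pair so the smaller node comes first, then deduplicate.
pairs = {tuple(sorted((m, n))) for m, n in zip(von_knoten, zu_knoten)}

# Sets are unordered, so sort for a reproducible result before unzipping.
von_unique, zu_unique = (list(col) for col in zip(*sorted(pairs)))
print(von_unique)  # [0, 0, 1, 1, 2]
print(zu_unique)   # [1, 2, 2, 3, 4]
```

Because every pair is sorted, the last pair comes out as (2, 4) rather than the (4, 2) shown in the question; as an unordered pair it is the same edge.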

You are trying to build up a collection of pairs you haven't seen before. You can use not in, but you need to check the pair in both orders:

L = []
for x, y in zip(von_knoten, zu_knoten):
    if (x, y) not in L and (y, x) not in L:
        L.append((x, y))

This gives a list of tuples

[(0, 1), (0, 2), (1, 2), (1, 3), (2, 4)]

which you can reshape into the two output arrays.
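A sketch of that reshape, with one optional tweak: membership tests on a list are linear, so this version tracks seen pairs in a set of frozenset keys instead. The names seen, key, von_unique and zu_unique are assumptions for illustration only.

```python
import numpy as np

von_knoten = np.array([0, 0, 1, 1, 1, 2, 2, 2, 3, 4])
zu_knoten = np.array([1, 2, 0, 2, 3, 0, 1, 4, 1, 2])

seen = set()   # order-insensitive keys already encountered
L = []         # first occurrence of each pair, in input order
for x, y in zip(von_knoten.tolist(), zu_knoten.tolist()):
    key = frozenset((x, y))  # frozenset({x, y}) == frozenset({y, x})
    if key not in seen:
        seen.add(key)
        L.append((x, y))

# "Reshape": an (n, 2) array transposed gives one row per output array.
von_unique, zu_unique = np.array(L).T
print(von_unique)  # [0 0 1 1 2]
print(zu_unique)   # [1 2 2 3 4]
```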

Here's a vectorized approach -

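def unique_pairs(von_knoten, zu_knoten):
    # Encode each pair as one integer; s exceeds every node id
    s = np.max([von_knoten, zu_knoten])+1
    p1 = zu_knoten*s + von_knoten
    p2 = von_knoten*s + zu_knoten
    # Taking the larger encoding makes (a, b) and (b, a) share a key
    p = np.maximum(p1,p2)
    # Stable sort, so equal keys keep their input order
    sidx = p.argsort(kind='mergesort')
    ps = p[sidx]
    # True at the first element of each run of equal keys
    m = np.concatenate(([True],ps[1:] != ps[:-1]))
    sm = sidx[m]
    return von_knoten[sm],zu_knoten[sm]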

Sample run -

In [417]: von_knoten = np.array([0, 0, 1, 1, 1, 2, 2, 2, 3, 4])
     ...: zu_knoten =  np.array([1, 2, 0, 2, 3, 0, 1, 4, 1, 2])

In [418]: unique_pairs(von_knoten, zu_knoten)
Out[418]: (array([0, 0, 1, 1, 2]), array([1, 2, 2, 3, 4]))
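To see why the function above treats (a, b) and (b, a) as the same pair, it may help to print the integer keys it builds. The snippet below only recomputes that intermediate step for the sample input; the variable name keys is introduced here for illustration.

```python
import numpy as np

von_knoten = np.array([0, 0, 1, 1, 1, 2, 2, 2, 3, 4])
zu_knoten = np.array([1, 2, 0, 2, 3, 0, 1, 4, 1, 2])

s = max(von_knoten.max(), zu_knoten.max()) + 1   # 5 for this input
# Both orderings of a pair are encoded; keeping the larger one makes
# the key independent of the order, e.g. (0, 1) and (1, 0) both give 5.
keys = np.maximum(zu_knoten * s + von_knoten, von_knoten * s + zu_knoten)
print(keys)  # [ 5 10  5 11 16 10 11 22 16 22]
```

Only five distinct keys remain, one per unordered pair. One caveat with this encoding: the keys grow roughly like s squared, so for very large node ids the integer dtype has to be wide enough to hold them.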

Using np.unique and the void view method (each row is viewed as a single np.void element, so np.unique compares whole rows) -

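def unique_pairs(a, b):
    # Sort within each row so (a, b) and (b, a) become identical
    c = np.sort(np.stack([a, b], axis = 1), axis = 1)
    # View every row as one opaque void item spanning both columns
    c_view = np.ascontiguousarray(c).view(np.dtype((np.void,
                                          c.dtype.itemsize * c.shape[1])))
    # Index of the first occurrence of each distinct row
    _, i = np.unique(c_view, return_index = True)
    return a[i], b[i]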
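On NumPy 1.13 or later, np.unique accepts an axis argument, which may make the void view unnecessary. The variant below is a sketch along those lines, not part of the answer above; the name unique_pairs_axis is made up, and the explicit sort of the indices is an added choice so that the output follows the input order.

```python
import numpy as np

def unique_pairs_axis(a, b):
    # Sort within each row so (a, b) and (b, a) become identical rows.
    c = np.sort(np.stack([a, b], axis=1), axis=1)
    # axis=0 compares whole rows; return_index gives first occurrences.
    _, i = np.unique(c, axis=0, return_index=True)
    i.sort()  # restore input order of the kept pairs
    return a[i], b[i]

von_knoten = np.array([0, 0, 1, 1, 1, 2, 2, 2, 3, 4])
zu_knoten = np.array([1, 2, 0, 2, 3, 0, 1, 4, 1, 2])
von_u, zu_u = unique_pairs_axis(von_knoten, zu_knoten)
print(von_u)  # [0 0 1 1 2]
print(zu_u)   # [1 2 2 3 4]
```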
