
Remove duplicates from 2D lists regardless of order

I have a 2D list:

a = [[1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 2]]

How can I get the result:

result = [[1,2],[1,3],[2,3]]

Duplicates should be removed regardless of the order of the elements within the inner lists.

Try using a set to keep track of what lists you have seen:

from collections import Counter

a = [[1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 2], [1, 2, 1]]

seen = set()
result = []
for lst in a:
    current = frozenset(Counter(lst).items())
    if current not in seen:
        result.append(lst)
        seen.add(current)

print(result)

Which outputs:

[[1, 2], [1, 3], [2, 3], [1, 2, 1]]

Note: Since lists are not hashable, you can store a frozenset of each list's Counter items to detect duplicates regardless of order. This removes the need to sort at all.
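A small sketch of why the key is built from Counter items rather than from a plain frozenset of the list: a plain frozenset discards repeated values, so [1, 2] and [1, 2, 1] would collapse to the same key. The helper name `multiset_key` is hypothetical and only used for illustration.

```python
from collections import Counter

def multiset_key(lst):
    """Hashable, order-independent key that still tracks how often each value occurs."""
    return frozenset(Counter(lst).items())

# A plain frozenset ignores the repeated 1, so these two lists look identical.
print(frozenset([1, 2]) == frozenset([1, 2, 1]))        # True

# The Counter-based key keeps the counts, so they stay distinct.
print(multiset_key([1, 2]) == multiset_key([1, 2, 1]))  # False

# Reordering alone does not change the key.
print(multiset_key([1, 2]) == multiset_key([2, 1]))     # True
```

If repeated values inside an inner list should not matter for your data, a plain frozenset(lst) key would be enough.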

In [3]: b = []
In [4]: for aa in a:
   ...:     if not any(set(aa) == set(bb) for bb in b if len(aa) == len(bb)):
   ...:         b.append(aa)
In [5]: b
Out[5]: [[1, 2], [1, 3], [2, 3]]

While I like @RoadRunner's frozenset idea (sets are useful, and they let you find unique elements without reinventing the wheel or trying to outsmart the people who developed Python), you could also try something like this, where you just remove the reversed sub-list for each element. The downside is that it could be expensive if you have a lot of non-duplicates, because each failed remove call scans the whole list:

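a = [[1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 2]]

result = a.copy()
for x in result:
    try:
        # Remove the first sub-list equal to x reversed, if there is one.
        result.remove(x[::-1])
    except ValueError:
        # No reversed counterpart is left in the list.
        pass

print(result)
# [[1, 2], [1, 3], [2, 3]]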

This works on sub-lists of any length, but it only detects exact reversals: [1, 2, 3] and [2, 1, 3] would both be kept.
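If the cost of repeated list removals is a concern, here is a rough sketch of the same reversal check done with a set of tuples, so each lookup is constant time on average and the list is not modified while it is being iterated. The function name `drop_reversed` is hypothetical, and unlike the loop above this version also drops exact repeats of a sub-list.

```python
def drop_reversed(lists):
    """Keep the first occurrence of each sub-list, skipping later exact or reversed copies."""
    seen = set()
    result = []
    for lst in lists:
        key = tuple(lst)  # tuples are hashable, lists are not
        if key in seen or key[::-1] in seen:
            continue
        seen.add(key)
        result.append(lst)
    return result

a = [[1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 2]]
print(drop_reversed(a))  # [[1, 2], [1, 3], [2, 3]]

# Only exact reversals are detected; other reorderings are kept.
print(drop_reversed([[1, 2, 3], [3, 2, 1], [2, 1, 3]]))  # [[1, 2, 3], [2, 1, 3]]
```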

The set concept would come in handy here. The list you have (which contains duplicates) can be converted to a set (which never contains duplicates).

Example :

l = ['foo', 'foo', 'bar', 'hello']

A set can be created directly:

s = set(l)

Now, if you check the contents of the set:

print(s)
# {'foo', 'bar', 'hello'}  (element order may vary)

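set() works this way on any iterable whose elements are hashable. The inner lists in the question are not hashable, so they would first need to be converted to something that is, such as tuples or frozensets. Hope it helps!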
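To apply this to the 2D list from the question, one option is to turn each inner list into a sorted tuple first, since calling set(a) directly raises a TypeError for unhashable inner lists. A minimal sketch, assuming the elements are mutually comparable and that it is acceptable for the result to come back sorted rather than in the original order:

```python
a = [[1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 2]]

# Sorted tuples are hashable and identical for any ordering of the same elements.
unique = {tuple(sorted(inner)) for inner in a}

# Convert back to lists and sort the outer list so the output is deterministic.
result = sorted(list(t) for t in unique)
print(result)  # [[1, 2], [1, 3], [2, 3]]
```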
