Cleanest way to remove common list elements across multiple lists in python

I have n lists of numbers. I want to make sure that each list contains only elements that are unique to that particular list, i.e. no element is shared with any of the other lists.
This is really easy to do with two lists, but a little trickier with n lists.

e.g.
mylist = [
[1, 2, 3, 4],
[2, 5, 6, 7],
[4, 2, 8, 9]
]

becomes:

mylist = [
[1, 3],
[5, 6, 7],
[8, 9]
]

from collections import Counter
from itertools import chain

mylist = [
    [1,2,3,4],
    [2,5,6,7,7],
    [4,2,8,9]
]

counts = Counter(chain(*map(set,mylist)))

[[i for i in sublist if counts[i]==1] for sublist in mylist]
#[[1, 3], [5, 6, 7, 7], [8, 9]]

This does it in linear time, in two passes over the data. I'm assuming you want to preserve duplicates within a list; if not, this can be simplified a bit:

>>> import collections, itertools
>>> counts = collections.defaultdict(int)
>>> for i in itertools.chain.from_iterable(set(l) for l in mylist):
...     counts[i] += 1
... 
>>> for l in mylist:
...     l[:] = (i for i in l if counts[i] == 1)
... 
>>> mylist
[[1, 3], [5, 6, 7], [8, 9]]
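
If duplicates within a list should be dropped as well, one possible simplification is sketched below. It is only an illustration, not necessarily what the answer above had in mind: each sublist is deduplicated with `dict.fromkeys` (which keeps first-seen order on Python 3.7+) before counting. The function name `unique_to_each` is made up for this example.

```python
from collections import Counter
from itertools import chain


def unique_to_each(lists):
    # Hypothetical helper name, not from the original answer.
    # Deduplicate each sublist, keeping first-seen order.
    deduped = [list(dict.fromkeys(sub)) for sub in lists]
    # Each value now counts once per sublist that contains it.
    counts = Counter(chain.from_iterable(deduped))
    # Keep only the values that occur in exactly one sublist.
    return [[x for x in sub if counts[x] == 1] for sub in deduped]


print(unique_to_each([[1, 2, 3, 4], [2, 5, 6, 7, 7], [4, 2, 8, 9]]))
# [[1, 3], [5, 6, 7], [8, 9]]
```

Unlike the in-place slice assignment above, this returns new lists and leaves the input untouched.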

Since you don't care about order, you can easily remove the shared elements using set subtraction and then convert back to a list. Here it is in a monster one-liner:

>>> mylist = [
... [1, 2, 3, 4],
... [2, 5, 6, 7],
... [4, 2, 8, 9]
... ]
>>> mynewlist = [list(set(thislist) - set(element for sublist in mylist for element in sublist if sublist is not thislist)) for thislist in mylist]
>>> mynewlist
[[1, 3], [5, 6, 7], [8, 9]]

Note: This is not very efficient, because the set of elements from all the other rows is rebuilt for each row. Whether this is a problem or not depends on your data size.
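
If that repeated work does matter, a possible alternative is to collect the shared elements once and reuse them for every row. This is a rough sketch under the same "order doesn't matter" assumption; the name `remove_shared` is invented here and is not part of the answer above.

```python
def remove_shared(lists):
    # Hypothetical helper name, not from the original answer.
    seen = set()    # values met in any earlier sublist
    shared = set()  # values met in two or more sublists
    for sub in lists:
        current = set(sub)
        shared |= seen & current  # anything seen before is shared
        seen |= current
    # Order inside each result list is not guaranteed (sets are unordered).
    return [list(set(sub) - shared) for sub in lists]


result = remove_shared([[1, 2, 3, 4], [2, 5, 6, 7], [4, 2, 8, 9]])
print([sorted(sub) for sub in result])
# [[1, 3], [5, 6, 7], [8, 9]]
```

Each sublist is turned into a set a fixed number of times, so the work no longer grows with the square of the number of rows.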

set() is the right approach, although you don't have to use a list comprehension.

Without additional imports:

>>> mylist = [
... [1, 2, 3, 4],
... [2, 5, 6, 7],
... [4, 2, 8, 9]
... ]
>>> result_list = []
>>> for test_list in mylist:
...     result_set = set(test_list)
...     for compare_list in mylist:
...         # compare by identity, so two lists with equal contents still cancel each other
...         if test_list is not compare_list:
...             result_set = result_set - set(compare_list)
...     result_list.append(result_set)
...
>>> result_list
[{1, 3}, {5, 6, 7}, {8, 9}]

This is my solution, using Counter to build a set of all the common numbers, and then it just does a set difference:

from collections import Counter

def disjoin(lsts):
    # Count each number once per list, so a repeat inside one list is not treated as shared.
    c = Counter(num for lst in lsts for num in set(lst))
    common = set(x for x, v in c.items() if v > 1)
    result = []
    for lst in lsts:
        result.append(set(lst) - common)
    return result

Example:

>>> disjoin(mylist)
[{1, 3}, {5, 6, 7}, {8, 9}]
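
To double-check any of the approaches on this page, a small sanity test along these lines may help. It is just a sketch; `all_disjoint` is a made-up helper that reports whether any value still appears in more than one group.

```python
from itertools import combinations


def all_disjoint(groups):
    # Hypothetical helper: True when no value appears in more than one group.
    return all(set(a).isdisjoint(b) for a, b in combinations(groups, 2))


print(all_disjoint([[1, 2, 3, 4], [2, 5, 6, 7], [4, 2, 8, 9]]))  # False
print(all_disjoint([{1, 3}, {5, 6, 7}, {8, 9}]))                 # True
```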
