
Comparing tuple values inside list of lists?

I have a list as follows:

mylist=[[(1, 1)], [(1, 1), (1, 2)], [(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]

Now, I want every element of this list to be compared with all the other elements, and if it is a subset of any element it is compared with, it should be removed. For example, [(1, 1)] is a subset of [(1, 1), (1, 2)], so [(1, 1)] should be removed from the list. Similarly, [(1, 1), (1, 2)] is a subset of [(1, 1), (1, 2), (1, 3)], so it should be removed as well.

And in this case, we get the output as follows:

[[(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]

I tried searching for all the possible answers but none was aimed at this particular case.

So far I have tried the following, but it was of little use:

for i, e in enumerate(mylist):
    mylist[i] = tuple(e)
mylist = list(set(mylist))

You need to remove any list from mylist where all the tuples in the list are present in another list in mylist. This is most easily done by building a new list:

newlist = []
for i, lst in enumerate(mylist):
    if not any(all(t in l for t in lst) for l in mylist[:i] + mylist[i+1:]):
        newlist.append(lst)

Or as a list comprehension:

newlist = [lst for i, lst in enumerate(mylist) if not any(all(t in l for t in lst) for l in mylist[:i] + mylist[i+1:])]

In both cases, for your sample data the output is:

[
 [(1, 1), (1, 2), (1, 3)],
 [(1, 1), (1, 2), (1, 4)]
]

For larger lists this might become slow, in which case you can speed it up by first mapping the entries in mylist to sets:

mylist = [[(1, 1), (1, 2)], [(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)], [(1, 1)]]
# Convert each sub-list to a set once, so issubset() can use hash lookups
mylist = list(map(set, mylist))
newlist = [list(lst) for i, lst in enumerate(mylist) if not any(lst.issubset(l) for l in mylist[:i] + mylist[i+1:])]
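
One detail of the set-based version is that list(lst) rebuilds each surviving sub-list from a set, so the order of its tuples is not guaranteed. A minimal sketch that builds the sets once and returns the original sub-lists untouched could look like this. The helper name remove_subsets is an assumption, not part of the answer above:

```python
def remove_subsets(lists):
    """Keep only the sub-lists whose items are not all contained in another sub-list.

    `remove_subsets` is a hypothetical helper name, not from the original answer.
    """
    # Build each set once instead of once per comparison.
    sets = [set(lst) for lst in lists]
    return [
        lst  # return the original list, so tuple order is preserved
        for i, lst in enumerate(lists)
        if not any(sets[i] <= other for j, other in enumerate(sets) if j != i)
    ]


mylist = [[(1, 1), (1, 2)], [(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)], [(1, 1)]]
print(remove_subsets(mylist))
# [[(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]
```

As with the code above, two sub-lists holding exactly the same tuples count as subsets of each other, so both are dropped.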

You can use frozenset.issubset for the comparison. Thanks to @Nick's suggestion, here is a more elaborate example:

mylist = [[(1, 1)], [(1, 1), (1, 2)], [(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]
out = []

for k, elm in enumerate(mylist):
    for elm2 in mylist[:k] + mylist[k + 1:]:
        if frozenset(elm).issubset(elm2):
            break
    else:
        out.append(elm)

print(out)

Output:

[[(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]
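
Two details make this loop work: frozenset.issubset accepts any iterable as its argument, so elm2 can stay a list, and the else branch of a for loop runs only when the loop finished without hitting break. A small sketch of both, with illustrative variable names that are not from the answer:

```python
small = [(1, 1)]
big = [(1, 1), (1, 2)]

# issubset() accepts any iterable, so `big` does not need converting first.
small_in_big = frozenset(small).issubset(big)
big_in_small = frozenset(big).issubset(small)
print(small_in_big, big_in_small)  # True False

# The operator form requires a set on both sides.
try:
    frozenset(small) <= big
    operator_ok = True
except TypeError:
    operator_ok = False
print(operator_ok)  # False

# for/else: the else block is skipped when the loop breaks.
found = None
for candidate in [big]:
    if frozenset(small).issubset(candidate):
        found = candidate
        break
else:
    found = "no superset"
print(found)  # [(1, 1), (1, 2)]
```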

Neither the solution from @Nick nor the one from @ChihebNexus is efficient.

The answer from @Nick requires a time complexity of O(m^2 × n^2), while @ChihebNexus's answer requires a time complexity of O(m^2 × n), where m is the length of the input list and n is the average length of the sub-lists.

For an approach that requires just a time complexity of O(m × n), you can create a dict that maps each tuple item to a set of the sub-lists the item appears in. The sub-lists need to be converted to tuples first, so that they are hashable and can be added to a set:

mapping = {}
for lst in mylist:
    for item in lst:
        mapping.setdefault(item, set()).add(tuple(lst))

so that with your sample input, mapping becomes:

{(1, 1): {((1, 1),),
          ((1, 1), (1, 2)),
          ((1, 1), (1, 2), (1, 3)),
          ((1, 1), (1, 2), (1, 4))},
 (1, 2): {((1, 1), (1, 2), (1, 3)), ((1, 1), (1, 2)), ((1, 1), (1, 2), (1, 4))},
 (1, 3): {((1, 1), (1, 2), (1, 3))},
 (1, 4): {((1, 1), (1, 2), (1, 4))}}

With the mapping from items to the sub-lists that contain them in place, we can iterate through the sub-lists again and take the intersection of the sets that the items of the current sub-list map to. This intersection holds every sub-list that contains all the items of the current sub-list, including the current sub-list itself. If it holds more than one sub-list, the current sub-list is a subset of another one, and we can drop it from the result by removing it from all the sets its items map to. The sub-lists that survive this process are the ones we want in the output, which we can obtain by aggregating the sets with a union operation:

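for lst in mylist:
    # More than one sub-list contains every item of lst, so lst is a subset of another one
    if len(set.intersection(*map(mapping.get, lst))) > 1:
        t = tuple(lst)
        for item in lst:
            # discard() instead of remove(), so a repeated sub-list or item cannot raise KeyError
            mapping[item].discard(t)
print(set.union(*mapping.values()))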

This outputs:

{((1, 1), (1, 2), (1, 3)), ((1, 1), (1, 2), (1, 4))}
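
To see what the intersection step computes for a single sub-list, the following sketch rebuilds mapping for the sample input and inspects two sub-lists before anything is removed. The names current and containers are illustrative only:

```python
mylist = [[(1, 1)], [(1, 1), (1, 2)], [(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]

# Same mapping as above: item -> set of sub-lists (as tuples) containing it.
mapping = {}
for lst in mylist:
    for item in lst:
        mapping.setdefault(item, set()).add(tuple(lst))

# [(1, 1), (1, 2)] is contained in itself and in both three-item sub-lists.
current = [(1, 1), (1, 2)]
containers = set.intersection(*(mapping[item] for item in current))
print(len(containers))  # 3, so this sub-list would be removed

# [(1, 1), (1, 2), (1, 3)] is contained only in itself.
current = [(1, 1), (1, 2), (1, 3)]
only_self = set.intersection(*(mapping[item] for item in current))
print(only_self)  # {((1, 1), (1, 2), (1, 3))}, so this sub-list is kept
```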

You can convert the resulting set of tuples back to a list of lists if you need the exact data types from the question:

list(map(list, set.union(*mapping.values())))

which returns:

[[(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]
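
Because a set does not keep insertion order, the result above may list the surviving sub-lists in any order. A variant sketch of the same idea stores sub-list positions instead of tuples, so it preserves the input order and keeps the first copy of exact duplicates. The function name drop_subsets and the tie-breaking rule for duplicates are assumptions, not part of the answer:

```python
def drop_subsets(lists):
    """Return the sub-lists that are not contained in another sub-list.

    Hypothetical helper: keeps input order and the first copy of duplicates.
    """
    sets = [set(lst) for lst in lists]

    # Inverted index: item -> positions of the sub-lists that contain it.
    index = {}
    for pos, items in enumerate(sets):
        for item in items:
            index.setdefault(item, set()).add(pos)

    kept = []
    for pos, items in enumerate(sets):
        if items:
            # Positions of every sub-list holding all items of this one.
            containers = set.intersection(*(index[item] for item in items))
        else:
            # An empty sub-list is contained in every sub-list.
            containers = set(range(len(sets)))
        containers.discard(pos)

        # Drop if a strictly larger container exists, or an earlier duplicate.
        dominated = any(len(sets[j]) > len(items) or j < pos for j in containers)
        if not dominated:
            kept.append(lists[pos])
    return kept


mylist = [[(1, 1)], [(1, 1), (1, 2)], [(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]
print(drop_subsets(mylist))
# [[(1, 1), (1, 2), (1, 3)], [(1, 1), (1, 2), (1, 4)]]
```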
