
How to remove sublists of a nested list that are contained in another sublist?

I have a list:

a = [[2,3,4],[2,3,4],[2,3],[1,5,4],[1,5]]

I want to get:

b = [[2,3,4],[1,5,4]]

[2,3,4] is duplicated, and [2,3] and [1,5] are completely contained in [2,3,4] and [1,5,4] respectively, so I want to remove them.

I use set(frozenset(x) for x in a) to remove the duplicates, but I am stuck on how to remove [2,3] and [1,5], which are contained in other sublists of a.
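For reference, a small sketch of what that attempt produces on the list above. The variable name deduplicated is only illustrative. The exact duplicate disappears, but the smaller sublists survive because a set only compares its members for equality, not for containment:

```python
a = [[2, 3, 4], [2, 3, 4], [2, 3], [1, 5, 4], [1, 5]]

# frozenset is hashable, so equal sublists collapse into a single member.
deduplicated = set(frozenset(x) for x in a)

# Four distinct sets remain: the subsets {2, 3} and {1, 5} are still present.
remaining = len(deduplicated)
still_has_subsets = frozenset([2, 3]) in deduplicated and frozenset([1, 5]) in deduplicated
```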

You can convert the sub-lists in a to sets and sort them by length in reverse order, so that you can iterate through them and add each set to the output only if it is not a subset of any of the existing sets in the output:

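output = []
# Visit the largest sets first, so any superset is already in output
# by the time one of its subsets is considered.
for candidate in sorted(map(set, a), key=len, reverse=True):
    # <= is the subset test; an exact duplicate is a subset too, so it is skipped.
    if not any(candidate <= incumbent for incumbent in output):
        output.append(candidate)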

list(map(list, output)) returns:

[[2, 3, 4], [1, 4, 5]]
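The same loop can be wrapped in a function so that it runs on its own. This is only a sketch: the name drop_contained_sets is made up for illustration, and the elements of each returned sublist are sorted here because a set does not remember the original order:

```python
def drop_contained_sets(sublists):
    """Keep only the sublists that are not a subset of another kept sublist."""
    output = []
    # Largest first, so every possible superset is seen before its subsets.
    for candidate in sorted(map(set, sublists), key=len, reverse=True):
        if not any(candidate <= incumbent for incumbent in output):
            output.append(candidate)
    # Sorting the elements is an assumption made to get a deterministic result.
    return [sorted(s) for s in output]


a = [[2, 3, 4], [2, 3, 4], [2, 3], [1, 5, 4], [1, 5]]
result = drop_contained_sets(a)
```

Each candidate is compared against every set kept so far, so the cost grows roughly quadratically with the number of sublists, which is fine for small inputs like the one in the question.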

Sets are unordered, however, so if the original item orders in the sub-lists are important, you can take advantage of the fact that dict keys are ordered since Python 3.7 and map the sub-lists to dict keys instead:

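output = []
# dict.fromkeys(sublist) keeps each element once, in its original order.
for candidate in sorted(map(dict.fromkeys, a), key=len, reverse=True):
    # Key views are set-like, so <= works as a subset test here as well.
    if not any(candidate.keys() <= incumbent.keys() for incumbent in output):
        output.append(candidate)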

so that list(map(list, output)) returns:

[[2, 3, 4], [1, 5, 4]]
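Note that the surviving sublists come out ordered by length, not by where they appeared in a. If the original positions matter as well (an assumption, since the question does not ask for it), one possible sketch is to carry each sublist's index along and sort by it at the end. The function name and the sample input are illustrative only:

```python
def drop_contained_keep_positions(sublists):
    """Drop duplicates and contained sublists, keeping survivors in input order."""
    # Pair every sublist with its position before sorting by length.
    indexed = sorted(
        ((i, dict.fromkeys(s)) for i, s in enumerate(sublists)),
        key=lambda pair: len(pair[1]),
        reverse=True,
    )
    kept = []
    for index, candidate in indexed:
        if not any(candidate.keys() <= incumbent.keys() for _, incumbent in kept):
            kept.append((index, candidate))
    # Restore the order in which the survivors appeared in the input.
    kept.sort(key=lambda pair: pair[0])
    return [list(candidate) for _, candidate in kept]


result = drop_contained_keep_positions([[1, 5], [9], [2, 3, 4], [2, 3]])
```

Because sorted is stable, the first of several identical sublists is the one that is kept.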

If you're using Python 3.6 or an earlier version, where the order of dict keys is not guaranteed, you can use collections.OrderedDict in place of dict.
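A minimal sketch of that substitution, assuming the same list a as in the question. OrderedDict.fromkeys plays the role of dict.fromkeys, and its key views support the same subset comparison:

```python
from collections import OrderedDict

a = [[2, 3, 4], [2, 3, 4], [2, 3], [1, 5, 4], [1, 5]]

output = []
# OrderedDict remembers insertion order on every Python version that provides it.
for candidate in sorted(map(OrderedDict.fromkeys, a), key=len, reverse=True):
    if not any(candidate.keys() <= incumbent.keys() for incumbent in output):
        output.append(candidate)

result = list(map(list, output))
```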


 