Filtering out shorter sublists

I have a nested list:

[['spam', 'eggs'],
['spam', 'eggs', '111'],
['spam', 'eggs', 'foo'],
['spam', 'eggs', '111', 'bar'],
['spam', 'eggs', 'foo', 'bar']]

What I need is an algorithm that returns the indexes of the shorter sublists whose elements are all contained in a longer sublist. In this example the algorithm should return:

[0, 1, 2]

Any help would be appreciated!

You can convert each sublist to a set and use the set.issubset method. This will not work if your lists contain duplicate elements that you need to preserve.

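# x is the nested list from the question
sets = [set(sub) for sub in x]

# keep index i if its set is a subset of the set at some other index k
result = [i
 for i, e in enumerate(sets)
 if any(e.issubset(j) and i != k
        for k, j in enumerate(sets))
 ]

# result == [0, 1, 2]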
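If duplicates do matter, one possible workaround is to compare multisets instead of sets. The sketch below uses collections.Counter for that; the function name contained_indexes is made up for illustration and is not part of the answer above.

```python
from collections import Counter

def contained_indexes(lists):
    """Indexes of sublists that fit inside another sublist, counting duplicates.

    Hypothetical helper, not taken from the original answer.
    """
    counts = [Counter(sub) for sub in lists]
    result = []
    for i, small in enumerate(counts):
        for k, big in enumerate(counts):
            # Counter subtraction keeps only positive counts, so an empty
            # difference means every element of `small` occurs at least
            # as often in `big`.
            if i != k and not (small - big):
                result.append(i)
                break
    return result

# ['spam', 'spam'] is a subset of ['spam', 'eggs'] as a set,
# but not as a multiset, so nothing is reported here.
print(contained_indexes([['spam', 'spam'], ['spam', 'eggs']]))  # []
```

On the list from the question, which has no repeated elements inside any sublist, this gives the same [0, 1, 2] as the set-based version.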

One way is to use a double for loop over the same list and check with .issubset whenever the two indexes differ:

my_list = [['spam', 'eggs'],
            ['spam', 'eggs', '111'],
            ['spam', 'eggs', 'foo'],
            ['spam', 'eggs', '111', 'bar'],
            ['spam', 'eggs', 'foo', 'bar']]

indexes = []
for index1, item1 in enumerate(my_list):
    for index2, item2 in enumerate(my_list):
        if index1 != index2:
            if set(item1).issubset(item2):
                indexes.append(index1)
                break

print(indexes)

Result:

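[0, 1, 2]

A list comprehension with the set < operator (proper subset) also works:

# build the sets once instead of once per index
sets = [set(j) for j in my_list]
out_index = [i for i in range(len(my_list))
             if any(set(my_list[i]) < m for m in sets)]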
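The < comparison and .issubset differ when two sublists hold exactly the same elements: < needs a strictly larger set, while .issubset also accepts an equal one. A small sketch of the difference follows; shorter_indexes and its strict flag are hypothetical names used only for this illustration.

```python
def shorter_indexes(lists, strict=True):
    """Indexes of sublists contained in another sublist (as sets).

    Hypothetical helper. strict=True uses proper subset (<);
    strict=False uses subset-or-equal (<=) against other indexes only.
    """
    sets = [set(sub) for sub in lists]
    if strict:
        # A set is never a proper subset of itself, so no index check is needed.
        return [i for i, s in enumerate(sets)
                if any(s < other for other in sets)]
    return [i for i, s in enumerate(sets)
            if any(s <= other for k, other in enumerate(sets) if k != i)]

data = [['spam', 'eggs'], ['eggs', 'spam'], ['foo']]
print(shorter_indexes(data))                # []
print(shorter_indexes(data, strict=False))  # [0, 1]
```

Which behaviour is wanted depends on whether identical sublists should count as "contained" in each other; the question's example has none, so both variants return [0, 1, 2] there.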
