How to count the number of sublists based on common elements from a nested list in Python?
I have a list of lists like this:
[[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
How can I count the lists that are sublists of more than two lists? For example, here [2, 3] and [3, 4] are sublists of the first 3 lists. I want to get rid of them.
This comprehension should do it:
data = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
solution = [i for i in data if sum([1 for j in data if set(i).issubset(set(j))]) < 3]
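A variant of the comprehension above, as a minimal sketch: it converts each list to a set once instead of on every comparison. The names `as_sets` and `containers` are illustrative and not part of the original answer. Note that every list counts as a subset of itself, which is why the threshold is 3 rather than 2.

```python
data = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]

# Build each set once instead of rebuilding it for every pair of lists.
as_sets = [set(lst) for lst in data]

def containers(s):
    """Count the lists in data that contain every element of s (s itself included)."""
    return sum(s <= other for other in as_sets)

# Keep only the lists that are contained in fewer than three lists.
solution = [lst for lst, s in zip(data, as_sets) if containers(s) < 3]
print(solution)
```

Like the original, this compares sets, so element order and contiguity are ignored.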
set_list = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
check_list = [[2, 3], [3, 4]]
sublist_to_list = {}
for outer in set_list:  # named "outer" to avoid shadowing the built-in set
    for i, sublist in enumerate(check_list):
        # Count how many elements of sublist appear in outer.
        count = 0
        for element in sublist:
            if element in outer:
                count += 1
        # Every element was found, so record outer under this sublist's index.
        if count == len(sublist):
            if i not in sublist_to_list:
                sublist_to_list[i] = [outer]
            else:
                sublist_to_list[i].append(outer)
print(sublist_to_list)
Output:
{0: [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3]], 1: [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [3, 4]]}
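The nested loops above amount to a subset test, so the same mapping could be sketched more compactly with `set.issubset`. This is an illustrative rewrite, not the original answer's code. One difference to be aware of: here every index in `check_list` gets a key, even when no list contains it, whereas the loop only creates a key on the first match.

```python
set_list = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
check_list = [[2, 3], [3, 4]]

# Map each index in check_list to the lists in set_list that contain all of its elements.
sublist_to_list = {
    i: [outer for outer in set_list if set(sublist).issubset(outer)]
    for i, sublist in enumerate(check_list)
}
print(sublist_to_list)
```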
You can first write a function that yields the contiguous sublists of a list:
def sublists(lst):
    length = len(lst)
    for size in range(1, length + 1):
        for start in range(length - size + 1):
            yield lst[start:start+size]
Which works as follows:
>>> list(sublists([1, 2, 3, 4, 5]))
[[1], [2], [3], [4], [5], [1, 2], [2, 3], [3, 4], [4, 5], [1, 2, 3], [2, 3, 4], [3, 4, 5], [1, 2, 3, 4], [2, 3, 4, 5], [1, 2, 3, 4, 5]]
Then you can use this to collect, for each sublist, the indices of the lists that contain it into a collections.defaultdict:
from collections import defaultdict

lsts = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
d = defaultdict(list)
for i, lst in enumerate(lsts):
    # Record the index of every list in which each contiguous sublist occurs.
    # Sublists are converted to tuples because lists cannot be dict keys.
    for sub in sublists(lst):
        d[tuple(sub)].append(i)
This gives a dictionary with the sublists (as tuples) as keys and the list indices as values.
Then, to find the sublists that occur in more than two of the lists, check whether the set of indices has more than two elements (the set removes repeated indices when a sublist occurs more than once in the same list):
print([list(k) for k, v in d.items() if len(set(v)) > 2])
Which will give the following sublists:
[[2], [3], [4], [5], [2, 3], [3, 4], [4, 5], [2, 3, 4], [3, 4, 5], [2, 3, 4, 5]]
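To actually drop such lists from the input, as the question asks, the index mapping can be reused as a filter. The following is a minimal sketch under one assumption that is not in the original answer: a list is removed when it occurs as a contiguous sublist of more than two *other* lists, so its own index is excluded from the count. The name `filtered` is illustrative.

```python
from collections import defaultdict

def sublists(lst):
    """Yield every contiguous, non-empty slice of lst."""
    length = len(lst)
    for size in range(1, length + 1):
        for start in range(length - size + 1):
            yield lst[start:start + size]

lsts = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]

# Map each contiguous sublist (as a tuple) to the set of indices of lists containing it.
d = defaultdict(set)
for i, lst in enumerate(lsts):
    for sub in sublists(lst):
        d[tuple(sub)].add(i)

# Assumption: keep a list unless more than two OTHER lists contain it contiguously.
filtered = [lst for i, lst in enumerate(lsts) if len(d[tuple(lst)] - {i}) <= 2]
print(filtered)
```

Unlike the set-based answers earlier in the thread, this version requires the elements to appear contiguously and in the same order.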