Fastest way to check if a list is present in a list of lists
I have a list:
a=[[1,2,3,4,5,6],[7,8,9,10,11,12]]
What is the fastest way to check if any list in a is present in another list of lists b, where
b=[[5, 9, 25, 31, 33, 36],[7,8,9,10,11,12],[10, 13, 22, 24, 33, 44]]
If any list in a is present in b, I would like to remove it. I'm currently using this code:
for each in a:
    for item in b:
        if set(each).issubset(item):
            a.remove(each)
This works but is quite slow when working with large lists, so I was wondering if there's a better way. The above code gives me the following result:
print(a)
[[1, 2, 3, 4, 5, 6]]
I am not worried about order. For example, if a list in a is [1,2,3,4,5,6], I want it to be removed if there exists a list [1,2,3,4,5,6] or [3,4,1,6,2,5] etc. in list b.
Using a list comprehension with set. Ex:
a=[[1,2,3,4,5,6],[7,8,9,10,11,12]]
b=[[5, 9, 25, 31, 33, 36],[7,8,9,10,11,12],[10, 13, 22, 24, 33, 44]]
setA = set(map(tuple, a))
setB = set(map(tuple, b))
print([i for i in setA if i not in setB])
Output:
[(1, 2, 3, 4, 5, 6)]
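Because a set has no defined order, the comprehension above may return the surviving tuples in a different order than they had in a. If that matters, one possible variation (a sketch, not part of the original answer) keeps a as a list and only turns b into a set of tuples. The names seen_in_b and result are illustrative assumptions.

```python
a = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
b = [[5, 9, 25, 31, 33, 36], [7, 8, 9, 10, 11, 12], [10, 13, 22, 24, 33, 44]]

# Hypothetical name: a set of tuples gives O(1) average-case membership tests.
seen_in_b = set(map(tuple, b))

# Iterate over a itself so the original order and the list type are preserved.
result = [sub for sub in a if tuple(sub) not in seen_in_b]
print(result)  # [[1, 2, 3, 4, 5, 6]]
```

Note that tuple comparison is order-sensitive, so (1, 2, 3) and (3, 2, 1) count as different sublists here.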
A functional solution is possible using set.difference:
res = set(map(tuple, a)).difference(set(map(tuple, b)))
print(list(res))  # [(1, 2, 3, 4, 5, 6)]
Explanation
list is not a hashable type, so we convert the sublists to tuple, which is immutable and hashable, e.g. set(map(tuple, a)).
Then use set.difference to take the difference between the two resulting sets.
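Tuples compare element by element, so the tuple-based approach treats [1,2,3,4,5,6] and [3,4,1,6,2,5] as different. If order should be ignored, as the question asks, a similar sketch with frozenset instead of tuple might work. It assumes that "present" means "has exactly the same elements" (not merely a subset) and that duplicates inside a sublist do not matter. The names sets_in_b and remaining, and the sample data, are illustrative.

```python
a = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12], [1, 2, 3]]
b = [[5, 9, 25, 31, 33, 36], [12, 11, 10, 9, 8, 7], [3, 4, 1, 6, 2, 5]]

# frozenset is hashable and ignores element order (assumption: duplicates are irrelevant).
sets_in_b = {frozenset(sub) for sub in b}

# Keep only the sublists of a whose element set does not occur in b.
remaining = [sub for sub in a if frozenset(sub) not in sets_in_b]
print(remaining)  # [[1, 2, 3]]
```

Here [1, 2, 3] survives because it is only a subset of a list in b, not equal to one as a set. If subsets should also be removed, a subset test against each element of b is needed instead of a hash lookup.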
If you don't care about element order and frequencies, i.e. you treat lists as unordered sets, then your solution is probably almost correct (removing an element while iterating over the same list is probably not the best idea), but it has two serious inefficiencies.
First, you currently convert each element of b into a set once for every element of a. I wonder whether the Python compiler can optimize the repeated work away; at the very least, you could do it yourself.
Next, you don't need to remove elements (incorrectly, and in quadratic time) just to filter them out.
faster_b = [frozenset(x) for x in b]  # convert each sublist of b only once

def not_in_b(lst):
    s = frozenset(lst)
    for x in faster_b:
        if s <= x:  # s is a subset of x
            return False
    return True

print(list(filter(not_in_b, a)))
This one is probably faster.
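One way to check the "probably faster" claim on your own data is to time both variants with timeit. The sketch below is only an illustration: the synthetic data, the sizes and the two function names are assumptions, and the first function mimics the subset test from the question without mutating a.

```python
import random
import timeit

random.seed(0)  # fixed seed so runs are repeatable
# Synthetic data (an assumption): sublists of 6 distinct numbers from 1..49.
a = [random.sample(range(1, 50), 6) for _ in range(200)]
b = [random.sample(range(1, 50), 6) for _ in range(500)]
b.extend(a[:20])  # make sure some sublists of a really occur in b

def filter_with_sets_rebuilt(a, b):
    # Rebuilds set(each) for every (each, item) pair, like the question's loop.
    return [each for each in a
            if not any(set(each).issubset(item) for item in b)]

def filter_with_frozensets(a, b):
    faster_b = [frozenset(x) for x in b]  # convert b once, up front
    result = []
    for each in a:
        s = frozenset(each)  # convert each sublist of a once
        if not any(s <= x for x in faster_b):
            result.append(each)
    return result

# Both variants should agree before their timings are compared.
assert filter_with_sets_rebuilt(a, b) == filter_with_frozensets(a, b)

for fn in (filter_with_sets_rebuilt, filter_with_frozensets):
    seconds = timeit.timeit(lambda: fn(a, b), number=5)
    print(f"{fn.__name__}: {seconds:.4f} s for 5 runs")
```

The absolute numbers depend on the machine and on the shape of the data, so it is worth rerunning this with lists that resemble the real input.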
$ python3
Python 3.6.5 (default, May 11 2018, 04:00:52)
[GCC 8.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a=[[1,2,3,4,5,6],[7,8,9,10,11,12]]
>>> b=[[5, 9, 25, 31, 33, 36],[7,8,9,10,11,12],[10, 13, 22, 24, 33, 44]]
>>> faster_b = [frozenset(x) for x in b]
>>>
>>> def not_in_b(lst):
...     s = frozenset(lst)
...     for x in faster_b:
...         if s <= x: return False
...     return True
...
>>> print(list(filter(not_in_b, a)))
[[1, 2, 3, 4, 5, 6]]
>>> a=[[1, 1, 2, 3]]
>>> b=[[7, 3, 2, 1], [4, 5, 6]]
>>> faster_b = [frozenset(x) for x in b]
>>> print(list(filter(not_in_b, a)))
[]
>>> a=[[1, 1, 2, 3], [42, 5, 6]]
>>> print(list(filter(not_in_b, a)))
[[42, 5, 6]]
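The earlier remark that removing an element while iterating over the same list is not the best idea can be illustrated with a small sketch. The sample data and the names mutated and filtered are made up for the demonstration.

```python
a = [[1, 2], [3, 4], [5, 6]]
b = [[1, 2], [3, 4]]

# Mutating the list being iterated: after remove(), the next element shifts
# into the current index, so the loop skips it.
mutated = [list(x) for x in a]
for each in mutated:
    if any(set(each).issubset(item) for item in b):
        mutated.remove(each)
print(mutated)  # [[3, 4], [5, 6]]  ([3, 4] was skipped and wrongly kept)

# Building a new list avoids the problem.
filtered = [each for each in a
            if not any(set(each).issubset(item) for item in b)]
print(filtered)  # [[5, 6]]
```

In the question's sample data the bug does not show up because only the last sublist of a is removed, which is why the original loop appears to work.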