Intersect a set with a list of sets in Python
I have a set s and a list of sets l, as below.
s = {1,2,3,4}
l = [{1}, {1,2,3}, {3}]
The output should be
out = [{1}, {1,2,3}, {3}]
I am using the following code to accomplish it, but I was hoping there would be a faster way. Perhaps some sort of broadcasting?
out = [i.intersection(s) for i in l]
EDIT
The list l can be up to 1000 elements long.
My end objective is to create a matrix holding the sizes of the pairwise intersections of the elements of l, so s is itself an element of l.
out_matrix = list()
for s in l:
    out_matrix.append([len(i.intersection(s)) for i in l])
My first thought when reading this question was "sure, use numpy". Then I decided to do some tests:
import numpy as np
from timeit import Timer

s = {1, 2, 3, 4}
l = [{1}, {1, 2, 3}, {3}] * 1000  # 3000 elements
arr = np.array(l)

def list_comp():
    [i.intersection(s) for i in l]

def numpy_arr():
    arr & s

print(min(Timer(list_comp).repeat(500, 500)))
print(min(Timer(numpy_arr).repeat(500, 500)))
This outputs
# 0.05513364499999995
# 0.035647999999999236
So numpy is indeed a bit faster. Is it really worth it? Not sure. A difference of ~0.02 seconds for a 3000-element list is negligible, especially considering that my test didn't even take into account the time it took to create arr.
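To get a feel for the caveat about array creation, the comparison can be repeated with the conversion moved inside the timed function. This is only a sketch: the function name numpy_with_creation and the much smaller repetition counts are assumptions chosen to keep the run short, and the numbers will differ from machine to machine.

```python
import numpy as np
from timeit import Timer

s = {1, 2, 3, 4}
l = [{1}, {1, 2, 3}, {3}] * 1000  # 3000 elements

def list_comp():
    return [i.intersection(s) for i in l]

def numpy_with_creation():
    # Build the object array inside the timed function so that
    # the cost of converting the list is counted as well.
    return np.array(l) & s

# Fewer repetitions than the test above, purely to keep this quick.
t_list = min(Timer(list_comp).repeat(5, 100))
t_numpy = min(Timer(numpy_with_creation).repeat(5, 100))
print("list comprehension:      ", t_list)
print("numpy incl. array build: ", t_numpy)
```

Whichever variant comes out ahead, both produce the same intersections, so the choice is purely about speed.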
Keep in mind that even when using numpy we are still in O(n) territory. The difference comes from numpy pushing the for loop down to the C level, which is inherently faster than a Python for loop.
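For the pairwise matrix described in the question's edit, a different route (not the approach tested above) is to encode each set as a 0/1 membership row and let a matrix product count the shared elements. The sketch below assumes the elements are hashable and that the number of distinct elements is small enough for a dense matrix; the helper name pairwise_intersection_sizes is made up for illustration.

```python
import numpy as np

def pairwise_intersection_sizes(sets):
    """Return an n x n array whose [i, j] entry is len(sets[i] & sets[j])."""
    # Give every distinct element its own column index.
    universe = set().union(*sets)
    col = {e: j for j, e in enumerate(universe)}

    # Membership matrix: row i has a 1 in each column whose element is in sets[i].
    m = np.zeros((len(sets), len(universe)), dtype=np.int64)
    for i, st in enumerate(sets):
        for e in st:
            m[i, col[e]] = 1

    # The dot product of two 0/1 rows counts the elements they share.
    return m @ m.T

l = [{1}, {1, 2, 3}, {3}]
out_matrix = pairwise_intersection_sizes(l)
print(out_matrix)
# [[1 1 0]
#  [1 3 1]
#  [0 1 1]]
```

Building the membership matrix is still a Python loop, but the n x n pairwise work is done by a single matrix multiplication instead of n * n calls to intersection.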