Get Union of all Sets that don't Intersect more efficiently
I would like to get the union of two sets of frozensets. I'm only interested in the unions of frozensets that don't intersect. Another way to look at it is that I'm only interested in unions whose length equals the combined length of both frozensets.

Ideally I would like to skip any pair of frozensets that do intersect with each other, for a massive speedup. I expect many frozensets to have at least one element in common.

Here is the code I have so far in Python. I would like it to be as fast as possible, as I'm working with a large dataset. Each of the frozensets has no more than 20 elements, but there will be somewhere around 1,000 frozensets in each set. All numbers will be between 0 and 100. I'm open to converting to other types if that would make my program run faster, but I don't want any repeated elements, and order is not important.
sets1 = set([frozenset([1,2,3]),frozenset([4,5,6]),frozenset([8,10,11])])
sets2 = set([frozenset([8,9,10]),frozenset([6,7,3])])
newSets = set()
for fset in sets1:
    for fset2 in sets2:
        newSet = fset.union(fset2)
        if len(newSet) == len(fset)+len(fset2):
            newSets.add(frozenset(newSet))
The correct output is:
set([frozenset([1,2,3,8,9,10]),frozenset([4,5,6,8,9,10]),frozenset([8,10,11,6,7,3])])
sets1 = set([frozenset([1,2,3]),frozenset([4,5,6]),frozenset([8,10,11])])
sets2 = set([frozenset([8,9,10]),frozenset([6,7,3])])
union_ = set()
for s1 in sets1:
    for s2 in sets2:
        # isdisjoint() is True when the two sets share no element,
        # so the union is only built for pairs that are kept
        if s1.isdisjoint(s2):
            union_.add(s1 | s2)
print(union_)
{frozenset({3, 6, 7, 8, 10, 11}), frozenset({1, 2, 3, 8, 9, 10}), frozenset({4, 5, 6, 8, 9, 10})}
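Since the question says all values lie between 0 and 100 and that other types are acceptable, one more option is to encode each frozenset as an integer bitmask. Two masks are disjoint exactly when `a & b == 0`, and their union is `a | b`. The following is only a sketch under that assumption (small non-negative integers); the helper names `to_mask`, `from_mask` and `disjoint_unions` are made up for illustration, and whether it actually beats `isdisjoint` on your data is something to measure rather than assume.

```python
def to_mask(fs):
    """Encode a set of small non-negative ints as an integer bitmask."""
    mask = 0
    for x in fs:
        mask |= 1 << x
    return mask


def from_mask(mask):
    """Decode a bitmask back into a frozenset of ints."""
    return frozenset(i for i in range(mask.bit_length()) if (mask >> i) & 1)


def disjoint_unions(sets1, sets2):
    """Return the unions of all pairs (s1, s2) that share no element."""
    masks1 = [to_mask(s) for s in sets1]
    masks2 = [to_mask(s) for s in sets2]
    # a & b == 0 means no common element; a | b is the union of the pair
    merged = {a | b for a in masks1 for b in masks2 if a & b == 0}
    return {from_mask(m) for m in merged}


sets1 = {frozenset([1, 2, 3]), frozenset([4, 5, 6]), frozenset([8, 10, 11])}
sets2 = {frozenset([8, 9, 10]), frozenset([6, 7, 3])}
print(disjoint_unions(sets1, sets2))
```

If the rest of the program can work with the masks directly, the final `from_mask` conversion can be dropped, which is where most of the potential saving would come from.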