Union of Two Sets of Sets

I would like to get the union of two sets of sets, that is, the union of every pair of frozensets with one taken from each set. Here is the code I have so far in Python. I would like it to be as fast as possible, as I'm working with a large dataset. Each frozenset has no more than 20 elements, but there will be around 50,000 frozensets in a set. All numbers will be between 0 and 100. I expect a fair number of frozensets to already be in both sets, or to produce a merged result that already exists in one of the sets. I'm open to converting to other types if that would make my program run faster, but I don't want any repeated elements, and order is not important.

sets1 = set([frozenset([1,2,3]),frozenset([4,5,6])])
sets2 = set([frozenset([8,9,10]),frozenset([6,7,3])])
newSets = set()
for fset in sets1:
    for fset2 in sets2:
        newS = set(fset)
        newS.update(fset2)
        newSets.add(frozenset(newS))

The correct output is {frozenset({1, 2, 3, 8, 9, 10}), frozenset({1, 2, 3, 6, 7}), frozenset({3, 4, 5, 6, 7}), frozenset({4, 5, 6, 8, 9, 10})}

You can avoid the temporary set instance and the conversion to a frozenset by directly "or"ing the frozensets:

newSets = set()
for fset in sets1:
    for fset2 in sets2:
        newSets.add(fset | fset2)
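
As a quick sanity check, the sketch below runs both versions on the sample data from the question and compares the results. The names slow and fast are only illustrative; nothing here measures speed.

```python
sets1 = {frozenset([1, 2, 3]), frozenset([4, 5, 6])}
sets2 = {frozenset([8, 9, 10]), frozenset([6, 7, 3])}

# Original approach: copy into a temporary set, update it, then freeze it.
slow = set()
for fset in sets1:
    for fset2 in sets2:
        tmp = set(fset)
        tmp.update(fset2)
        slow.add(frozenset(tmp))

# Direct approach: frozenset | frozenset already returns a frozenset,
# so the result can be added to the outer set without any conversion.
fast = set()
for fset in sets1:
    for fset2 in sets2:
        fast.add(fset | fset2)

print(fast == slow)                      # both approaches build the same set
print(type(next(iter(fast))).__name__)   # element type of the result
```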

A further (slight) speedup can be achieved by using a set comprehension:

newSets = {fset | fset2 for fset in sets1 for fset2 in sets2}
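
Since the question says all numbers are between 0 and 100 and that other types are acceptable, one alternative worth timing is to encode each frozenset as a single Python int used as a bitmask, so that the union becomes an integer `|`. This is only a sketch under that assumption, not something from the answer above: to_mask and from_mask are hypothetical helper names, and whether it is actually faster on the real data would need to be measured (for example with timeit).

```python
def to_mask(fset):
    """Encode a frozenset of small non-negative ints as one int bitmask."""
    mask = 0
    for n in fset:
        mask |= 1 << n  # set bit n to mark that n is present
    return mask


def from_mask(mask):
    """Decode a bitmask back into a frozenset of ints."""
    return frozenset(i for i in range(mask.bit_length()) if (mask >> i) & 1)


sets1 = {frozenset([1, 2, 3]), frozenset([4, 5, 6])}
sets2 = {frozenset([8, 9, 10]), frozenset([6, 7, 3])}

# Convert once up front; duplicates collapse because equal sets give equal ints.
masks1 = {to_mask(s) for s in sets1}
masks2 = {to_mask(s) for s in sets2}

# Pairwise union is now a plain integer OR.
new_masks = {a | b for a in masks1 for b in masks2}

# Convert back only if frozensets are needed downstream.
new_sets = {from_mask(m) for m in new_masks}
```

If the rest of the program can work with the integer masks directly, the final conversion back to frozensets can be skipped.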
