
Get Union of all Sets that don't Intersect more efficiently

I would like to get the unions of two sets of frozensets. I'm only interested in the unions of frozensets that don't intersect. Another way to look at it is that I'm only interested in unions whose length equals the combined length of the two frozensets. Ideally I would like to skip any frozensets that do intersect with each other, for a massive speedup. I expect many frozensets to have at least one element in common.

I would like this to be as fast as possible, as I'm working with a large dataset. Each frozenset has no more than 20 elements, but there will be somewhere around 1,000 frozensets in a set. All numbers will be between 0 and 100. I'm open to converting to other types if that would make my program run faster, but I don't want any repeated elements, and order is not important. Here is the code I have so far in Python:

sets1 = set([frozenset([1,2,3]),frozenset([4,5,6]),frozenset([8,10,11])])
sets2 = set([frozenset([8,9,10]),frozenset([6,7,3])])
newSets = set()
for fset in sets1:
    for fset2 in sets2:
        newSet = fset.union(fset2)
        if len(newSet) == len(fset)+len(fset2):
            newSets.add(frozenset(newSet))

The correct output is:

set([frozenset([1,2,3,8,9,10]),frozenset([4,5,6,8,9,10]),frozenset([8,10,11,6,7,3])])

sets1 = set([frozenset([1,2,3]),frozenset([4,5,6]),frozenset([8,10,11])])
sets2 = set([frozenset([8,9,10]),frozenset([6,7,3])])
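# Test disjointness first, so the union is only built for pairs that share no element.
union_ = set()
for s1 in sets1:
    for s2 in sets2:
        if s1.isdisjoint(s2):
            union_.add(s1 | s2)

print(union_)

Output (the order of elements in a set may vary):

{frozenset({3, 6, 7, 8, 10, 11}), frozenset({1, 2, 3, 8, 9, 10}), frozenset({4, 5, 6, 8, 9, 10})}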
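Since the question states that every number is between 0 and 100 and that converting to another type is acceptable, one possible alternative is to encode each frozenset as a Python integer bitmask. Two masks are then disjoint exactly when `a & b == 0`, and their union is `a | b`. The following is only a sketch under that assumption: the helper names `to_mask`, `from_mask`, and `disjoint_unions` are made up for illustration, and whether this is actually faster than `isdisjoint` on real data has not been measured here, so it should be timed before being adopted.

```python
def to_mask(fs):
    """Encode a frozenset of small non-negative ints as an int bitmask."""
    mask = 0
    for x in fs:
        mask |= 1 << x  # set bit x
    return mask


def from_mask(mask):
    """Decode a bitmask back into a frozenset of ints."""
    return frozenset(i for i in range(mask.bit_length()) if (mask >> i) & 1)


def disjoint_unions(sets1, sets2):
    """Return the unions of all pairs (one from each input) with no shared element."""
    # Convert once up front so the inner loop only does integer operations.
    masks1 = [to_mask(s) for s in sets1]
    masks2 = [to_mask(s) for s in sets2]
    # a & b == 0 means no common element; a | b is then the union.
    merged = {a | b for a in masks1 for b in masks2 if a & b == 0}
    return {from_mask(m) for m in merged}


sets1 = {frozenset([1, 2, 3]), frozenset([4, 5, 6]), frozenset([8, 10, 11])}
sets2 = {frozenset([8, 9, 10]), frozenset([6, 7, 3])}
result = disjoint_unions(sets1, sets2)
print(result)
```

If the unions feed into further set operations, it may be worth keeping them as integers and calling `from_mask` only when the actual elements are needed.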
