
Python: finding sets with shared elements

My data is a set of frozensets, for example:

data = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([100,200]), frozenset([1,1000, 2000])])

The expected result is the set of frozensets that share at least one element with another frozenset, i.e.

result = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]),  frozenset([1,1000, 2000])])

Here frozenset([100,200]) is left out because it shares no elements with any other frozenset. What is an efficient way to achieve this?

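You can use collections.Counter to build a dict that counts how many times each element occurs across all the sets, then drop any frozenset whose elements all have a count of 1.

The advantage is that this runs in O(n), where n is the total number of elements across all sets.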

from collections import Counter

data = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([100,200]), frozenset([1,1000, 2000])])
counts = Counter(elt for fs in data for elt in fs)
result = {fs for fs in data if any(counts[elt] > 1 for elt in fs)}

# {frozenset({1, 2, 3, 4}), frozenset({1000, 1, 2000}), frozenset({3, 4, 5, 6, 7, 8})}
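
As a minimal sketch, the same counting idea could be wrapped in a reusable function. The name sets_with_shared_elements is hypothetical and not part of the original answer. The sketch also deduplicates its input first, on the assumption that a caller might pass a list in which the same frozenset appears twice, which would otherwise make every one of its elements look shared.

```python
from collections import Counter


def sets_with_shared_elements(sets):
    """Return the frozensets that share at least one element with another one.

    Hypothetical helper for illustration; the name is not from the original answer.
    """
    # Deduplicate first so a frozenset listed twice is not counted against itself.
    unique = set(sets)
    # Count in how many distinct frozensets each element appears.
    counts = Counter(elt for fs in unique for elt in fs)
    # Keep a frozenset if any of its elements also appears in another frozenset.
    return {fs for fs in unique if any(counts[elt] > 1 for elt in fs)}


data = {frozenset([1, 2, 3, 4]), frozenset([3, 4, 5, 6, 7, 8]),
        frozenset([100, 200]), frozenset([1, 1000, 2000])}
result = sets_with_shared_elements(data)
print(frozenset([100, 200]) in result)  # False
print(len(result))  # 3
```

Each element is visited once while counting and at most once more while filtering, which is where the O(n) bound comes from.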

I would use a set comprehension with a check like this (for each item, check whether it shares an element with at least one other item):

data = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([100,200]), frozenset([1,1000, 2000])])

new_data = {x for x in data if any(not x.isdisjoint(y) for y in data if y!=x)}

print(new_data)

Result:

{frozenset({1, 2, 3, 4}), frozenset({3, 4, 5, 6, 7, 8}), frozenset({1000, 1, 2000})}

There may be more efficient solutions, but at least the disjointness check is handled by the efficient built-in set routines.
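
One modest refinement, sketched here under assumptions rather than taken from the original answer, is to visit each unordered pair only once with itertools.combinations instead of comparing every item against every other item in both directions. The function name shared_pairwise is hypothetical, and the input is assumed to contain distinct frozensets (as it does when it is a set). The approach is still quadratic in the number of frozensets.

```python
from itertools import combinations


def shared_pairwise(sets):
    """Hypothetical helper: pairwise overlap check, visiting each pair once."""
    result = set()
    for a, b in combinations(sets, 2):
        # isdisjoint returns False as soon as one common element is found.
        if not a.isdisjoint(b):
            # An overlap qualifies both members of the pair.
            result.add(a)
            result.add(b)
    return result


data = {frozenset([1, 2, 3, 4]), frozenset([3, 4, 5, 6, 7, 8]),
        frozenset([100, 200]), frozenset([1, 1000, 2000])}
new_data = shared_pairwise(data)
print(frozenset([100, 200]) in new_data)  # False
print(len(new_data))  # 3
```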

Here is my version. It has no particular advantage, but you may find it more readable.

data = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([100,200]), frozenset([1,1000, 2000])])
result = set()

for item in data:
    for element in item:
        for other_item in data:
            # Skip the set itself and any set that has already been kept.
            if item != other_item and item not in result:
                # The element also appears in another set, so keep this one.
                if element in other_item:
                    result.add(item)
                    break

print(result)
# {frozenset({1, 2, 3, 4}), frozenset({1000, 1, 2000}), frozenset({3, 4, 5, 6, 7, 8})}

