Python: finding sets with shared elements
My data is a set of frozensets, for example:
data = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([100,200]), frozenset([1,1000, 2000])])
and the expected result is the set of frozensets that share at least one element with another frozenset, i.e.
result = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([1,1000, 2000])])
Here frozenset([100,200]) is left out because it shares no elements with any of the other frozensets. What is an efficient way to achieve this?
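You can build a dict of the set elements with collections.Counter to count how many times each element is found, then remove any frozenset whose elements all have a count of 1.
This has the advantage of being O(n), where n is the total number of elements across all the sets.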
from collections import Counter
data = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([100,200]), frozenset([1,1000, 2000])])
counts = Counter(elt for fs in data for elt in fs)
result = {fs for fs in data if any(counts[elt] > 1 for elt in fs)}
# {frozenset({1, 2, 3, 4}), frozenset({1000, 1, 2000}), frozenset({3, 4, 5, 6, 7, 8})}
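The counting approach above can be wrapped in a small helper and checked against the expected result from the question. This is only a sketch: the function name sets_with_shared_elements is hypothetical, and it assumes the elements are hashable and that duplicate sets in the input should be counted once.

```python
from collections import Counter

def sets_with_shared_elements(sets):
    """Return the sets that share at least one element with another set."""
    sets = set(sets)  # drop duplicate sets so they are not counted twice
    # Count in how many distinct sets each element appears.
    counts = Counter(elt for fs in sets for elt in fs)
    # Keep a set if any of its elements appears in more than one set.
    return {fs for fs in sets if any(counts[elt] > 1 for elt in fs)}

data = {frozenset([1, 2, 3, 4]), frozenset([3, 4, 5, 6, 7, 8]),
        frozenset([100, 200]), frozenset([1, 1000, 2000])}
expected = {frozenset([1, 2, 3, 4]), frozenset([3, 4, 5, 6, 7, 8]),
            frozenset([1, 1000, 2000])}

result = sets_with_shared_elements(data)
print(result == expected)  # True
```

Each element is visited once while counting and at most once while filtering, which is where the linear cost comes from.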
I would use a set comprehension with a check like this (for each item, check whether it shares an element with at least one other item):
data = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([100,200]), frozenset([1,1000, 2000])])
new_data = {x for x in data if any(not x.isdisjoint(y) for y in data if y!=x)}
print(new_data)
Result:
{frozenset({1, 2, 3, 4}), frozenset({3, 4, 5, 6, 7, 8}), frozenset({1000, 1, 2000})}
There may be more efficient solutions, but at least the isdisjoint part is handled by the efficient built-in set routines.
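To make the isdisjoint check concrete, here is a minimal illustration using three of the frozensets from the question. The variable names a, b, c, overlaps and same_as_intersection are only for this sketch.

```python
a = frozenset([1, 2, 3, 4])
b = frozenset([3, 4, 5, 6, 7, 8])
c = frozenset([100, 200])

print(a.isdisjoint(b))  # False: 3 and 4 are in both sets
print(a.isdisjoint(c))  # True: the sets have nothing in common

# "not isdisjoint" gives the same answer as testing for a non-empty
# intersection, without building the intersection set.
overlaps = not a.isdisjoint(b)
same_as_intersection = overlaps == bool(a & b)
print(same_as_intersection)  # True
```

Note that the comprehension compares every set with every other set, so the number of isdisjoint calls grows quadratically with the number of sets.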
Here is my version. It has no particular advantage, but you may find it more readable.
data = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([100,200]), frozenset([1,1000, 2000])])
result = set()
for item in data:
    for element in item:
        for other_item in data:
            # Skip the item itself and items already added to the result.
            if item != other_item and item not in result:
                # A shared element means the item belongs in the result.
                if element in other_item:
                    result.add(item)
                    break
print(result)
# {frozenset({1, 2, 3, 4}), frozenset({1000, 1, 2000}), frozenset({3, 4, 5, 6, 7, 8})}