I have a collection of sets. For each combination of sets, I want to find the number of items that are found only in the intersection of that combination and in none of the other sets. Basically, I want to do the same thing as computing the numbers in a Venn diagram.
A basic example might make it clearer.
a = set([1, 2, 5, 10, 12])
b = set([1, 2, 6, 9, 12, 15])
c = set([1, 2, 7, 8, 15])
I should end up with a count of the items found only in each region: a, b, c, a & b, a & c, b & c, and a & b & c.
A non-extensible way of doing this is:
num_a = len(a - b - c) # len(set([5,10])) -> 2
num_b = len(b - a - c) # len(set([6,9])) -> 2
num_c = len(c - a - b) # len(set([7,8])) -> 2
num_ab = len((a & b) - c) # 1
num_ac = len((a & c) - b) # 0
num_bc = len((b & c) - a) # 1
num_abc = len(a & b & c) # 2
While this works for three sets, the number of sets in my collection is not fixed.
IIUC, something like this should work:
from itertools import combinations

def venn_count(named_sets):
    names = set(named_sets)
    for i in range(1, len(named_sets)+1):
        for to_intersect in combinations(sorted(named_sets), i):
            others = names.difference(to_intersect)
            intersected = set.intersection(*(named_sets[k] for k in to_intersect))
            unioned = set.union(*(named_sets[k] for k in others)) if others else set()
            yield to_intersect, others, len(intersected - unioned)

ns = {"a": {1,2,5,10,12}, "b": {1,2,6,9,12,15}, "c": {1,2,7,8,15}}
for intersected, unioned, count in venn_count(ns):
    # print(...) with a single argument works in both Python 2 and Python 3
    print('len({}{}) = {}'.format(
        ' & '.join(sorted(intersected)),
        ' - ' + ' - '.join(sorted(unioned)) if unioned else '',
        count))
which gives
len(a - b - c) = 2
len(b - a - c) = 2
len(c - a - b) = 2
len(a & b - c) = 1
len(a & c - b) = 0
len(b & c - a) = 1
len(a & b & c) = 2
You can use itertools.combinations to get all the possible combinations: http://docs.python.org/2/library/itertools.html
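As a rough sketch of that suggestion, the snippet below lists every non-empty combination of set names, smallest first. The helper name `all_combinations` and the labels "a", "b", "c" are only illustrative; they are not part of itertools.

```python
from itertools import combinations

def all_combinations(names):
    """Return every non-empty combination of names, smallest first."""
    names = sorted(names)
    result = []
    for size in range(1, len(names) + 1):
        # combinations() yields tuples of the given size in sorted order
        result.extend(combinations(names, size))
    return result

for combo in all_combinations(["a", "b", "c"]):
    print(" & ".join(combo))
```

For n sets this produces 2**n - 1 combinations, so the number of regions grows quickly as sets are added.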
I'd try using bit masks:
sets = [
    set([1,2,5,10,12]),
    set([1,2,6,9,12,15]),
    set([1,2,7,8,15]),
]

# Map each item to a bit mask: bit n is set if the item is in sets[n]
d = {}
for n, s in enumerate(sets):
    for i in s:
        d[i] = d.get(i, 0) | (1 << n)

# For each combination (mask), count items that are in all of its sets
for mask in range(1, 2**len(sets)):
    cnt = sum(1 for x in d.values() if x & mask == mask)
    num = ','.join(str(j) for j in range(len(sets)) if mask & (1 << j))
    print('number of items in set(s) %s = %d' % (num, cnt))
Results for your input:
number of items in set(s) 0 = 5
number of items in set(s) 1 = 6
number of items in set(s) 0,1 = 3
number of items in set(s) 2 = 5
number of items in set(s) 0,2 = 2
number of items in set(s) 1,2 = 3
number of items in set(s) 0,1,2 = 2
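Note that these counts include items that also appear in other sets (for example, 5 for set 0 is simply its size). If the Venn-style "only in" counts from the question are wanted, one possible variant is to compare each item's mask for equality instead of testing `x & mask == mask`. The sketch below assumes that reading; the name `exclusive_counts` is hypothetical.

```python
from collections import Counter

def exclusive_counts(sets):
    """Count items per exact membership mask (bit n set = item is in sets[n])."""
    membership = {}
    for n, s in enumerate(sets):
        for item in s:
            membership[item] = membership.get(item, 0) | (1 << n)
    # Each item has exactly one mask, so counting masks gives exclusive regions
    return Counter(membership.values())

sets = [
    {1, 2, 5, 10, 12},
    {1, 2, 6, 9, 12, 15},
    {1, 2, 7, 8, 15},
]
counts = exclusive_counts(sets)
for mask in range(1, 2 ** len(sets)):
    members = ",".join(str(j) for j in range(len(sets)) if mask & (1 << j))
    # A Counter returns 0 for masks that no item has
    print("number of items only in set(s) %s = %d" % (members, counts[mask]))
```

This makes a single pass over the items and then looks up each region, rather than rescanning all items for every mask.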