Counting intersections for all combinations in a list of sets

Question

I have a collection of sets. I want to count the items that are found only in the intersection of each combination of sets. Basically, I want to do the same thing as computing the numbers in a Venn diagram.

A basic example might make it clearer.

a = set([1,2,5,10,12])
b = set([1,2,6,9,12,15])
c = set([1,2,7,8,15])

I should end up with a count of items found only in:

a
b
c
the intersection of a and b
the intersection of a and c
the intersection of b and c
the intersection of a, b and c

A non-extensible way of doing this is

num_a = len(a - b - c)  # len(set([5,10])) -> 2
num_b = len(b - a - c)  # len(set([6,9])) -> 2
num_c = len(c - a - b)  # len(set([7,8])) -> 2

num_ab = len((a & b) - c)  # 1
num_ac = len((a & c) - b)  # 0
num_bc = len((b & c) - a)  # 1

num_abc = len(a & b & c)  # 2

While this works for 3 sets, the number of sets in my collection is not fixed.

Answer 1

IIUC, something like this should work:

from itertools import combinations

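def venn_count(named_sets):
    # named_sets maps a name to a set; one result is yielded per
    # non-empty combination of names
    names = set(named_sets)
    for i in range(1, len(named_sets)+1):
        for to_intersect in combinations(sorted(named_sets), i):
            # every set that is not part of this combination
            others = names.difference(to_intersect)
            intersected = set.intersection(*(named_sets[k] for k in to_intersect))
            unioned = set.union(*(named_sets[k] for k in others)) if others else set()
            # items in all of the chosen sets and in none of the others
            yield to_intersect, others, len(intersected - unioned)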


ns = {"a": {1,2,5,10,12}, "b": {1,2,6,9,12,15}, "c": {1,2,7,8,15}}
for intersected, unioned, count in venn_count(ns):
    print('len({}{}) = {}'.format(' & '.join(sorted(intersected)),
                                  ' - ' + ' - '.join(sorted(unioned)) if unioned else '',
                                  count))

which gives

len(a - b - c) = 2
len(b - a - c) = 2
len(c - a - b) = 2
len(a & b - c) = 1
len(a & c - b) = 0
len(b & c - a) = 1
len(a & b & c) = 2

Answer 2

You can use itertools.combinations to get all the possible combinations. http://docs.python.org/2/library/itertools.html
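As a minimal sketch of that idea, the snippet below lists every non-empty combination of set names by calling itertools.combinations once per combination size. The helper name all_combinations and the names "a", "b", "c" are only illustrative; the per-combination counting still has to be done separately.

```python
from itertools import combinations

def all_combinations(names):
    """Return every non-empty combination of names, smallest size first."""
    names = sorted(names)
    return [combo
            for size in range(1, len(names) + 1)
            for combo in combinations(names, size)]

combos = all_combinations(["a", "b", "c"])
for combo in combos:
    print(' & '.join(combo))
```

For n sets this produces 2**n - 1 combinations, so the number of counts grows quickly as sets are added.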

Answer 3

I'd try using bit masks:

sets = [
    set([1,2,5,10,12]),
    set([1,2,6,9,12,15]),
    set([1,2,7,8,15]),
]

d = {}

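# d maps each item to a bit mask: bit n is set if the item is in sets[n]
for n, s in enumerate(sets):
    for i in s:
        d[i] = d.get(i, 0) | (1 << n)

# each mask from 1 to 2**len(sets) - 1 selects one non-empty combination of sets
for mask in range(1, 2**len(sets)):
    # count the items that are in all of the selected sets (and possibly in others)
    cnt = sum(1 for x in d.values() if x & mask == mask)
    num = ','.join(str(j) for j in range(len(sets)) if mask & (1 << j))
    print('number of items in set(s) %s = %d' % (num, cnt))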

Results for your input:

number of items in set(s) 0 = 5
number of items in set(s) 1 = 6
number of items in set(s) 0,1 = 3
number of items in set(s) 2 = 5
number of items in set(s) 0,2 = 2
number of items in set(s) 1,2 = 3
number of items in set(s) 0,1,2 = 2
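Note that these numbers count every item in each intersection, including items that also belong to other sets. If the Venn-style counts from the question are wanted instead (items found only in that combination), one possible variation is to require the item's mask to equal the combination's mask exactly. The sketch below assumes that reading of the question; the function name exclusive_counts and the use of collections.Counter are my own choices, not part of the answer above.

```python
from collections import Counter

def exclusive_counts(sets):
    """Map each combination of set indices to the number of items
    that are in exactly those sets and in no others."""
    membership = {}
    for n, s in enumerate(sets):
        for item in s:
            # bit n is set if the item is in sets[n]
            membership[item] = membership.get(item, 0) | (1 << n)
    # how many items share each exact membership mask
    by_mask = Counter(membership.values())
    return {
        tuple(j for j in range(len(sets)) if mask & (1 << j)): by_mask[mask]
        for mask in range(1, 2 ** len(sets))
    }

sets = [
    {1, 2, 5, 10, 12},
    {1, 2, 6, 9, 12, 15},
    {1, 2, 7, 8, 15},
]

counts = exclusive_counts(sets)
for combo, cnt in sorted(counts.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print('only in set(s) %s = %d' % (','.join(str(j) for j in combo), cnt))
```

Each item is visited once per set it belongs to, and a mask that no item has simply counts as 0 because Counter returns 0 for missing keys.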

3 answers

solution1
3 ACCEPTED 2013-03-21 17:44:38

solution2
1 2013-03-21 17:08:10

solution3
1 2013-03-21 17:50:24
