Compute the intersection of all combinations in a list of sets

Question

I have a collection of sets. I want to find the number of items that are found only in the intersection for each combination of sets. I basically want to do the same thing as computing the numbers in a Venn diagram.

A basic example might make it clearer.

a = set([1,2,5,10,12])
b = set([1,2,6,9,12,15])
c = set([1,2,7,8,15])

I should end up with a count of the items found only in:

a
b
c
the intersection of a and b
the intersection of a and c
the intersection of b and c
the intersection of a, b and c

A non-extensible way of doing this is:

num_a = len(a - b - c)  # len(set([5,10])) -> 2
num_b = len(b - a - c)  # len(set([6,9])) -> 2
num_c = len(c - a - b)  # len(set([7,8])) -> 2

num_ab = len((a & b) - c)  # 1
num_ac = len((a & c) - b)  # 0
num_bc = len((b & c) - a)  # 1

num_abc = len(a & b & c)  # 2

While this works for 3 sets, my collection of sets is not static.

Answer 1

IIUC, something like this should work:

from itertools import combinations

def venn_count(named_sets):
    names = set(named_sets)
    for i in range(1, len(named_sets)+1):
        for to_intersect in combinations(sorted(named_sets), i):
            others = names.difference(to_intersect)
            intersected = set.intersection(*(named_sets[k] for k in to_intersect))
            unioned = set.union(*(named_sets[k] for k in others)) if others else set()
            yield to_intersect, others, len(intersected - unioned)


ns = {"a": {1,2,5,10,12}, "b": {1,2,6,9,12,15}, "c": {1,2,7,8,15}}
for intersected, unioned, count in venn_count(ns):
    print('len({}{}) = {}'.format(' & '.join(sorted(intersected)),
                                  ' - ' + ' - '.join(sorted(unioned)) if unioned else '',
                                  count))

which gives

len(a - b - c) = 2
len(b - a - c) = 2
len(c - a - b) = 2
len(a & b - c) = 1
len(a & c - b) = 0
len(b & c - a) = 1
len(a & b & c) = 2
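
One way to sanity-check these numbers: the Venn regions are disjoint and together cover the union of all the sets, so the counts should add up to the size of the union. The sketch below restates the generator above in compact form so that it runs on its own under Python 3. `region_counts` is a hypothetical helper name, not something from the answer.

```python
from itertools import combinations

def region_counts(named_sets):
    """Map each combination of names to the number of items found only there."""
    names = sorted(named_sets)
    counts = {}
    for size in range(1, len(names) + 1):
        for chosen in combinations(names, size):
            # Items shared by every chosen set...
            inside = set.intersection(*(named_sets[k] for k in chosen))
            # ...minus anything that also appears in a set that was not chosen.
            outside = set().union(*(named_sets[k] for k in names if k not in chosen))
            counts[chosen] = len(inside - outside)
    return counts

ns = {"a": {1, 2, 5, 10, 12}, "b": {1, 2, 6, 9, 12, 15}, "c": {1, 2, 7, 8, 15}}
counts = region_counts(ns)

# The regions partition the union, so the counts add up to its size.
total = sum(counts.values())
union_size = len(set().union(*ns.values()))
print(total, union_size)
```

For the example sets both numbers are 10.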

Answer 2

You can use itertools.combinations to get all the possible combinations: http://docs.python.org/2/library/itertools.html
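
As a rough illustration of that suggestion, the sketch below uses `itertools.combinations` to enumerate every non-empty combination of set names. The names `a`, `b` and `c` come from the question; `all_combos` is just an illustrative variable. It only lists the combinations, so computing each count would still need the intersection and difference steps.

```python
from itertools import combinations

names = ["a", "b", "c"]

# Every non-empty combination: singles first, then pairs, then the full triple.
all_combos = [
    combo
    for size in range(1, len(names) + 1)
    for combo in combinations(names, size)
]

for combo in all_combos:
    print(" & ".join(combo))
```

With n sets this yields 2**n - 1 combinations, so the approach grows quickly as sets are added.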

Answer 3

I'd try using bit masks:

sets = [
    set([1,2,5,10,12]),
    set([1,2,6,9,12,15]),
    set([1,2,7,8,15]),
]

d = {}

for n, s in enumerate(sets):
    for i in s:
        d[i] = d.get(i, 0) | (1 << n)

for mask in range(1, 2**len(sets)):
    cnt = sum(1 for x in d.values() if x & mask == mask)
    num = ','.join(str(j) for j in range(len(sets)) if mask & (1 << j))
    print('number of items in set(s) %s = %d' % (num, cnt))

Results for your input:

number of items in set(s) 0 = 5
number of items in set(s) 1 = 6
number of items in set(s) 0,1 = 3
number of items in set(s) 2 = 5
number of items in set(s) 0,2 = 2
number of items in set(s) 1,2 = 3
number of items in set(s) 0,1,2 = 2
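
Note that these figures count every item in each intersection, including items that also occur in other sets, so they differ from the "found only in" counts the question asks for. A possible adaptation, sketched here as an assumption rather than as part of the answer, is to compare each item's mask for equality, so that an item is counted only for the exact combination of sets that contain it. `exclusive_counts` is a hypothetical name.

```python
from collections import Counter

def exclusive_counts(sets):
    """Count items per exact membership mask (bit n set means 'in sets[n]')."""
    membership = {}
    for n, s in enumerate(sets):
        for item in s:
            membership[item] = membership.get(item, 0) | (1 << n)
    # Each item has exactly one mask, so tallying masks gives the Venn regions.
    return Counter(membership.values())

sets = [
    {1, 2, 5, 10, 12},
    {1, 2, 6, 9, 12, 15},
    {1, 2, 7, 8, 15},
]
tally = exclusive_counts(sets)

for mask in range(1, 2 ** len(sets)):
    members = ",".join(str(j) for j in range(len(sets)) if mask & (1 << j))
    print("items only in set(s) %s = %d" % (members, tally[mask]))
```

Because the tally is built in one pass over the items, the 2**n loop is only needed for printing; regions with no items simply read as 0 from the `Counter`.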

3 answers

Answer 1: 3 votes, accepted, 2013-03-21 17:44:38
Answer 2: 1 vote, 2013-03-21 17:08:10
Answer 3: 1 vote, 2013-03-21 17:50:24
