Producing all groups of fixed-length combinations

I'm looking for an algorithm and/or Python code to generate all possible ways of partitioning a set of n elements into zero or more groups of r elements and a remainder. For example, given a set:

[1,2,3,4,5]

with n = 5 and r = 2, I would like to get

((1,2,3,4,5),)
((1,2),(3,4,5))
((1,3),(2,4,5))
...
((1,2),(3,4),(5,))
((1,2),(3,5),(4,))
...

In other words, the result of extracting 0 groups of two items from the set, plus the results of extracting 1 group of two items from the set, plus the results of extracting 2 groups of two from the set, and so on. If n were larger, this would continue.

The order in which these results are generated is not important, and neither is the order of elements within each individual group, nor the order of the groups within a result. (E.g. ((1,3),(2,4,5)) is equivalent to ((3,1),(4,5,2)) and to ((2,5,4),(1,3)) and so on.) What I'm looking for is that every distinct result is produced at least once, and preferably exactly once, in as efficient a manner as possible.


The brute force method is to generate all possible combinations of r out of the n elements, then create all possible groups of any number of those combinations (the powerset), iterate over them and only process the ones where the combinations in the group have no elements in common. That takes far too long for even a small number of elements (it requires iterating over 2^(n!/(r!(n-r)!)) groups, so the complexity is double-exponential).
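For reference, the brute force described above might look roughly like the sketch below. The name brute_force_groups is made up for illustration, and the remainder is left implicit (it is whatever the chosen combinations do not cover). It is only usable for very small n.

```python
from itertools import chain, combinations

def brute_force_groups(items, r):
    """Yield every set of pairwise-disjoint r-combinations of items."""
    combos = list(combinations(items, r))
    # Powerset of the combinations: 2 ** len(combos) candidate groups.
    candidates = chain.from_iterable(
        combinations(combos, k) for k in range(len(combos) + 1)
    )
    for group in candidates:
        used = [x for c in group for x in c]
        # Keep the group only if no element appears in two combinations.
        if len(used) == len(set(used)):
            yield group

# n = 5, r = 2: there are 10 combinations, so 2 ** 10 = 1024 candidates are inspected.
print(sum(1 for _ in brute_force_groups(range(1, 6), 2)))  # 26
```

Only 26 of the 1024 candidates survive the disjointness check here, and the gap widens very quickly as n grows.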

Based on the code given in this question, which is essentially the special case for r = 2 and n even, I've come up with the following:

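from itertools import combinations

def distinct_combination_groups(iterable, r):
    tpl = tuple(iterable)
    # Zero groups extracted: everything is the remainder.
    yield (tpl,)
    if len(tpl) > r:
        # Extract one group of r elements, then recurse on what is left.
        for c in combinations(tpl, r):
            for g in distinct_combination_groups(set(tpl) - set(c), r):
                yield ((c,) + g)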

which does seem to generate all possible results, but it includes some duplicates, a nontrivial number of them when n is fairly large. So I'm hoping for an algorithm that will avoid the duplicates.
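To put a number on the duplicates, one option is to map every result to a canonical form that ignores ordering and compare counts. This is only a sketch: the generator is repeated so the snippet runs on its own, and canonical is a hypothetical helper, not part of the original code.

```python
from itertools import combinations

def distinct_combination_groups(iterable, r):
    # The generator from the question, reproduced so the snippet is self-contained.
    tpl = tuple(iterable)
    yield (tpl,)
    if len(tpl) > r:
        for c in combinations(tpl, r):
            for g in distinct_combination_groups(set(tpl) - set(c), r):
                yield (c,) + g

def canonical(result):
    # Hypothetical helper: ignore order inside groups and order of the groups.
    return frozenset(frozenset(group) for group in result)

results = list(distinct_combination_groups([1, 2, 3, 4, 5], 2))
distinct = {canonical(res) for res in results}
print(len(results), len(distinct))  # 41 26
```

For n = 5 and r = 2, each result with two groups is produced twice (once per order in which the groups were extracted), which accounts for the 15 extra results.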

How about this?

from itertools import combinations

def partitions(s, r):
    """
    Generate partitions of the iterable `s` into subsets of size `r`.

    >>> list(partitions(set(range(4)), 2))
    [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    """
    s = set(s)
    assert(len(s) % r == 0)
    if len(s) == 0:
        yield ()
        return
    first = next(iter(s))
    rest = s.difference((first,))
    for c in combinations(rest, r - 1):
        first_subset = (first,) + c
        for p in partitions(rest.difference(c), r):
            yield (first_subset,) + p

def partitions_with_remainder(s, r):
    """
    Generate partitions of the iterable `s` into subsets of size
    `r` plus a remainder.

    >>> list(partitions_with_remainder(range(4), 2))
    [((0, 1, 2, 3),), ((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    >>> list(partitions_with_remainder(range(3), 2))
    [((0, 1, 2),), ((1, 2), (0,)), ((0, 2), (1,)), ((0, 1), (2,))]
    """
    s = set(s)
    for n in range(len(s), -1, -r):  # n is the size of the remainder.
        if n == 0:
            for p in partitions(s, r):
                yield p
        elif n != r:
            for remainder in combinations(s, n):
                for p in partitions(s.difference(remainder), r):
                    yield p + (remainder,)

The example from the OP:

>>> pprint(list(partitions_with_remainder(range(1, 6), 2)))
[((1, 2, 3, 4, 5),),
 ((4, 5), (1, 2, 3)),
 ((3, 5), (1, 2, 4)),
 ((3, 4), (1, 2, 5)),
 ((2, 5), (1, 3, 4)),
 ((2, 4), (1, 3, 5)),
 ((2, 3), (1, 4, 5)),
 ((1, 5), (2, 3, 4)),
 ((1, 4), (2, 3, 5)),
 ((1, 3), (2, 4, 5)),
 ((1, 2), (3, 4, 5)),
 ((2, 3), (4, 5), (1,)),
 ((2, 4), (3, 5), (1,)),
 ((2, 5), (3, 4), (1,)),
 ((1, 3), (4, 5), (2,)),
 ((1, 4), (3, 5), (2,)),
 ((1, 5), (3, 4), (2,)),
 ((1, 2), (4, 5), (3,)),
 ((1, 4), (2, 5), (3,)),
 ((1, 5), (2, 4), (3,)),
 ((1, 2), (3, 5), (4,)),
 ((1, 3), (2, 5), (4,)),
 ((1, 5), (2, 3), (4,)),
 ((1, 2), (3, 4), (5,)),
 ((1, 3), (2, 4), (5,)),
 ((1, 4), (2, 3), (5,))]
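As a sanity check on the listing above, the number of distinct results can also be computed directly: choosing k unordered groups of size r out of n elements can be done in n! / ((r!)^k * k! * (n - k*r)!) ways. The sketch below sums that over k. The name count_results is made up for illustration, and it assumes, like the code above, that a remainder of exactly r elements is counted as one more group rather than as a separate result.

```python
from math import factorial

def count_results(n, r):
    """Count distinct results for n elements and groups of size r."""
    total = 0
    for k in range(n // r + 1):  # k is the number of groups of size r
        remainder = n - k * r
        if remainder == r:
            # Same result as k + 1 groups with an empty remainder; counted there.
            continue
        total += factorial(n) // (
            factorial(r) ** k * factorial(k) * factorial(remainder)
        )
    return total

print(count_results(5, 2))  # 26, the number of entries in the listing above
```

The same function gives 4 for both n = 4 and n = 3 with r = 2, which agrees with the two doctests in partitions_with_remainder.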
