List all combinations of a items into b groups of size c

I'm looking for a way in Python to split an arbitrary number of items into an arbitrary number of even groups, and to obtain the list/array of all these splits.

So for example, given 12 items, there are 5775 ways of grouping them into 3 groups of 4. Calculating this is not an issue, but I can't seem to find a way to return a list or array of these 5775. I can get the first groups using:

import itertools
list(itertools.combinations(range(12), 4))

But how can I obtain the remaining groups from this?

The desired output for a = 4, b = 2, c = 2 would be:

[[[1, 2], [3, 4]],
 [[1, 3], [2, 4]],
 [[1, 4], [2, 3]]]

And for a = 3, b = 3, c = 1:

[[[1], [2], [3]]]

You can use a recursive generator that, at each level of recursion, computes all combinations of the (remaining) items. The next level of recursion receives only those items that have not been used in the current level already (the remainder).

To prevent duplicates that differ only in the order of the groups, we need to truncate the output of it.combinations so that it doesn't yield a combination that has already appeared in the remainder of a previous iteration. Let n be the number of items and g the size of each group. Then the first item from it.combinations is (0, 1, ..., g-1) (in terms of the indices). The item (1, 2, ..., g) will be part of the remainder when the current item from it.combinations is (0, g+1, g+2, ..., 2*g-1) (this assumes n % g == 0). Hence, we need to truncate the output of it.combinations such that the first element is fixed (0 in the above example). Because it.combinations produces the items in lexicographical order, this covers the first (n-1)! / (n-g)! / (g-1)! items (! denotes the factorial).
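As a quick sanity check of the truncation argument, the following sketch (with small illustrative values n = 6 and g = 2, which are my own choice) confirms that the combinations containing the first element form a prefix of the output, and that this prefix has (n-1)! / ((n-g)! * (g-1)!) entries, which is the binomial coefficient C(n-1, g-1). It assumes Python 3.8+ for math.comb.

```python
import itertools as it
from math import comb

n, g = 6, 2  # illustrative values, not taken from the question
combos = list(it.combinations(range(n), g))

# (n-1)! / ((n-g)! * (g-1)!) is the binomial coefficient C(n-1, g-1).
count = comb(n - 1, g - 1)

# Lexicographic order puts every combination starting with 0 first.
assert all(c[0] == 0 for c in combos[:count])
assert all(c[0] != 0 for c in combos[count:])

print(count, len(combos))
```

Only the combinations in that prefix need to be considered as the first group; every other combination turns up later inside some remainder.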

The following is an example implementation:

import itertools as it
from math import factorial
from typing import Iterator, Sequence, Tuple, TypeVar


T = TypeVar('T')


def group_items(items: Sequence[T], group_size: int) -> Iterator[Tuple[Tuple[T, ...], ...]]:
    if len(items) % group_size != 0:
        raise ValueError(
            f'Number of items is not a multiple of the group size '
            f'({len(items)} and {group_size})'
        )
    elif len(items) == group_size:
        yield (tuple(items),)
    elif items:
        count, _r = divmod(
            factorial(len(items) - 1),
            factorial(len(items) - group_size) * factorial(group_size - 1)
        )
        assert _r == 0
        for group in it.islice(it.combinations(items, group_size), count):
            remainder = [x for x in items if x not in group]  # maintain order
            yield from (
                (group, *others)
                for others in group_items(remainder, group_size)
            )


result = list(group_items(range(12), 4))
print(len(result))

from pprint import pprint
pprint(result[:3])
pprint(result[-3:])

Note that the above example uses remainder = [x for x in items if x not in group] to compute which items should go to the next level of recursion. This might be inefficient if your group size is large. Instead you could also use a set (if your items are hashable). Also, if equality comparison (==) between your items is expensive, it would be better to work with indices rather than with the items and compute the group and remainder from these indices. I didn't include these aspects in the above code snippet in order to keep it simple, but if you are interested in the details, I can expand my answer.
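One possible way to do the index-based variant is sketched below. This is only an illustration, not the original implementation: the name group_items_by_index is hypothetical, and instead of truncating with islice it pins the first remaining index and chooses the other group_size - 1 members, which enumerates the same groupings. Membership tests run against a set of ints, so the items themselves are never compared or hashed.

```python
import itertools as it
from typing import Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar('T')


def group_items_by_index(items: Sequence[T], group_size: int) -> Iterator[Tuple[Tuple[T, ...], ...]]:
    """Hypothetical index-based variant; items are only looked up, never compared."""
    if group_size <= 0 or len(items) % group_size != 0:
        raise ValueError(
            f'Number of items is not a multiple of the group size '
            f'({len(items)} and {group_size})'
        )
    if not items:
        return  # mirror the original: no groupings for an empty input

    def _recurse(indices: List[int]) -> Iterator[Tuple[Tuple[int, ...], ...]]:
        if not indices:
            yield ()
            return
        # Pin the smallest remaining index to avoid reordered duplicates.
        first, rest = indices[0], indices[1:]
        for tail in it.combinations(rest, group_size - 1):
            used = set(tail)  # set of ints: cheap membership test
            remainder = [i for i in rest if i not in used]  # keeps order
            for others in _recurse(remainder):
                yield ((first, *tail), *others)

    for grouping in _recurse(list(range(len(items)))):
        # Map indices back to the actual items only at the very end.
        yield tuple(tuple(items[i] for i in group) for group in grouping)


print(len(list(group_items_by_index(range(12), 4))))
```

Because only indices are compared, this also works for unhashable items or items with an expensive ==.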

Not sure if there's a smarter or more concise way, but you can create a recursive function that picks combinations for the first sublist, then picks combinations from the items not yet used. Also, since the order of the items within the sublists and the order of the sublists themselves does not seem to matter, the first sublist will always start with the smallest element (otherwise it would not be the first sublist), the second starts with the smallest of the remaining items, etc. This cuts down on the number of combinations and prevents any duplicate results from appearing.

from itertools import combinations

def split(items, b, c):
    assert len(items) == b * c
    def _inner(remaining, groups):
        if len(groups) == b:
            yield groups
        else:
            first, *rest = (x for x in remaining if not groups or x not in groups[-1])
            for comb in combinations(rest, c-1):
                yield from _inner(rest, groups + [{first, *comb}])
    return _inner(items, [])

for x in split(list(range(6)), 2, 3):
    print(x)

Sample output (the groups are sets here, but you can convert them to lists before yielding):

[{0, 1, 2}, {3, 4, 5}]
[{0, 1, 3}, {2, 4, 5}]
[{0, 1, 4}, {2, 3, 5}]
[{0, 1, 5}, {2, 3, 4}]
[{0, 2, 3}, {1, 4, 5}]
[{0, 2, 4}, {1, 3, 5}]
[{0, 2, 5}, {1, 3, 4}]
[{0, 3, 4}, {1, 2, 5}]
[{0, 3, 5}, {1, 2, 4}]
[{0, 4, 5}, {1, 2, 3}]

For (a,b,c) = (12, 3, 4) it yields 5775 elements, as expected. For longer lists, this will still take a lot of time, though.
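The counts quoted in this thread can be cross-checked against the closed form (b*c)! / ((c!)^b * b!): divide out the c! orderings inside each of the b groups and the b! orderings of the groups themselves. A minimal sketch (the helper name count_groupings is just for illustration):

```python
from math import factorial


def count_groupings(b: int, c: int) -> int:
    """Number of ways to split b*c distinct items into b unordered groups of size c."""
    return factorial(b * c) // (factorial(c) ** b * factorial(b))


print(count_groupings(3, 4))  # 12 items, 3 groups of 4
print(count_groupings(2, 3))  # 6 items, 2 groups of 3
```

The second value matches the ten lines of sample output listed above, and the first is the 5775 from the question.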

Use the set_partitions() function from the more-itertools package:

# pip install more-itertools
from more_itertools import set_partitions
a, b, c = 12, 3, 4

results = []
for part in set_partitions(range(a), b):
    if all([len(p) == c for p in part]):
        results.append(part)

print(len(results))  # 5775

Some of the 5775 results:

...
[[2, 4, 5, 9], [0, 1, 7, 10], [3, 6, 8, 11]]
[[1, 4, 5, 9], [0, 2, 7, 10], [3, 6, 8, 11]]
[[0, 4, 5, 9], [1, 2, 7, 10], [3, 6, 8, 11]]
[[2, 3, 5, 9], [1, 4, 7, 10], [0, 6, 8, 11]]
[[2, 3, 5, 9], [0, 4, 7, 10], [1, 6, 8, 11]]
...

In case you want to know what it does: set_partitions(range(4), 2) yields the set partitions of [0, 1, 2, 3] into 2 parts:

[[0], [1, 2, 3]], 
[[0, 1], [2, 3]], 
[[1], [0, 2, 3]], 
[[0, 1, 2], [3]], 
[[1, 2], [0, 3]], 
[[0, 2], [1, 3]], 
[[2], [0, 1, 3]]
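Keep in mind that this approach generates every partition into b parts and then discards those with uneven sizes. The number of partitions generated is the Stirling number of the second kind S(a, b), which can be computed with the standard recurrence. The sketch below uses only the standard library (the function name stirling2 is my own) to show how many candidates are scanned:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Number of ways to partition n distinct items into k non-empty parts."""
    if n == k:
        return 1  # includes S(0, 0) = 1
    if k == 0 or k > n:
        return 0
    # The n-th item either joins one of k existing parts or forms a new part.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


print(stirling2(4, 2))   # the 7 partitions listed above
print(stirling2(12, 3))  # candidates scanned to find the 5775 even ones
```

For a, b, c = 12, 3, 4 that is 86526 candidates for 5775 results, so the filter is convenient but does more work than the recursive answers as the input grows.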
