Python combinations without repetitions

Question

I have a list of numbers and I want to make combinations from it. If I have the list:

import itertools

t = [2,2,2,2,4]
c = list(itertools.combinations(t, 4))

The result is:

(2, 2, 2, 2)
(2, 2, 2, 4)
(2, 2, 2, 4)
(2, 2, 2, 4)
(2, 2, 2, 4)

But I want to get:

(2, 2, 2, 2)
(2, 2, 2, 4)

Is it possible to eliminate the duplicates other than by making a new list and going through the first list?

Answer 1

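As Donkey Kong pointed out with set, you can get the unique values in a list by converting the list to a set:

import itertools

t = [2,2,2,2,4]
c = list(itertools.combinations(t, 4))
unq = set(c)
print(unq)

The result will be: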

{(2, 2, 2, 4), (2, 2, 2, 2)}

If you want to use it as a list, you can convert it back by doing:

result = list(unq)

Another, cleaner way is to build the set directly:

import itertools

t = [2,2,2,2,4]
c = set(itertools.combinations(t, 4))
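
One thing to keep in mind is that a set has no defined order. If the order in which combinations first appear matters, a possible variation (a small sketch, not part of the original answer) is to deduplicate with dict.fromkeys, which keeps the first occurrence of each key in insertion order on Python 3.7+. The name unique_in_order is only illustrative.

```python
import itertools

t = [2, 2, 2, 2, 4]

# dict keys are unique and remember insertion order (Python 3.7+),
# so this drops repeated tuples while keeping first-seen order.
unique_in_order = list(dict.fromkeys(itertools.combinations(t, 4)))

print(unique_in_order)  # [(2, 2, 2, 2), (2, 2, 2, 4)]
```

Like the set approach, this still requires the elements of t to be hashable.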

Answer 2

I know this is late, but I want to add a point.

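set(itertools.combinations(t, 4)) does a fine job in most cases, but internally it still iterates over all the repeated combinations, so it can be computationally heavy. This is especially true when there are not many actually unique combinations.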
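
To get a feel for how much work is thrown away, here is a rough sketch that counts how many tuples itertools.combinations generates versus how many distinct ones survive in the set. The input list is a made-up example with many equal values, and math.comb needs Python 3.8+.

```python
import itertools
import math

t = [2] * 20 + [4]   # hypothetical input: twenty 2s and a single 4
r = 10

generated = 0
unique = set()
for combo in itertools.combinations(t, r):
    generated += 1      # every positional combination is visited
    unique.add(combo)   # but only distinct tuples are kept

print(generated == math.comb(len(t), r))  # True: C(21, 10) tuples were produced
print(len(unique))                        # 2
```

Here hundreds of thousands of tuples are produced just to keep two of them.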

The following generator iterates only over the unique combinations:

from itertools import chain,repeat,count,islice
from collections import Counter

def combinations_without_repetition(r, iterable=None, values=None, counts=None):
    if iterable:
        values, counts = zip(*Counter(iterable).items())

    f = lambda i,c: chain.from_iterable(map(repeat, i, c))
    n = len(counts)
    indices = list(islice(f(count(),counts), r))
    if len(indices) < r:
        return
    while True:
        yield tuple(values[i] for i in indices)
        for i,j in zip(reversed(range(r)), f(reversed(range(n)), reversed(counts))):
            if indices[i] != j:
                break
        else:
            return
        j = indices[i]+1
        for i,j in zip(range(i,r), f(count(j), counts[j:])):
            indices[i] = j

Usage:

>>> t = [2,2,2,2,4]
# elements in t must be hashable
>>> list(combinations_without_repetition(4, iterable=t)) 
[(2, 2, 2, 2), (2, 2, 2, 4)]

# You can pass values and counts separately. For this usage, values don't need to be hashable
# Say you have ['a','b','b','c','c','c'], then since there is 1 of 'a', 2 of 'b', and 3 of 'c', you can do as follows:
>>> list(combinations_without_repetition(3, values=['a','b','c'], counts=[1,2,3]))
[('a', 'b', 'b'), ('a', 'b', 'c'), ('a', 'c', 'c'), ('b', 'b', 'c'), ('b', 'c', 'c'), ('c', 'c', 'c')]

# combinations_without_repetition() is a generator (and thus an iterator)
# so you can iterate it
>>> for comb in combinations_without_repetition(4, t):
...     print(sum(comb))
...
8   # 2+2+2+2
10  # 2+2+2+4

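Note that itertools.combinations() is implemented in C, which means that in most cases it is much faster than my Python script. This code is more efficient than the set(itertools.combinations()) approach only when there are more duplicate combinations than unique ones.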
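
For readers who find the index-based generator hard to follow, the same idea can be sketched recursively: count how often each value occurs, then decide how many copies of each distinct value to take. This is an illustrative alternative rather than the code from this answer; the name distinct_combos is hypothetical, it requires hashable elements, and it is likely slower than the iterative version.

```python
from collections import Counter
from itertools import combinations

def distinct_combos(iterable, r):
    """Yield each distinct r-length combination of values exactly once."""
    # (value, multiplicity) pairs in first-seen order
    items = list(Counter(iterable).items())

    def rec(pos, remaining):
        if remaining == 0:
            yield ()            # nothing left to pick: one empty tail
            return
        if pos == len(items):
            return              # ran out of values before filling r slots
        value, available = items[pos]
        # Take k copies of the current value, trying the most copies first.
        for k in range(min(available, remaining), -1, -1):
            for tail in rec(pos + 1, remaining - k):
                yield (value,) * k + tail

    yield from rec(0, r)

print(list(distinct_combos([2, 2, 2, 2, 4], 4)))
# [(2, 2, 2, 2), (2, 2, 2, 4)]
```

Because each distinct value is handled once, no duplicate tuple is ever generated, so no set is needed afterwards.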

Answer 3

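Technically, what you get are not actually duplicates; that is simply how itertools.combinations works, as you can see if you read the description on the linked page: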

itertools.combinations(iterable, r)

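Return r length subsequences of elements from the input iterable.

Combinations are emitted in lexicographic sort order. So, if the input iterable is sorted, the combination tuples will be produced in sorted order.

Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeat values in each combination.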

Demo:

>>> import itertools as it
>>> list(it.combinations([1,2,3,4,5], 4))
[(1, 2, 3, 4), (1, 2, 3, 5), (1, 2, 4, 5), (1, 3, 4, 5), (2, 3, 4, 5)]
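
To make the "unique by position" point concrete for the list in the question, one can combine the positions instead of the values and then look each position up. This is only an illustrative sketch; by_position is a name made up for the example.

```python
import itertools

t = [2, 2, 2, 2, 4]

# Combine index positions, then map every position back to its value.
by_position = [
    (idx, tuple(t[i] for i in idx))
    for idx in itertools.combinations(range(len(t)), 4)
]

for idx, values in by_position:
    print(idx, "->", values)
# (0, 1, 2, 3) -> (2, 2, 2, 2)
# (0, 1, 2, 4) -> (2, 2, 2, 4)
# (0, 1, 3, 4) -> (2, 2, 2, 4)
# (0, 2, 3, 4) -> (2, 2, 2, 4)
# (1, 2, 3, 4) -> (2, 2, 2, 4)
```

The five index tuples are all different, which is why itertools.combinations reports five results even though only two distinct value tuples exist.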

So, just as posted in the previous answer, set() will give you the unique values you want:

>>> t = [2,2,2,2,4]
>>> set(it.combinations(t, 4))
{(2, 2, 2, 4), (2, 2, 2, 2)}

Answer 4

This can now be done with the more-itertools package, which since version 8.7 has a function called distinct_combinations that does exactly this.

>>> from itertools import combinations
>>> t = [2,2,2,2,4]
>>> set(combinations(t, 4))
{(2, 2, 2, 2), (2, 2, 2, 4)}

>>> from more_itertools import distinct_combinations
>>> t = [2,2,2,2,4]
>>> list(distinct_combinations(t,4))
[(2, 2, 2, 2), (2, 2, 2, 4)]

As far as I can tell from my very limited testing, its performance is similar to that of the function written by @hahho.

4 answers

Answer 1
16 accepted 2016-04-05 14:48:26

Answer 2
15 2017-10-07 17:31:53

Answer 3
8 2016-04-05 15:01:00

Answer 4
7 2022-02-07 20:13:01
