Random partition of a list with no repeated items

Question

I have an array that contains every number of a set n times. Example for n=2:

[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]

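What I want is a partition of this array in which every member of the partition

contains elements drawn at random from the array,
contains no repeated elements, and
contains the same number of elements k (up to rounding).

Example output for k=4: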

[[3,0,2,1], [0,1,4,2], [3,4]]

Invalid output for k=4:

[[3,0,2,2], [3,1,4,0], [1,4]]

(This is a partition, but its first member contains a repeated element.)
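
To make the three requirements concrete, here is a minimal checker sketch. The name is_valid_partition is hypothetical, and it assumes that "up to rounding" means every member has exactly k elements except possibly the last one, which may be shorter.

```python
from collections import Counter

def is_valid_partition(parts, seq, k):
    """Check a candidate partition against the requirements above.

    Assumption: only the last member may hold fewer than k elements.
    """
    if not parts:
        return not seq
    # Together, the members must use exactly the elements of seq.
    same_items = Counter(x for p in parts for x in p) == Counter(seq)
    # No member may contain the same value twice.
    no_repeats = all(len(set(p)) == len(p) for p in parts)
    # All members but the last have k elements; the last has 1..k.
    sizes_ok = (all(len(p) == k for p in parts[:-1])
                and 0 < len(parts[-1]) <= k)
    return same_items and no_repeats and sizes_ok

arr = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
valid = is_valid_partition([[3, 0, 2, 1], [0, 1, 4, 2], [3, 4]], arr, 4)
invalid = is_valid_partition([[3, 0, 2, 2], [3, 1, 4, 0], [1, 4]], arr, 4)
```

With the two example outputs above, valid is True and invalid is False.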

What is the most efficient way to achieve this?

Answer 1

You can use a combination of collections.Counter and random.sample:

from collections import Counter
import random

def random_partition(seq, k):
    cnts = Counter(seq)
    # as long as there are enough items to "sample" take a random sample
    while len(cnts) >= k:
        sample = random.sample(list(cnts), k)
        cnts -= Counter(sample)
        yield sample

    # Fewer different items than the sample size, just return the unique
    # items until the Counter is empty
    while cnts:
        sample = list(cnts)
        cnts -= Counter(sample)
        yield sample
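
The loop conditions in random_partition rely on one property of Counter: in-place subtraction drops every key whose count falls to zero or below, so len(cnts) is always the number of distinct values that are still available. A small illustration with made-up data:

```python
from collections import Counter

cnts = Counter([0, 1, 2, 0, 1, 2])   # every value occurs twice

cnts -= Counter([0, 1])              # use up one 0 and one 1
after_first = sorted(cnts.items())   # [(0, 1), (1, 1), (2, 2)]

cnts -= Counter([0, 1])              # 0 and 1 reach zero and are removed
after_second = sorted(cnts.items())  # [(2, 2)]
remaining_distinct = len(cnts)       # 1
```

This is also why the second loop can yield list(cnts) directly: each key appears at most once in that list, so the sample cannot contain a repeat.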

random_partition is a generator that yields the samples, so you can simply convert its result to a list:

>>> l = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]

>>> list(random_partition(l, 4))
[[1, 0, 2, 4], [1, 0, 2, 3], [3, 4]]

>>> list(random_partition(l, 2))
[[1, 0], [3, 0], [1, 4], [2, 3], [4, 2]]

>>> list(random_partition(l, 6))
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]

>>> list(random_partition(l, 4))
[[4, 1, 0, 3], [1, 3, 4, 0], [2], [2]]

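The last case shows that this approach can give odd results when the "random" part of the function returns the "wrong" samples: both copies of 2 were left over, so they ended up in two separate one-element groups. If that must not happen, or at least should not happen often, you need to figure out how to weight the samples (for example with random.choices) to minimize that likelihood.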
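
One possible way to rule out that outcome is sketched below. It is not the weighting idea mentioned above but a swapped-in greedy variant: each group always takes the values with the highest remaining counts, and only ties are broken at random. The name random_partition_balanced is hypothetical, and the price is that the result is less random than with plain random.sample.

```python
from collections import Counter
import random

def random_partition_balanced(seq, k):
    """Yield groups of up to k distinct items, most frequent values first."""
    cnts = Counter(seq)
    while cnts:
        items = list(cnts)
        # Shuffle first, then sort stably by remaining count (descending),
        # so that values with equal counts end up in random order.
        random.shuffle(items)
        items.sort(key=lambda item: -cnts[item])
        sample = items[:k]
        # Shuffle again so the order inside a group does not reveal counts.
        random.shuffle(sample)
        cnts -= Counter(sample)
        yield sample

l = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
parts = list(random_partition_balanced(l, 4))
sizes = [len(p) for p in parts]
```

For the example list and k=4 the group sizes are always 4, 4, 2: the value skipped by the first group is the only one with count 2 afterwards, so the second group is forced to include it.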

Solution 1
2 Accepted 2017-04-12 00:38:15
