
Grouping elements in a list such that each seed produces unique combination

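I have an unusual problem. Consider a list of elements [1,2,3,4,5,6]. I need to select certain sets of elements to form pools based on a pool size. That is, if the pool size is 3 and my list has 6 elements, there are 6C3 possible combinations in total. This can be done with a random sample.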
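
As a small illustration of the counting above, the sketch below uses only the Python standard library. The variable names are chosen here for illustration and are not part of the original question.

```python
from itertools import combinations
from math import comb
from random import sample

elements = [1, 2, 3, 4, 5, 6]
pool_size = 3

# Number of distinct unordered pools of size 3 drawn from 6 elements (6C3).
print(comb(len(elements), pool_size))  # 20

# Enumerating every pool explicitly is only practical for small lists.
all_pools = list(combinations(elements, pool_size))

# Draw a single pool at random, without replacement.
one_pool = sample(elements, pool_size)
print(one_pool)
```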

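But here is the catch: suppose I have a larger list, and I have to group all of its members so that every member appears in exactly one group within a single iteration (let's call it a seed). For the next seed I group the elements again into different combinations, and the combinations I get must differ from the ones I got in the previous seed.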

Example: the elements are [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18], my pool size is 3, and I want 3 seeds.

seed1-'[[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]'

seed2-'[[1,5,3],[4,2,6],[7,11,9],[10,8,12],[13,17,15],[16,14,18]]'

seed3-'[[1,5,6],[4,2,3],[7,11,12],[10,8,9],[13,17,18],[16,14,15]]'

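Note that every element must appear exactly once in each seed, and no combination may be repeated in any seed. This may look simple for a small set of elements, but I am trying to do it with 300 elements and 500 seeds.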
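
The two requirements can be written down as a small checker. This is only a sketch: the name `seeds_are_valid` is hypothetical, and it assumes that order inside a pool does not matter, so [1,2,3] and [3,2,1] count as the same combination.

```python
def seeds_are_valid(elements, seeds):
    """Return True if every seed uses each element once and no pool repeats."""
    seen = set()
    for seed in seeds:
        # Every element must appear exactly once within the seed.
        flat = [x for pool in seed for x in pool]
        if sorted(flat) != sorted(elements):
            return False
        for pool in seed:
            key = frozenset(pool)  # assumption: order inside a pool is ignored
            if key in seen:
                return False
            seen.add(key)
    return True

elements = list(range(1, 19))
seeds = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]],
    [[1, 5, 3], [4, 2, 6], [7, 11, 9], [10, 8, 12], [13, 17, 15], [16, 14, 18]],
    [[1, 5, 6], [4, 2, 3], [7, 11, 12], [10, 8, 9], [13, 17, 18], [16, 14, 15]],
]
print(seeds_are_valid(elements, seeds))  # True
```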

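If the number of seeds you generate does not come close to exhausting the possible combinations, you can shuffle the whole list of numbers and then break it into chunks of the pool size. Keep track of the pools used so far, and generate another shuffle whenever there is a collision.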

This can be done in a generator function, so you don't need to decide in advance how many seeds to generate:

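from random import sample

def seedPools(elements,poolSize):
    used = set()    # pools already emitted in earlier seeds
    while True:
        # shuffle the whole list, then cut it into poolSize-sized chunks
        shuffled = sample(elements,len(elements))
        seed     = [ tuple(shuffled[i:i+poolSize])
                     for i in range(0,len(elements),poolSize)]
        # on a collision with any earlier pool, try another shuffle
        if any(pool in used for pool in seed): continue
        yield seed
        used.update(seed)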

Sample output:

seedGen = seedPools(list(range(1,19)),3)
for _ in range(10): print(next(seedGen))
                 
    
[(9, 13, 18), (10, 15, 12), (14, 16, 2), (3, 1, 17), (4, 5, 8), (6, 11, 7)]
[(16, 7, 6), (14, 12, 5), (15, 4, 17), (3, 10, 8), (9, 11, 2), (13, 18, 1)]
[(6, 8, 7), (10, 9, 15), (2, 1, 14), (17, 18, 12), (11, 3, 4), (5, 16, 13)]
[(12, 1, 3), (6, 5, 16), (2, 14, 9), (7, 8, 15), (10, 13, 11), (17, 4, 18)]
[(7, 14, 12), (4, 10, 13), (9, 17, 5), (16, 3, 2), (1, 11, 18), (8, 15, 6)]
[(18, 13, 10), (12, 1, 14), (8, 6, 15), (3, 2, 5), (16, 9, 11), (17, 4, 7)]
[(6, 5, 12), (8, 2, 13), (1, 15, 14), (17, 10, 7), (3, 11, 4), (16, 9, 18)]
[(14, 6, 1), (11, 5, 18), (12, 10, 16), (8, 13, 17), (15, 9, 2), (4, 7, 3)]
[(10, 16, 11), (1, 15, 5), (4, 3, 12), (6, 14, 2), (17, 18, 9), (7, 8, 13)]
[(11, 10, 1), (12, 8, 14), (17, 13, 6), (18, 7, 5), (9, 15, 4), (3, 16, 2)]
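
One detail worth noting: the generator above stores pools as tuples, so (1, 2, 3) and (3, 2, 1) are treated as different pools. If pools are meant to be unordered combinations, as the question's wording suggests, a variant could key the `used` set on frozensets instead. The sketch below is an assumption about that intent rather than part of the original answer, and the name `seed_pools_unordered` is hypothetical. It also assumes the elements are distinct.

```python
from random import sample

def seed_pools_unordered(elements, pool_size):
    """Yield seeds whose pools never repeat, ignoring order within a pool."""
    used = set()
    while True:
        shuffled = sample(elements, len(elements))
        seed = [tuple(shuffled[i:i + pool_size])
                for i in range(0, len(elements), pool_size)]
        # Compare pools as sets so a reordering counts as a repeat.
        keys = [frozenset(pool) for pool in seed]
        if any(key in used for key in keys):
            continue  # collision with an earlier seed: reshuffle
        used.update(keys)
        yield seed

gen = seed_pools_unordered(list(range(1, 19)), 3)
for _ in range(3):
    print(next(gen))
```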

To generate 500 seeds from 300 items, you can use a list comprehension (or you could add a parameter to the generator function):

seedGen  = seedPools(list(range(1,301)),3)
bigSeeds = [ next(seedGen) for _ in range(500) ] # 0.14 second on my computer
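
Whether the retry approach is reasonable depends on how much of the space of possible pools is consumed. A rough check for the 300-element, 500-seed case, counting unordered pools (the variable names are illustrative):

```python
from math import comb

n_elements, pool_size, n_seeds = 300, 3, 500

possible_pools = comb(n_elements, pool_size)        # unordered pools of 3
pools_needed = n_seeds * (n_elements // pool_size)  # 100 pools per seed

print(possible_pools)                 # 4455100
print(pools_needed)                   # 50000
print(pools_needed / possible_pools)  # about 0.0112
```

Only about 1% of the possible pools are used, so collisions stay infrequent enough for reshuffling to be practical.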

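Note: I am assuming that you don't want any pool to be repeated across seeds. If pools are allowed to repeat, you only need to check if tuple(seed) in used: continue to try another shuffle, and, after yielding a unique seed, record the whole seed in the used set instead of the individual pools: used.add(tuple(seed))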
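
Put together, the variant described in this note might look like the following sketch. The function name `seed_pools_allow_repeats` is hypothetical. Only whole seeds are required to be unique here, so an individual pool may show up again in a later seed.

```python
from random import sample

def seed_pools_allow_repeats(elements, pool_size):
    """Yield seeds that are unique as a whole; individual pools may repeat."""
    used = set()
    while True:
        shuffled = sample(elements, len(elements))
        seed = [tuple(shuffled[i:i + pool_size])
                for i in range(0, len(elements), pool_size)]
        if tuple(seed) in used:
            continue  # this exact seed was already produced: reshuffle
        yield seed
        used.add(tuple(seed))  # record the whole seed, not its pools

gen = seed_pools_allow_repeats(list(range(1, 7)), 3)
for _ in range(3):
    print(next(gen))
```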

[EDIT] Here is the function adjusted to spread the extra values across random pools, so that all pools have at least the requested pool size:

from random import sample

def seedPools(elements,poolSize,seedCount):
    used = set()    # pools already emitted in earlier seeds
    while seedCount > 0:
        shuffled = sample(elements,len(elements))
        seed     = [ tuple(shuffled[i:i+poolSize])
                     for i in range(0,len(elements),poolSize)]
        # a short last pool holds the extra values: hand each one
        # to a different, randomly chosen full pool
        if len(seed[-1])<poolSize:
            lastPool = seed.pop(-1)
            spread   = sample(range(len(seed)),len(lastPool))
            for p,v in zip(spread,lastPool):
                seed[p] += (v,)
        if any(pool in used for pool in seed): continue
        yield seed
        used.update(seed)
        seedCount -= 1

for seed in seedPools(list(range(1,20)),3,10): print(seed)

[(14, 11, 18), (2, 12, 9), (8, 3, 4), (7, 6, 13), (16, 17, 15), (1, 19, 10, 5)]
[(9, 15, 11), (1, 3, 2), (14, 10, 19), (8, 5, 12, 7), (6, 16, 4), (18, 17, 13)]
[(19, 10, 18), (5, 9, 2), (4, 1, 6, 16), (13, 11, 15), (12, 8, 17), (14, 7, 3)]
[(1, 5, 15, 12), (2, 6, 14), (18, 8, 11), (16, 13, 19), (3, 17, 4), (10, 7, 9)]
[(7, 14, 8), (12, 18, 6, 2), (17, 9, 16), (15, 5, 3), (13, 11, 10), (4, 19, 1)]
[(15, 6, 17), (2, 10, 3), (7, 19, 16, 9), (4, 11, 5), (8, 13, 12), (14, 1, 18)]
[(1, 16, 2), (19, 14, 18), (12, 17, 5, 7), (13, 11, 4), (9, 6, 3), (15, 10, 8)]
[(2, 17, 19), (8, 13, 5, 18), (7, 11, 1), (16, 15, 6), (14, 12, 4), (10, 3, 9)]
[(3, 12, 7, 15), (4, 11, 2), (17, 1, 14), (19, 10, 6), (13, 9, 5), (18, 8, 16)]
[(13, 15, 12), (9, 1, 18), (5, 3, 10), (16, 4, 14), (11, 17, 7), (19, 2, 6, 8)]
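
The spreading step can also be looked at in isolation. In the function above, `sample(range(len(seed)), len(lastPool))` raises a ValueError when there are more leftover values than full pools (for example 5 elements with a pool size of 3). The helper below is a hypothetical sketch, not part of the original answer, that makes this limit explicit.

```python
from random import sample

def spread_leftover(seed, pool_size):
    """Move a short final pool's items into distinct, randomly chosen full pools."""
    if not seed or len(seed[-1]) >= pool_size:
        return seed  # nothing to spread
    *full, last = seed
    if len(last) > len(full):
        # Each full pool can absorb at most one extra value in this scheme.
        raise ValueError("not enough full pools to absorb the leftover items")
    for idx, value in zip(sample(range(len(full)), len(last)), last):
        full[idx] += (value,)
    return full

print(spread_leftover([(1, 2, 3), (4, 5, 6), (7,)], 3))
```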

