
Combinations of items in a list using specific criteria

I am trying to find specific combinations of items in a list. The list is made up of x groups, each repeated y times. In this example x and y are both 3, but in practice they could be much larger. I want every combination that takes one item from each group, so that no group appears more than once in a given combination. It is easier to show an example of what I want.

Here is an example.

A = ['ST1_0.245', 'ST1_0.29', 'ST1_0.335', 'ST2_0.245', 'ST2_0.29', 'ST2_0.335', 'ST3_0.245', 'ST3_0.29', 'ST3_0.335']

So there are three groups, ST1, ST2, and ST3, each with 3 iterations: 0.245, 0.290, and 0.335.

I want to find the following combinations.

('ST1_0.245', 'ST2_0.245', 'ST3_0.245')
('ST1_0.245', 'ST2_0.245', 'ST3_0.29')
('ST1_0.245', 'ST2_0.245', 'ST3_0.335')
('ST1_0.245', 'ST2_0.29', 'ST3_0.245')
('ST1_0.245', 'ST2_0.29', 'ST3_0.29')
('ST1_0.245', 'ST2_0.29', 'ST3_0.335')
('ST1_0.245', 'ST2_0.335', 'ST3_0.245')
('ST1_0.245', 'ST2_0.335', 'ST3_0.29')
('ST1_0.245', 'ST2_0.335', 'ST3_0.335')
('ST1_0.29', 'ST2_0.245', 'ST3_0.245')
('ST1_0.29', 'ST2_0.245', 'ST3_0.29')
('ST1_0.29', 'ST2_0.245', 'ST3_0.335')
('ST1_0.29', 'ST2_0.29', 'ST3_0.245')
('ST1_0.29', 'ST2_0.29', 'ST3_0.29')
('ST1_0.29', 'ST2_0.29', 'ST3_0.335')
('ST1_0.29', 'ST2_0.335', 'ST3_0.245')
('ST1_0.29', 'ST2_0.335', 'ST3_0.29')
('ST1_0.29', 'ST2_0.335', 'ST3_0.335')
('ST1_0.335', 'ST2_0.245', 'ST3_0.245')
('ST1_0.335', 'ST2_0.245', 'ST3_0.29')
('ST1_0.335', 'ST2_0.245', 'ST3_0.335')
('ST1_0.335', 'ST2_0.29', 'ST3_0.245')
('ST1_0.335', 'ST2_0.29', 'ST3_0.29')
('ST1_0.335', 'ST2_0.29', 'ST3_0.335')
('ST1_0.335', 'ST2_0.335', 'ST3_0.245')
('ST1_0.335', 'ST2_0.335', 'ST3_0.29')
('ST1_0.335', 'ST2_0.335', 'ST3_0.335')

Note that ST1, ST2, and ST3 each appear exactly once in every combination.

Here is code that I got to work, at least for small cases.

import itertools
import numpy as np

comb = []
gr_list=['ST1','ST2','ST3']
for itr in itertools.combinations(A, len(gr_list)):
    # pdb.set_trace()
    for n in np.arange(len(gr_list)):
        if sum(itr[n].split('_')[0] in s for s in itr) > 1:
            break
    
    if n == len(gr_list)-1:
        comb.append(itr)

This works for the few small examples I tested. When I tried larger values I got more results than I expected, though that could be a mistake in my calculation of the expected count. Either way, it takes far too long. Is there a faster way to do this?

I do have the group names and the values separately to begin with. As I write this, that feels like a better starting point, but I am not sure how to do that either.

You can use itertools.product for this. It returns an iterator rather than a list, which is generally more efficient if you are iterating through the results rather than building the whole collection. The number of elements in the iterator is the product of the lengths of the different categories.
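As a rough sketch of that idea, assuming the group names and the values are available as two separate lists (the names `group_names` and `values` are hypothetical, chosen only for illustration), the labels can be built per group and handed to itertools.product. The two counts at the end compare the number of results with the number of candidates that itertools.combinations would have to generate and filter; `math.prod` and `math.comb` need Python 3.8 or later.

```python
import itertools
import math

# Hypothetical inputs: the question says the names and values exist separately.
group_names = ['ST1', 'ST2', 'ST3']
values = ['0.245', '0.29', '0.335']

# One list of labels per group, in the order the groups are given.
groups = [[f"{g}_{v}" for v in values] for g in group_names]

# product picks exactly one label from each group and yields tuples lazily.
combos = itertools.product(*groups)
first = next(combos)

# The number of results is the product of the group sizes (3 * 3 * 3 here).
n_product = math.prod(len(g) for g in groups)

# For comparison: how many candidate tuples combinations() generates
# before any filtering, i.e. "choose 3 out of all 9 labels".
n_candidates = math.comb(sum(len(g) for g in groups), len(groups))
```

With x groups of y values each, the product approach yields y**x tuples directly, while the filtering approach has to walk through every x-element subset of all x*y labels, which grows much faster.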

Create the groups as required and then use itertools.product on the groups:

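import itertools

A = ['ST1_0.245', 'ST1_0.29', 'ST1_0.335', 
     'ST2_0.245', 'ST2_0.29', 'ST2_0.335', 
     'ST3_0.245', 'ST3_0.29', 'ST3_0.335']

# Unique group prefixes ('ST1', 'ST2', ...); a set has no guaranteed order,
# so the order of the groups in the output can vary.
prefixes = set(s.split("_")[0] for s in A)
# One list per prefix, holding every item of A that belongs to that group.
groups = [[a for a in A if a.split("_")[0]==p] for p in prefixes]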

>>> list(itertools.product(*groups))

[('ST2_0.245', 'ST3_0.245', 'ST1_0.245'),
 ('ST2_0.245', 'ST3_0.245', 'ST1_0.29'),
 ('ST2_0.245', 'ST3_0.245', 'ST1_0.335'),
 ('ST2_0.245', 'ST3_0.29', 'ST1_0.245'),
 ('ST2_0.245', 'ST3_0.29', 'ST1_0.29'),
 ('ST2_0.245', 'ST3_0.29', 'ST1_0.335'),
 ('ST2_0.245', 'ST3_0.335', 'ST1_0.245'),
 ('ST2_0.245', 'ST3_0.335', 'ST1_0.29'),
 ('ST2_0.245', 'ST3_0.335', 'ST1_0.335'),
 ('ST2_0.29', 'ST3_0.245', 'ST1_0.245'),
 ('ST2_0.29', 'ST3_0.245', 'ST1_0.29'),
 ('ST2_0.29', 'ST3_0.245', 'ST1_0.335'),
 ('ST2_0.29', 'ST3_0.29', 'ST1_0.245'),
 ('ST2_0.29', 'ST3_0.29', 'ST1_0.29'),
 ('ST2_0.29', 'ST3_0.29', 'ST1_0.335'),
 ('ST2_0.29', 'ST3_0.335', 'ST1_0.245'),
 ('ST2_0.29', 'ST3_0.335', 'ST1_0.29'),
 ('ST2_0.29', 'ST3_0.335', 'ST1_0.335'),
 ('ST2_0.335', 'ST3_0.245', 'ST1_0.245'),
 ('ST2_0.335', 'ST3_0.245', 'ST1_0.29'),
 ('ST2_0.335', 'ST3_0.245', 'ST1_0.335'),
 ('ST2_0.335', 'ST3_0.29', 'ST1_0.245'),
 ('ST2_0.335', 'ST3_0.29', 'ST1_0.29'),
 ('ST2_0.335', 'ST3_0.29', 'ST1_0.335'),
 ('ST2_0.335', 'ST3_0.335', 'ST1_0.245'),
 ('ST2_0.335', 'ST3_0.335', 'ST1_0.29'),
 ('ST2_0.335', 'ST3_0.335', 'ST1_0.335')]
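The tuples in that output start with ST2 because a set has no guaranteed order. If the order of the groups matters, one possible variation (a sketch, not part of the original answer) is to collect the groups in a dict, which preserves insertion order in Python 3.7 and later. It also walks the list only once instead of once per prefix.

```python
import itertools

A = ['ST1_0.245', 'ST1_0.29', 'ST1_0.335',
     'ST2_0.245', 'ST2_0.29', 'ST2_0.335',
     'ST3_0.245', 'ST3_0.29', 'ST3_0.335']

# Group labels by prefix; dict keys stay in first-seen order (Python 3.7+).
groups = {}
for label in A:
    prefix = label.split("_", 1)[0]
    groups.setdefault(prefix, []).append(label)

# One item from each group, with groups in the order they first appear in A.
result = list(itertools.product(*groups.values()))
```

Here `result` has the same 27 tuples, with the ST1 item first in each tuple as in the question's listing.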

