
Efficient enumeration of ordered subsets in Python

I'm not sure of the appropriate mathematical terminology for the code I'm trying to write. I'd like to generate combinations of unique integers, where "ordered subsets" of each combination are used to exclude certain later combinations.

Hopefully an example will make this clear:

from itertools import chain, combinations

mylist = range(4)
max_depth = 3

# Combinations of every size from max_depth down to 1, longest first
rev = chain.from_iterable(combinations(mylist, i) for i in range(max_depth, 0, -1))
for el in rev:
    print(el)

That code results in output that contains all the subsets I want, but also some extra ones that I do not. I have manually inserted comments to indicate which elements I don't want.

(0, 1, 2)
(0, 1, 3)
(0, 2, 3)
(1, 2, 3)
(0, 1)  # Exclude: (0, 1, _) occurs as part of (0, 1, 2) above
(0, 2)  # Exclude: (0, 2, _) occurs above
(0, 3)  # Keep
(1, 2)  # Exclude: (1, 2, _) occurs above
(1, 3)  # Keep: (_, 1, 3) occurs above, but (1, 3, _) does not
(2, 3)  # Keep
(0,)    # Exclude: (0, _, _) occurs above
(1,)    # Exclude: (1, _, _) occurs above
(2,)    # Exclude: (2, _) occurs above
(3,)    # Keep

Thus, the desired output of my generator or iterator would be:

(0, 1, 2)
(0, 1, 3)
(0, 2, 3)
(1, 2, 3)
(0, 3)
(1, 3)
(2, 3)
(3,)  

I know I could make a list of all the (wanted and unwanted) combinations and then filter out the ones I don't want, but I was wondering if there was a more efficient, generator or iterator based way.
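For reference, the list-and-filter approach described above might look like the sketch below. It is only a baseline, written for Python 3; the names filtered_combinations and seen_prefixes are illustrative and do not come from the original post. It remembers every proper prefix of each tuple it emits, so its memory use grows with the size of the output.

```python
from itertools import chain, combinations

def filtered_combinations(items, max_depth):
    """Yield combinations, longest first, skipping any combination that is
    a prefix of one yielded earlier (brute-force baseline)."""
    items = list(items)
    seen_prefixes = set()  # hypothetical helper: prefixes of emitted tuples
    sizes = range(max_depth, 0, -1)
    for comb in chain.from_iterable(combinations(items, k) for k in sizes):
        if comb in seen_prefixes:
            continue
        # Record every proper, non-empty prefix of the tuple being emitted.
        for end in range(1, len(comb)):
            seen_prefixes.add(comb[:end])
        yield comb

result = list(filtered_combinations(range(4), 3))
for el in result:
    print(el)
```

For range(4) and max_depth = 3 this should reproduce the eight tuples listed above, in the same order.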

You are trying to exclude any combination that is a prefix of a previously-returned combination. Doing so is straightforward.

  • If a tuple t has length max_depth, it can't be a prefix of a previously-returned tuple, since any tuple it's a prefix of would have to be longer.
  • If a tuple t ends with mylist[-1], then it can't be a prefix of a previously-returned tuple, since there are no elements that could legally be added to the end of t to extend it.
  • If a tuple t has length less than max_depth and does not end with mylist[-1], then t is a prefix of the previously-returned tuple t + (mylist[-1],), and t should not be returned.

Thus, the combinations you should generate are exactly the ones of length max_depth and the shorter ones that end with mylist[-1]. The following code does so, in exactly the same order as your original code, and correctly handles cases like max_depth > len(mylist):

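from itertools import combinations

def nonprefix_combinations(iterable, maxlen):
    iterable = list(iterable)
    if not (iterable and maxlen):
        return
    # Every full-length combination is kept.
    for comb in combinations(iterable, maxlen):
        yield comb
    # Shorter combinations are kept only if they end with the last element.
    for length in range(maxlen-2, -1, -1):
        for comb in combinations(iterable[:-1], length):
            yield comb + (iterable[-1],)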

(I've assumed here that in the case where max_depth == 0, you still don't want to include the empty tuple in your output, even though in that case it isn't a prefix of a previously-returned tuple. If you do want the empty tuple in this case, you can change if not (iterable and maxlen) to if not iterable.)
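One way to gain some confidence in the rule above is to compare it against a direct, quadratic-time filter on small inputs. The sketch below is written for Python 3 and restates the function so that the snippet is self-contained; brute_force and agree are hypothetical helper names introduced here, and the input sizes are arbitrary.

```python
from itertools import combinations

def nonprefix_combinations(iterable, maxlen):
    # Python 3 rendering of the function above.
    iterable = list(iterable)
    if not (iterable and maxlen):
        return
    yield from combinations(iterable, maxlen)
    for length in range(maxlen - 2, -1, -1):
        for comb in combinations(iterable[:-1], length):
            yield comb + (iterable[-1],)

def brute_force(iterable, maxlen):
    # Hypothetical reference: keep a combination unless it is a proper
    # prefix of a combination that was already kept.
    items = list(iterable)
    kept = []
    for size in range(maxlen, 0, -1):
        for comb in combinations(items, size):
            is_prefix = any(
                len(prev) > size and prev[:size] == comb for prev in kept
            )
            if not is_prefix:
                kept.append(comb)
    return kept

def agree(max_n=6, max_len=8):
    # Compare both approaches, including cases where maxlen > len(items).
    return all(
        list(nonprefix_combinations(range(n), k)) == brute_force(range(n), k)
        for n in range(max_n)
        for k in range(max_len)
    )

print(agree())
```

The comparison uses list equality, so it checks the order of the results as well as their contents. It assumes the input elements are unique, as in the question.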

I noticed an interesting pattern in your desired output and I have a generator that produces that. Does this work for all your cases?

from itertools import combinations

def orderedSetCombination(iterable, r):
    # Accept any iterable, not just sequences that support slicing
    items = list(iterable)
    # Get the last element as a 1-tuple
    last = (items[-1],)
    # yield all the r-combinations of the items without the
    # last element
    for comb in combinations(items[:-1], r):
        yield comb
    # while r > 1, reduce r by 1 and yield all the r-combinations
    # of the other items, each followed by the last element
    while r > 1:
        r -= 1
        for comb in combinations(items[:-1], r):
            yield comb + last
    # yield the last item on its own
    yield last

mylist = [0, 1, 2, 3]

for el in orderedSetCombination(mylist, 3):
    print(el)

Here is my explanation of the logic:

# All combinations that do not include the last element of the iterable,
# taking r = max_depth items at a time

(0,1,2) 

# from here on, it's the combinations of all the elements except
# the last element, with the last element appended to each.
# so here we take r = r - 1 items at a time and add the last element:
# combinations([0,1,2], r=2)

(0,1,3)
(0,2,3)
(1,2,3)

# the only possible value right now at index r = 2 is the last element (3)
# since all possible values of (0,1,_) (0,2,_) (1,2,_) are already listed
# So reduce r by 1 again and continue: combinations([0,1,2], r=1)

(0, 3)
(1, 3)
(2, 3)

# continue until r == 0 and then yield the last element

(3,)
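As a partial check on the "does this work for all your cases" question, the sketch below compares this pattern-based generator with the prefix rule from the other answer (keep tuples of full length, plus shorter ones that end with the last element) on a five-element input. It is written for Python 3; ordered_set_combination is a restatement of the generator above, and rule_based is a hypothetical helper introduced only for the comparison.

```python
from itertools import combinations

def ordered_set_combination(items, r):
    # Restatement of the pattern-based generator above.
    items = list(items)
    last = (items[-1],)
    yield from combinations(items[:-1], r)
    while r > 1:
        r -= 1
        for comb in combinations(items[:-1], r):
            yield comb + last
    yield last

def rule_based(items, r):
    # Hypothetical reference: full-length combinations, plus shorter
    # combinations that end with the last element, longest first.
    items = list(items)
    return [
        comb
        for k in range(r, 0, -1)
        for comb in combinations(items, k)
        if k == r or comb[-1] == items[-1]
    ]

a = list(ordered_set_combination(range(5), 3))
b = rule_based(range(5), 3)
same_members = sorted(a) == sorted(b)  # same tuples overall
same_order = a == b                    # same tuples in the same order
print(same_members, same_order)
```

On this input the two approaches should produce the same tuples, but not in the same order: the pattern-based generator emits every full-length tuple without the last element before any tuple that contains it. If the order of the original enumeration matters, that difference is worth keeping in mind.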
