Python: Produce all possible sequence combinations from a list with a character limit
My question is exactly the same as this question. I have an array (list) of characters. I would like to get all possible sequence combinations from that list, but with a limit on the number of characters per element (for example, at most 2 characters). Further, no single character can be repeated within one combination:
chars = ['a', 'b', 'c', 'd']
# output
output = [['a', 'b', 'c', 'd'],
          ['ab', 'c', 'd'],
          ['a', 'bc', 'd'],
          ['a', 'b', 'cd'],
          ['ab', 'cd'],
          ['abc', 'd'],  # this one will be excluded
          ['a', 'bcd'],  # this one will be excluded
          ['abcd']]      # this one will be excluded
I know I can check the condition while generating and building the sequences and omit the over-limit combinations, but that adds run time. My goal is to reduce the existing execution time.
Without the character limit, the number of combinations grows as 2^(N-1). If the list has more than 15 characters, the program takes too long to run. Therefore I would like to reduce the number of combinations by applying the character limit.
The priority is performance. I have already researched and tried for two days without any success.
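To put the 2^(N-1) figure in perspective, the number of valid results can be counted without generating them. The sketch below is not part of the original question; `count_splits` is a hypothetical helper name, and it assumes that a result is any way of cutting the list into consecutive words of at most `maxlen` characters.

```python
def count_splits(n, maxlen=None):
    """Count the ways to cut n characters into consecutive words of length <= maxlen."""
    if maxlen is None:
        maxlen = n  # no limit: every word length is allowed
    # counts[k] = number of valid splits of the first k characters
    counts = [1] + [0] * n
    for k in range(1, n + 1):
        # the last word can have 1..maxlen characters (but no more than k)
        counts[k] = sum(counts[k - j] for j in range(1, min(maxlen, k) + 1))
    return counts[n]

print(count_splits(4))      # 8, i.e. 2**(4-1)
print(count_splits(4, 2))   # 5, the five rows kept in the example above
print(count_splits(15))     # 16384, i.e. 2**14
print(count_splits(15, 2))  # 987
```

With a limit of 2 the counts follow a Fibonacci-like recurrence, so the limit shrinks the output considerably, but the number of results still grows exponentially with the input length.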
One way to do it is to iterate over the input list and gradually build up the combinations. In each step, the next character is taken from the input list and added to the previously generated combinations.
from collections import defaultdict

def make_combinations(seq, maxlen):
    # memo is a dict of {length_of_last_word: list_of_combinations}
    memo = defaultdict(list)
    memo[1] = [[seq[0]]]  # put the first character into the memo

    seq_iter = iter(seq)
    next(seq_iter)  # skip the first character
    for char in seq_iter:
        new_memo = defaultdict(list)

        # iterate over the memo and expand it
        for wordlen, combos in memo.items():
            # add the current character as a separate word
            new_memo[1].extend(combo + [char] for combo in combos)

            # if the maximum word length isn't reached yet, add a character to the last word
            if wordlen < maxlen:
                word = combos[0][-1] + char

                new_memo[wordlen+1] = newcombos = []
                for combo in combos:
                    combo[-1] = word  # overwrite the last word with a longer one
                    newcombos.append(combo)

        memo = new_memo

    # flatten the memo into a list and return it
    return [combo for combos in memo.values() for combo in combos]
Output:
[['a', 'b', 'c', 'd'], ['ab', 'c', 'd'], ['a', 'bc', 'd'],
['a', 'b', 'cd'], ['ab', 'cd']]
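If the results only need to be consumed one at a time, the same pruning idea (never build a word longer than the limit) can also be written as a lazy recursive generator. This is an alternative sketch, not the code from the answer above: `iter_splits` is a hypothetical name, it has not been benchmarked against `make_combinations`, and it yields the results in a different order.

```python
def iter_splits(seq, maxlen, start=0):
    """Lazily yield every split of seq[start:] into words of at most maxlen characters."""
    if start == len(seq):
        yield []  # nothing left to split: one empty split
        return
    # choose the length of the first word, then split the remainder recursively
    for size in range(1, min(maxlen, len(seq) - start) + 1):
        word = ''.join(seq[start:start + size])
        for rest in iter_splits(seq, maxlen, start + size):
            yield [word] + rest

for combo in iter_splits(['a', 'b', 'c', 'd'], 2):
    print(combo)
```

Because over-limit words are never created, no filtering step is needed afterwards, and only one result is held in memory at a time unless the caller collects them into a list.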
This implementation is slower than the naive itertools.product approach for short inputs:
input: a b c d
maxlen: 2
iterations: 10000
itertools.product: 0.11653625800136069 seconds
make_combinations: 0.16573870600041118 seconds
But it pulls ahead quickly as the input list gets longer:
input: a b c d e f g h i j k
maxlen: 2
iterations: 10000
itertools.product: 6.9087735799985240 seconds
make_combinations: 1.2037671390007745 seconds
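The itertools.product code that was timed is not shown in the answer. The following is only a guess at what such a naive baseline might look like, assuming it tries every cut pattern and filters afterwards; `product_combinations` is a hypothetical name, the input is assumed to be non-empty, and the timing call is merely an example of how a comparison could be run.

```python
from itertools import product
from timeit import timeit

def product_combinations(seq, maxlen):
    """Naive baseline: try every cut pattern, then discard over-limit results."""
    results = []
    # one flag per gap between neighbouring characters: True means "cut here"
    for cuts in product([True, False], repeat=len(seq) - 1):
        combo = [seq[0]]
        for char, cut in zip(seq[1:], cuts):
            if cut:
                combo.append(char)   # start a new word
            else:
                combo[-1] += char    # extend the current word
        if all(len(word) <= maxlen for word in combo):
            results.append(combo)
    return results

print(product_combinations(['a', 'b', 'c', 'd'], 2))

seconds = timeit(lambda: product_combinations(list('abcdefghijk'), 2), number=100)
print(f'product_combinations: {seconds} seconds for 100 iterations')
```

This baseline always visits all 2^(N-1) cut patterns regardless of the limit, which is consistent with it falling behind as the input grows.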
Generally, it is easier to produce a large combination/permutation list and then filter the results to achieve the desired output. You can use a recursive generator function to get the combinations (joining adjacent elements at each step) and then filter out the duplicates and the over-limit results:
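chars = ['a', 'b', 'c', 'd']

def get_combos(c):
    # yield the current list, then recurse on every list obtained by
    # merging one pair of adjacent elements
    if len(c) == 1:
        yield c
    else:
        yield c
        for i in range(len(c)-1):
            yield from get_combos([c[d]+c[d+1] if d == i else c[d] if d < i else c[d+1]
                                   for d in range(len(c)-1)])

final_listing = list(get_combos(chars))
# drop duplicates (keeping first occurrences), then keep only the results
# in which every word has fewer than 3 characters
last_results = list(filter(lambda x: all(len(c) < 3 for c in x),
                           [a for i, a in enumerate(final_listing) if a not in final_listing[:i]]))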
Output:
[['a', 'b', 'c', 'd'], ['ab', 'c', 'd'], ['ab', 'cd'], ['a', 'bc', 'd'], ['a', 'b', 'cd']]
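The duplicate check `a not in final_listing[:i]` rescans the earlier results for every element, so that step becomes quadratic in the number of generated lists. A possible refinement, shown here as a sketch with a hypothetical helper name (`unique_within_limit`) and made-up sample data, is to track what has already been seen in a set of tuples:

```python
def unique_within_limit(combos, maxlen):
    """Yield each combo once, in first-seen order, skipping any with an over-long word."""
    seen = set()
    for combo in combos:
        key = tuple(combo)  # lists are unhashable, tuples are not
        if key in seen:
            continue
        seen.add(key)
        if all(len(word) <= maxlen for word in combo):
            yield combo

sample = [['a', 'b'], ['ab'], ['a', 'b'], ['ab']]  # made-up input with duplicates
print(list(unique_within_limit(sample, 1)))  # [['a', 'b']]
```

This only speeds up the filtering; the generator still produces every duplicate and every over-limit list first, so the overall approach remains more expensive than building only the valid combinations.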