
Python Calculating Unique List Permutation Possibilities

So I have a problem dealing with permutations of lists/strings, which I am having a hard time solving.

Say I have several lists:

list1 = ["a"]
list2 = ["a","b","c","d"]
list3 = ["b","e"]
list4 = ["f","g","a"]

I need to calculate the number of all possible combinations while choosing one character from each list. From the first list I choose a character: "a" in this case, since there is only one item in the list. Next I select an item from the second list, but it can't be "a", as that was already chosen from the first list, so it could be "b", "c", or "d". Next I choose an item from the third list; if I chose "a" from the first and "b" from the second, I could only choose "e", as "b" was already used. The same goes for the fourth list.

So I need to calculate all of the possible selections of unique characters from my lists. Hopefully everyone gets what I'm asking here. If possible, I don't even need to create the lists of permutations; I just need to calculate how many combinations there are in total. Whichever approach is less memory intensive is preferred, as there may be a large number of individual lists in the actual problem.

To be more explicit with my question: say I had two lists, list1 = ["a"] and list2 = ["b"].

There would be only one combination, because positions are preserved in the resulting strings. List one does not contain "b", so the only combination is ("a","b"), not ("b","a"). To further constrain the question: I don't necessarily want to retrieve all the permutations, I only want to return the total number of possible permutations. Returning the results takes up too much memory, as I will be working with roughly fifteen lists of 1 to 15 characters each.

Use itertools.product to generate all possible combinations from the lists. Then, using itertools.ifilter (Python 2; in Python 3 the built-in filter is already lazy), filter out all combinations that contain a repeated character. One simple way to do this is to check whether the length of the combination stays the same after removing all duplicates (i.e. after creating a set from it).

import itertools

list1 = ["a"]
list2 = ["a","b","c","d"]
list3 = ["b","e"]
list4 = ["f","g","a"]

f = lambda x: len(x) == len(set(x))
it = itertools.ifilter(f, itertools.product(list1, list2, list3, list4))

# print all combinations
for combination in it:
    print combination
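
Since the question only asks for the total, the same filter can feed a counter instead of a print loop. A minimal sketch, assuming Python 3; the helper name count_unique_selections is made up for illustration and is not part of the original answer:

```python
import itertools

def count_unique_selections(lists):
    """Count selections of one item per list in which no item repeats."""
    # product() is lazy, so only one candidate tuple is in memory at a time.
    return sum(
        1
        for combo in itertools.product(*lists)
        if len(combo) == len(set(combo))
    )

lists = [["a"], ["a", "b", "c", "d"], ["b", "e"], ["f", "g", "a"]]
print(count_unique_selections(lists))  # 10
```

This keeps memory use flat, but it still visits every tuple in the product, so the running time grows with the product of the list lengths.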

Use itertools.product. It iterates through every way of choosing one item from each list. Additionally, use a list comprehension to eliminate the selections that don't meet your requirements.

>>> a='a'
>>> b='abcd'
>>> c='be'
>>> d='fga'
>>> import itertools
>>> [a+b+c+d for a,b,c,d in itertools.product(a,b,c,d) if b != a and c not in [a,b] and d not in [a,b,c]]
['abef', 'abeg', 'acbf', 'acbg', 'acef', 'aceg', 'adbf', 'adbg', 'adef', 'adeg']
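
The comprehension above is written for exactly four lists. For an arbitrary number of lists, one possible generalization is a small backtracking generator that skips an item as soon as it repeats, rather than filtering complete tuples. This is a sketch assuming Python 3, and unique_selections is a hypothetical name:

```python
def unique_selections(lists, chosen=()):
    """Yield tuples with one item per list and no repeated item."""
    if len(chosen) == len(lists):
        yield chosen
        return
    for item in lists[len(chosen)]:
        # Skip items already used in an earlier position, so dead ends
        # are abandoned before the remaining lists are expanded.
        if item not in chosen:
            yield from unique_selections(lists, chosen + (item,))

words = ["a", "abcd", "be", "fga"]
print(["".join(sel) for sel in unique_selections(words)])
```

Because it is a generator, the results can be consumed one at a time or counted with sum(1 for _ in unique_selections(words)) without building the full list.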

You can cache counts of the form "starting from the i'th list, excluding elements in S". By being careful to limit S to only characters that may be excluded (that is, only elements that appear in a later list), you can reduce the amount of repeated computation.

Here's an example program:

def count_uniq_combs(sets, i, excluding, cache):
    if i == len(sets): return 1
    key = (i, excluding)
    if key in cache:
        return cache[key]
    count = 0
    for c in sets[i][0]:
        if c in excluding: continue
        newx = (excluding | set([c])) & sets[i][1]
        count += count_uniq_combs(sets, i + 1, newx, cache)
    cache[key] = count
    print key, count
    return count

def count(xs):
    sets = [[set(x)] for x in xs]
    # Pre-compute the union of all subsequent sets.
    union = set()
    for s in reversed(sets):
        s.append(union)
        union = union | s[0]
    return count_uniq_combs(sets, 0, frozenset(), dict())

print count(['a', 'abcd', 'be', 'fga'])

It prints out the values it actually computes (rather than recalls from the cache), which look like this:

(3, frozenset(['a'])) 2
(2, frozenset(['a'])) 4
(2, frozenset(['a', 'b'])) 2
(1, frozenset(['a'])) 10
(0, frozenset([])) 10

For example, when looking at list 2 ("b", "e"), only two counts are computed: one where "a" and "b" are both excluded, and one where only "a" is excluded. Compare this to the naive implementation, where you'd also be counting many other combinations (for example, "a" and "c" excluded).
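
The same caching idea can be written for Python 3 with functools.lru_cache doing the bookkeeping. This is a sketch rather than a port of the program above: the names count_unique, later and go are made up here, and the debug printing is left out.

```python
from functools import lru_cache

def count_unique(lists):
    """Count selections of one distinct item per list, with memoization."""
    sets = [frozenset(x) for x in lists]
    # later[i] holds every symbol that appears in lists after position i.
    later = [frozenset()] * len(sets)
    union = frozenset()
    for i in reversed(range(len(sets))):
        later[i] = union
        union = union | sets[i]

    @lru_cache(maxsize=None)
    def go(i, excluding):
        if i == len(sets):
            return 1
        total = 0
        for c in sets[i]:
            if c in excluding:
                continue
            # Keep only exclusions that a later list could still hit.
            total += go(i + 1, (excluding | {c}) & later[i])
        return total

    return go(0, frozenset())

print(count_unique(["a", "abcd", "be", "fga"]))  # 10
```

The cache key is the pair (i, excluding), so two different prefixes that leave the same relevant exclusions share one computation. Only the count is kept; no selections are stored.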

If this still isn't fast enough, you can try heuristics for ordering the lists: you want lists that contain relatively few symbols of other lists to come later.
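
Reordering is safe because the total does not depend on the order in which the lists are processed. One possible reading of the heuristic, sketched for Python 3 with a hypothetical helper order_for_counting, is to sort by how many of a list's symbols also occur in some other list:

```python
from collections import Counter

def order_for_counting(lists):
    """Reorder lists so those sharing few symbols with other lists come last."""
    sets = [set(x) for x in lists]
    # For each symbol, the number of lists it occurs in.
    occurrences = Counter(sym for s in sets for sym in s)

    def shared(s):
        # Symbols of this list that also occur in at least one other list.
        return sum(1 for sym in s if occurrences[sym] > 1)

    # Most-shared lists first, least-shared lists last (sort is stable).
    return sorted(lists, key=lambda x: shared(set(x)), reverse=True)

print(order_for_counting(["a", "abcd", "be", "fga"]))
```

This uses an absolute count of shared symbols; a ratio of shared symbols to list length would be another reasonable choice, and which works better likely depends on the data.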
