permutations with unique values

itertools.permutations generates permutations in which elements are treated as unique based on their position, not on their value. So basically I want to avoid duplicates like this:

>>> list(itertools.permutations([1, 1, 1]))
[(1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1)]

Filtering afterwards is not possible because the number of permutations is too large in my case.

Does anybody know of a suitable algorithm for this?

Thank you very much!

EDIT:

What I basically want is the following:

x = itertools.product((0, 1, 'x'), repeat=X)
x = sorted(x, key=functools.partial(count_elements, elem='x'))

which is not possible because sorted creates a list and the output of itertools.product is too large.

Sorry, I should have described the actual problem.
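
The ordering asked for in the EDIT can be produced lazily, without sorted. The following is only a sketch under assumptions: the name product_by_x_count is made up for illustration, and it assumes that count_elements simply counts how many entries equal 'x'. It walks through the possible counts k = 0, 1, ..., n, chooses the positions of the 'x' entries with itertools.combinations, and fills the remaining positions with itertools.product((0, 1), ...). Within one count, the order differs from what a stable sort of itertools.product would give.

```python
from itertools import combinations, product

def product_by_x_count(n):
    """Yield every length-n tuple over (0, 1, 'x'), fewest 'x' entries first.

    Hypothetical helper: nothing is materialized, so memory use stays small.
    """
    for k in range(n + 1):                          # k = number of 'x' entries
        for x_positions in combinations(range(n), k):
            x_set = set(x_positions)
            for bits in product((0, 1), repeat=n - k):
                bit_iter = iter(bits)               # consumed left to right
                yield tuple('x' if i in x_set else next(bit_iter)
                            for i in range(n))

# Usage: the generator yields 3**n tuples in total.
for t in product_by_x_count(2):
    print(t)
```

Because each tuple is built directly in its final group, no filtering or sorting pass over the full product is needed.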

class unique_element:
    def __init__(self,value,occurrences):
        self.value = value
        self.occurrences = occurrences

def perm_unique(elements):
    eset=set(elements)
    listunique = [unique_element(i,elements.count(i)) for i in eset]
    u=len(elements)
    return perm_unique_helper(listunique,[0]*u,u-1)

def perm_unique_helper(listunique,result_list,d):
    if d < 0:
        yield tuple(result_list)
    else:
        for i in listunique:
            if i.occurrences > 0:
                result_list[d]=i.value
                i.occurrences-=1
                for g in  perm_unique_helper(listunique,result_list,d-1):
                    yield g
                i.occurrences+=1




a = list(perm_unique([1,1,2]))
print(a)

result:

[(2, 1, 1), (1, 2, 1), (1, 1, 2)]

EDIT (how this works):

I rewrote the above program to be longer but more readable.

I usually have a hard time explaining how something works, but let me try. In order to understand how this works, you have to understand a similar but simpler program that would yield all permutations with repetitions.

def permutations_with_replacement(elements,n):
    return permutations_helper(elements,[0]*n,n-1)#this is generator

def permutations_helper(elements,result_list,d):
    if d<0:
        yield tuple(result_list)
    else:
        for i in elements:
            result_list[d]=i
            all_permutations = permutations_helper(elements,result_list,d-1)#this is generator
            for g in all_permutations:
                yield g

This program is obviously much simpler: d stands for depth in permutations_helper and serves two purposes. It is the stopping condition of our recursive algorithm, and it is the index into the result list that is passed around.

Instead of returning each result, we yield it. If there were no yield, we would have to push the result onto some queue at the point of the stopping condition. But this way, once the stopping condition is met, the result is propagated through all stack frames up to the caller. That is the purpose of

for g in perm_unique_helper(listunique,result_list,d-1): yield g

so each result is propagated up to the caller.
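
On Python 3.3 and later, the same propagation can be written with yield from instead of an explicit inner loop. A minimal sketch, assuming the same structure as permutations_helper above (the names with the _yf suffix are only illustrative):

```python
def permutations_with_replacement_yf(elements, n):
    # Same idea as permutations_with_replacement, written with "yield from".
    return _helper_yf(elements, [0] * n, n - 1)

def _helper_yf(elements, result_list, d):
    if d < 0:
        # Stopping condition: every position has been filled.
        yield tuple(result_list)
    else:
        for i in elements:
            result_list[d] = i
            # Delegates to the sub-generator and re-yields all of its results.
            yield from _helper_yf(elements, result_list, d - 1)

print(list(permutations_with_replacement_yf([0, 1], 2)))
```

The behaviour is the same as the for/yield pair; only the spelling is shorter.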

Back to the original program: we have a list of unique elements. Before we can use each element, we have to check how many of them are still available to push onto result_list. Working with this program is very similar to permutations_with_replacement. The difference is that each element cannot be used more times than it occurs in the input, which perm_unique_helper tracks through the occurrences counter.
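
To see how much work the occurrence counting saves, the number of distinct permutations can be computed directly as n! divided by the product of the factorials of each value's count. This is a small sketch of that standard formula; the function name count_unique_permutations is made up here.

```python
from collections import Counter
from math import factorial

def count_unique_permutations(elements):
    """Number of distinct permutations of a multiset: n! / (c1! * c2! * ...)."""
    total = factorial(len(elements))
    for occurrences in Counter(elements).values():
        total //= factorial(occurrences)    # each division is exact
    return total

print(count_unique_permutations([1, 1, 2]))   # 3 distinct orderings
print(factorial(3))                           # 6 positional permutations
```

For [1, 1, 2] this gives 3, matching the three tuples printed above, whereas itertools.permutations would produce 6.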

Because new questions are sometimes marked as duplicates and their authors are referred to this question, it may be important to mention that sympy has an iterator for this purpose.

>>> from sympy.utilities.iterables import multiset_permutations
>>> list(multiset_permutations([1,1,1]))
[[1, 1, 1]]
>>> list(multiset_permutations([1,1,2]))
[[1, 1, 2], [1, 2, 1], [2, 1, 1]]

This relies on the implementation detail that the permutations of a sorted iterable are emitted in sorted order unless they are duplicates of prior permutations.

from itertools import permutations

def unique_permutations(iterable, r=None):
    previous = tuple()
    for p in permutations(sorted(iterable), r):
        if p > previous:
            previous = p
            yield p

for p in unique_permutations('cabcab', 2):
    print(p)

gives

('a', 'a')
('a', 'b')
('a', 'c')
('b', 'a')
('b', 'b')
('b', 'c')
('c', 'a')
('c', 'b')
('c', 'c')

Roughly as fast as Luka Rahne's answer, but shorter & simpler, IMHO.

def unique_permutations(elements):
    if len(elements) == 1:
        yield (elements[0],)
    else:
        unique_elements = set(elements)
        for first_element in unique_elements:
            remaining_elements = list(elements)
            remaining_elements.remove(first_element)
            for sub_permutation in unique_permutations(remaining_elements):
                yield (first_element,) + sub_permutation

>>> list(unique_permutations((1,2,3,1)))
[(1, 1, 2, 3), (1, 1, 3, 2), (1, 2, 1, 3), ... , (3, 1, 2, 1), (3, 2, 1, 1)]

It works recursively by setting the first element (iterating through all unique elements), and iterating through the permutations for all remaining elements.

Let's go through the unique_permutations of (1,2,3,1) to see how it works:

  • unique_elements are 1,2,3
  • Let's iterate through them: first_element starts with 1.
    • remaining_elements are [2,3,1] (i.e. 1,2,3,1 minus the first 1)
    • We iterate (recursively) through the permutations of the remaining elements: (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)
    • For each sub_permutation, we insert the first_element: (1, 1, 2, 3), (1, 1, 3, 2), ... and yield the result.
  • Now we iterate to first_element = 2, and do the same as above.
    • remaining_elements are [1,3,1] (i.e. 1,2,3,1 minus the 2)
    • We iterate through the permutations of the remaining elements: (1, 1, 3), (1, 3, 1), (3, 1, 1)
    • For each sub_permutation, we insert the first_element: (2, 1, 1, 3), (2, 1, 3, 1), (2, 3, 1, 1) and yield the result.
  • Finally, we do the same with first_element = 3.
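
The first level of the walk-through above can be reproduced with a few lines. This is only an illustrative sketch: first_level_split is a made-up helper, and it sorts the unique elements so that the printed order is deterministic (the answer's code iterates over a plain set).

```python
def first_level_split(elements):
    """Return (first_element, remaining_elements) pairs for the top-level call."""
    pairs = []
    for first_element in sorted(set(elements)):
        remaining_elements = list(elements)
        remaining_elements.remove(first_element)   # drops only the first match
        pairs.append((first_element, remaining_elements))
    return pairs

for first, remaining in first_level_split((1, 2, 3, 1)):
    print(first, remaining)
# 1 [2, 3, 1]
# 2 [1, 3, 1]
# 3 [1, 2, 1]
```

Each remaining list is then permuted recursively in the same way, which is why a value that occurs twice is still only chosen once as first_element at any given level.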

You could try using set:

>>> list(itertools.permutations(set([1,1,2,2])))
[(1, 2), (2, 1)]

The call to set removes the duplicates.

A naive approach might be to take the set of the permutations:

list(set(it.permutations([1, 1, 1])))
# [(1, 1, 1)]

However, this technique wastefully computes replicate permutations and discards them. A more efficient approach would be more_itertools.distinct_permutations, a third-party tool.

Code

import itertools as it

import more_itertools as mit


list(mit.distinct_permutations([1, 1, 1]))
# [(1, 1, 1)]

Performance

Using a larger iterable, we will compare the performance of the naive and third-party techniques.

iterable = [1, 1, 1, 1, 1, 1]
len(list(it.permutations(iterable)))
# 720

%timeit -n 10000 list(set(it.permutations(iterable)))
# 10000 loops, best of 3: 111 µs per loop

%timeit -n 10000 list(mit.distinct_permutations(iterable))
# 10000 loops, best of 3: 16.7 µs per loop

We see more_itertools.distinct_permutations is an order of magnitude faster.

Details

From the source, a recursion algorithm (as seen in the accepted answer) is used to compute distinct permutations, thereby obviating wasteful computations. See the source code for more details.

This is my solution with 10 lines:

class Solution(object):
    def permute_unique(self, nums):
        perms = [[]]
        for n in nums:
            new_perm = []
            for perm in perms:
                for i in range(len(perm) + 1):
                    new_perm.append(perm[:i] + [n] + perm[i:])
                    # handle duplication
                    if i < len(perm) and perm[i] == n: break
            perms = new_perm
        return perms


if __name__ == '__main__':
    s = Solution()
    print(s.permute_unique([1, 1, 1]))
    print(s.permute_unique([1, 2, 1]))
    print(s.permute_unique([1, 2, 3]))

--- Result ---

[[1, 1, 1]]
[[1, 2, 1], [2, 1, 1], [1, 1, 2]]
[[3, 2, 1], [2, 3, 1], [2, 1, 3], [3, 1, 2], [1, 3, 2], [1, 2, 3]]

Here is a recursive solution to the problem.

def permutation(num_array):
    res=[]
    if len(num_array) <= 1:
        return [num_array]
    for num in set(num_array):
        temp_array = num_array.copy()
        temp_array.remove(num)
        res += [[num] + perm for perm in permutation(temp_array)]
    return res

arr=[1,2,2]
print(permutation(arr))

To generate unique permutations of ["A","B","C","D"] I use the following:

from itertools import combinations,chain

l = ["A","B","C","D"]
combs = (combinations(l, r) for r in range(1, len(l) + 1))
list_combinations = list(chain.from_iterable(combs))

Which generates:

[('A',),
 ('B',),
 ('C',),
 ('D',),
 ('A', 'B'),
 ('A', 'C'),
 ('A', 'D'),
 ('B', 'C'),
 ('B', 'D'),
 ('C', 'D'),
 ('A', 'B', 'C'),
 ('A', 'B', 'D'),
 ('A', 'C', 'D'),
 ('B', 'C', 'D'),
 ('A', 'B', 'C', 'D')]

Notice that duplicates are not created (e.g. no further items in combination with D are generated, as they already exist).

Example: This can then be used to generate higher- or lower-order terms for OLS models from data in a Pandas dataframe.

import statsmodels.formula.api as smf
import pandas as pd

# create some data
pd_dataframe = pd.DataFrame(somedata)  # somedata is a placeholder for your own data
response_column = "Y"

# generate combinations of column/variable names
l = [col for col in pd_dataframe.columns if col!=response_column]
combs = (combinations(l, r) for r in range(1, len(l) + 1))
list_combinations = list(chain.from_iterable(combs))

# generate OLS input string
formula_base = '{} ~ '.format(response_column)
list_for_ols = [":".join(list(item)) for item in list_combinations]
string_for_ols = formula_base + ' + '.join(list_for_ols)

Creates...

Y ~ A + B + C + D + A:B + A:C + A:D + B:C + B:D + C:D + A:B:C + A:B:D + A:C:D + B:C:D + A:B:C:D

Which can then be piped to your OLS regression

model = smf.ols(string_for_ols, pd_dataframe).fit()
model.summary()

The best solution to this problem I have seen uses Knuth's "Algorithm L" (as noted previously by Gerrat in the comments to the original post):
http://stackoverflow.com/questions/12836385/how-can-i-interleave-or-create-unique-permutations-of-two-stings-without-recurs/12837695

Some timings:

Generating the permutations of [1]*12+[0]*12 (2,704,156 unique permutations):
Algorithm L → 2.43 s
Luka Rahne's solution → 8.56 s
sympy's multiset_permutations() → 16.8 s
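
For readers who do not want to follow the link, here is a minimal sketch of the idea behind Algorithm L (repeatedly stepping to the next permutation in lexicographic order). It is not the code from the linked answer and not the code that was timed above; the function name algorithm_l is chosen here for illustration.

```python
def algorithm_l(iterable):
    """Yield the distinct permutations of iterable in lexicographic order."""
    a = sorted(iterable)            # L1 starts from the smallest arrangement
    n = len(a)
    while True:
        yield tuple(a)
        # L2: find the rightmost j with a[j] < a[j + 1].
        j = n - 2
        while j >= 0 and a[j] >= a[j + 1]:
            j -= 1
        if j < 0:
            return                  # a is in non-increasing order: finished
        # L3: find the rightmost l with a[j] < a[l], then swap.
        l = n - 1
        while a[j] >= a[l]:
            l -= 1
        a[j], a[l] = a[l], a[j]
        # L4: reverse the tail after position j.
        a[j + 1:] = a[j + 1:][::-1]

print(list(algorithm_l([1, 1, 2])))
# [(1, 1, 2), (1, 2, 1), (2, 1, 1)]
```

No recursion and no bookkeeping of occurrence counts is needed; equal values are skipped automatically because the comparisons in L2 and L3 are strict.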

It sounds like you are looking for itertools.combinations() (docs.python.org)

list(itertools.combinations([1, 1, 1],3))
[(1, 1, 1)]

Bumped into this question while looking for something myself!

Here's what I did:

def dont_repeat(x=[0,1,1,2]): # Pass a list
    from itertools import permutations as per
    uniq_set = set()
    for byt_grp in per(x, 4):
        if byt_grp not in uniq_set:
            yield byt_grp
            uniq_set.update([byt_grp])
    print(uniq_set)

for i in dont_repeat(): print(i)
(0, 1, 1, 2)
(0, 1, 2, 1)
(0, 2, 1, 1)
(1, 0, 1, 2)
(1, 0, 2, 1)
(1, 1, 0, 2)
(1, 1, 2, 0)
(1, 2, 0, 1)
(1, 2, 1, 0)
(2, 0, 1, 1)
(2, 1, 0, 1)
(2, 1, 1, 0)
{(0, 1, 1, 2), (1, 0, 1, 2), (2, 1, 0, 1), (1, 2, 0, 1), (0, 1, 2, 1), (0, 2, 1, 1), (1, 1, 2, 0), (1, 2, 1, 0), (2, 1, 1, 0), (1, 0, 2, 1), (2, 0, 1, 1), (1, 1, 0, 2)}

Basically, make a set and keep adding to it. Better than making lists etc. that take too much memory. Hope it helps the next person looking out :-) Comment out the set 'update' in the function to see the difference.

You can make a function that uses collections.Counter to get unique items and their counts from the given sequence, and uses itertools.combinations to pick combinations of indices for each unique item in each recursive call, and map the indices back to a list when all indices are picked:

from collections import Counter
from itertools import combinations
def unique_permutations(seq):
    def index_permutations(counts, index_pool):
        if not counts:
            yield {}
            return
        (item, count), *rest = counts.items()
        rest = dict(rest)
        for indices in combinations(index_pool, count):
            mapping = dict.fromkeys(indices, item)
            for others in index_permutations(rest, index_pool.difference(indices)):
                yield {**mapping, **others}
    indices = set(range(len(seq)))
    for mapping in index_permutations(Counter(seq), indices):
        yield [mapping[i] for i in indices]

so that [''.join(i) for i in unique_permutations('moon')] returns:

['moon', 'mono', 'mnoo', 'omon', 'omno', 'nmoo', 'oomn', 'onmo', 'nomo', 'oonm', 'onom', 'noom']

This is my attempt without resorting to set / dict, as a generator using recursion, but using a string as input. Output is also ordered in natural order:

def perm_helper(head: str, tail: str):
    if len(tail) == 0:
        yield head
    else:
        last_c = None
        for index, c in enumerate(tail):
            if last_c != c:
                last_c = c
                yield from perm_helper(
                    head + c, tail[:index] + tail[index + 1:]
                )


def perm_generator(word):
    yield from perm_helper("", sorted(word))

example:

from itertools import takewhile
word = "POOL"
list(takewhile(lambda w: w != word, (x for x in perm_generator(word))))
# output
# ['LOOP', 'LOPO', 'LPOO', 'OLOP', 'OLPO', 'OOLP', 'OOPL', 'OPLO', 'OPOL', 'PLOO', 'POLO']
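
ans = []
def fn(a, size):
    # Heap's algorithm; each finished arrangement is stored only if it is new
    if size == 1:
        if a.copy() not in ans:
            ans.append(a.copy())
        return

    for i in range(size):
        fn(a, size - 1)
        # odd size: swap first and last; even size: swap i-th and last
        if size & 1:
            a[0], a[size-1] = a[size-1], a[0]
        else:
            a[i], a[size-1] = a[size-1], a[i]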

https://www.geeksforgeeks.org/heaps-algorithm-for-generating-permutations/

Came across this the other day while working on a problem of my own. I like Luka Rahne's approach, but I thought that using the Counter class in the collections library seemed like a modest improvement. Here's my code:

import collections

def unique_permutations(elements):
    "Returns a list of lists; each sublist is a unique permutation of elements."
    ctr = collections.Counter(elements)

    # Base case with one element: just return the element
    # (list(ctr) is used because ctr.keys() cannot be indexed on Python 3)
    if len(ctr) == 1 and sum(ctr.values()) == 1:
        return [list(ctr)]

    perms = []

    # For each counter key, find the unique permutations of the set with
    # one member of that key removed, and append the key to the front of
    # each of those permutations.
    for k in ctr.keys():
        ctr_k = ctr.copy()
        ctr_k[k] -= 1
        if ctr_k[k]==0: 
            ctr_k.pop(k)
        perms_k = [[k] + p for p in unique_permutations(ctr_k)]
        perms.extend(perms_k)

    return perms

This code returns each permutation as a list. If you feed it a string, it'll give you a list of permutations where each one is a list of characters. If you want the output as a list of strings instead (for example, if you're a terrible person and you want to abuse my code to help you cheat in Scrabble), just do the following:

[''.join(perm) for perm in unique_permutations('abunchofletters')]

I came up with a very suitable implementation using itertools.product for this case (this is an implementation where you want all combinations):

unique_perm_list = [''.join(p) for p in itertools.product(['0', '1'], repeat=X) if ''.join(p).count('1') == somenumber]

This is essentially a combination (n over k) with n = X and k = somenumber. itertools.product() iterates over all strings, from k = 0 ones to k = X ones; the subsequent filtering with count ensures that only the permutations with the right number of ones are put into the list. You can easily check that it works by calculating n over k and comparing it to len(unique_perm_list).
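
Since the result is a combination (n over k), the strings can also be generated directly by choosing the positions of the ones, which avoids filtering all 2**n products. This is a sketch of that alternative, not the answer's method; binary_strings_with_k_ones is a made-up name, and math.comb needs Python 3.8 or later.

```python
from itertools import combinations
from math import comb

def binary_strings_with_k_ones(n, k):
    """Yield every length-n string of '0'/'1' that contains exactly k ones."""
    for one_positions in combinations(range(n), k):
        chars = ['0'] * n
        for i in one_positions:
            chars[i] = '1'
        yield ''.join(chars)

strings = list(binary_strings_with_k_ones(5, 2))
print(len(strings), comb(5, 2))   # both are 10
```

The length check at the end is the same sanity test described above: the number of generated strings equals n over k.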

Adapted to remove recursion, using a dictionary and numba for high performance, but not using yield/generator style, so memory usage is not bounded:

import numba

@numba.njit
def perm_unique_fast(elements): #memory usage too high for large permutations
    eset = set(elements)
    dictunique = dict()
    for i in eset: dictunique[i] = elements.count(i)
    result_list = numba.typed.List()
    u = len(elements)
    for _ in range(u): result_list.append(0)
    s = numba.typed.List()
    results = numba.typed.List()
    d = u
    while True:
        if d > 0:
            for i in dictunique:
                if dictunique[i] > 0: s.append((i, d - 1))
        i, d = s.pop()
        if d == -1:
            dictunique[i] += 1
            if len(s) == 0: break
            continue
        result_list[d] = i
        if d == 0: results.append(result_list[:])
        dictunique[i] -= 1
        s.append((i, -1))
    return results
import timeit
l = [2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
%timeit list(perm_unique(l))
#377 ms ± 26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

ltyp = numba.typed.List()
for x in l: ltyp.append(x)
%timeit perm_unique_fast(ltyp)
#293 ms ± 3.37 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

assert list(sorted(perm_unique(l))) == list(sorted([tuple(x) for x in perm_unique_fast(ltyp)]))

About 30% faster but still suffers a bit due to list copying and management.

Alternatively without numba but still without recursion and using a generator to avoid memory issues:

def perm_unique_fast_gen(elements):
    eset = set(elements)
    dictunique = dict()
    for i in eset: dictunique[i] = elements.count(i)
    result_list = list() #numba.typed.List()
    u = len(elements)
    for _ in range(u): result_list.append(0)
    s = list()
    d = u
    while True:
        if d > 0:
            for i in dictunique:
                if dictunique[i] > 0: s.append((i, d - 1))
        i, d = s.pop()
        if d == -1:
            dictunique[i] += 1
            if len(s) == 0: break
            continue
        result_list[d] = i
        if d == 0: yield tuple(result_list)  # copy, because result_list is mutated in place
        dictunique[i] -= 1
        s.append((i, -1))

Maybe we can use set here to obtain unique permutations

import itertools
print('unique perms >> ', set(itertools.permutations(A)))

What about

np.unique(list(itertools.permutations([1, 1, 1])), axis=0)

The problem is that the permutations are now rows of a NumPy array, thus using more memory, but you can cycle through them as before

perms = np.unique(list(itertools.permutations([1, 1, 1])), axis=0)
for p in perms:
    print(p)
