
Permutations with repetition without two consecutive equal elements

I need a function that generates all the permutations with repetition of an iterable, with the constraint that two consecutive elements must be different. For example:

sorted(f([0,1],3))==[(0,1,0),(1,0,1)]
#or
sorted(f([0,1],3))==[[0,1,0],[1,0,1]]
#I don't need the elements in the list to be sorted.
#the elements of the return can be tuples or lists, it doesn't change anything

Unfortunately, itertools.permutations doesn't work for what I need (each element of the iterable appears at most once in each returned tuple).

I've tried a bunch of definitions. First, I filtered the output of itertools.product(iterable,repeat=r), but that is too slow for what I need.

from itertools import product
def crp0(iterable,r):
    l=[]
    for f in product(iterable,repeat=r):
        #print(f)
        b=True
        last=None #supposing no element of the iterable is None, which is fine for me
        for element in f:
            if element==last:
                b=False
                break
            last=element
        if b: l.append(f)
    return l

Second, I tried to build r nested for loops, one inside the other (where r is the length of each permutation, usually written k in math).

def crp2(iterable,r):
    a=list(range(0,r))
    s="\n"
    tab="    " #4 spaces
    l=[]
    for i in a:
        s+=(2*i*tab+"for a["+str(i)+"] in iterable:\n"+
        (2*i+1)*tab+"if "+str(i)+"==0 or a["+str(i)+"]!=a["+str(i-1)+"]:\n")
    s+=(2*i+2)*tab+"l.append(a.copy())"
    exec(s)
    return l

I know, there's no need to remind me: exec is ugly, exec can be dangerous, exec isn't easy to read... I know. To understand the function better, I suggest replacing exec(s) with print(s).

Here is an example of the string that is passed to exec for crp2([0,1],2):

for a[0] in iterable:
    if 0==0 or a[0]!=a[-1]:
        for a[1] in iterable:
            if 1==0 or a[1]!=a[0]:
                l.append(a.copy())

But, apart from using exec, I need a better function, because crp2 is still too slow (even if it is faster than crp0). Is there any way to recreate the code with r nested for loops without using exec? Is there any other way to do what I need?

You could try to return a generator instead of a list. With large values of r, your method will take a very long time to process product(iterable,repeat=r) and will return a huge list.

With this variant, you should get the first element very quickly:

from itertools import product

def crp0(iterable, r):
    for f in product(iterable, repeat=r):
        last = f[0]
        b = True
        for element in f[1:]:
            if element == last:
                b = False
                break
            last = element
        if b:
            yield f

for no_repetition in crp0([0, 1], 12):
    print(no_repetition)

# (0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1)
# (1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0)
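Because a generator produces results on demand, a caller can stop after the first few of them. The sketch below illustrates this with itertools.islice. The name crp0_lazy is hypothetical, and the neighbour check is written with zip rather than the explicit loop above, which is an assumption about style and not part of the original answer.

```python
from itertools import islice, product

def crp0_lazy(iterable, r):
    # Hypothetical name; same filter-the-product idea as crp0 above.
    for f in product(iterable, repeat=r):
        # Keep f only if no two neighbouring elements are equal.
        if all(a != b for a, b in zip(f, f[1:])):
            yield f

# Take only the first three results; the generator stops after the third
# match, so the remaining candidates are never examined.
first_three = list(islice(crp0_lazy([0, 1, 2], 12), 3))
for seq in first_three:
    print(seq)
```

Only as many candidates as are needed to find three valid tuples are examined, which is why the first results arrive quickly even when r is large.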

You could prepare the sequences in two halves, then preprocess the second halves to find the compatible choices.

def crp2(I,r):
    r0=r//2
    r1=r-r0
    # list() keeps this correct even if crp0 is a generator: B is scanned once per token
    A=list(crp0(I,r0)) # Prepare first half sequences
    B=list(crp0(I,r1)) # Prepare second half sequences
    D = {} # Dictionary showing compatible second half sequences for each token
    for i in I:
        D[i] = [b for b in B if b[0]!=i]
    return [a+b for a in A for b in D[a[-1]]]

In a test with iterable=[0,1,2] and r=15, I found this method to be over a hundred times faster than just using crp0.
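As a sanity check on the two-halves idea, the following sketch compares it with plain filtering on small inputs. The names filtered and two_halves are hypothetical, the items are assumed to be hashable (they are used as dictionary keys), and the fallback for r < 2 is an addition made here because an empty first half has no last element.

```python
from itertools import product

def filtered(iterable, r):
    # Reference implementation: filter the full Cartesian product.
    return [f for f in product(iterable, repeat=r)
            if all(a != b for a, b in zip(f, f[1:]))]

def two_halves(iterable, r):
    # Hypothetical restatement of the two-halves idea.
    if r < 2:
        return filtered(iterable, r)  # nothing to split
    items = list(iterable)
    r0 = r // 2
    first = filtered(items, r0)
    second = filtered(items, r - r0)
    # For each token, the second halves that may follow it.
    compatible = {i: [b for b in second if b[0] != i] for i in items}
    return [a + b for a in first for b in compatible[a[-1]]]

# Both approaches should produce the same set of sequences.
for r in range(1, 8):
    assert sorted(two_halves([0, 1, 2], r)) == sorted(filtered([0, 1, 2], r))
```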

Instead of filtering the elements, you could directly generate a list that contains only the valid sequences. This method uses recursion to build the restricted Cartesian product:

def product_no_repetition(iterable, r, last_element=None):
    if r == 0:
        return [[]]
    else:
        return [p + [x] for x in iterable
            for p in product_no_repetition(iterable, r - 1, x)
            if x != last_element]

for no_repetition in product_no_repetition([0, 1], 12):
    print(no_repetition)
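The same direct-construction idea can also be sketched as a recursive generator that extends a prefix one element at a time, so no candidate is built and then thrown away. The name extend_no_repeat and its prefix parameter are hypothetical. Checking for an empty prefix replaces the None sentinel, so None may appear among the items.

```python
def extend_no_repeat(items, r, prefix=()):
    # Hypothetical helper: yield every length-r tuple that starts with
    # `prefix` and has no two equal neighbouring elements.
    # `items` must be re-iterable (for example a list or a tuple).
    if len(prefix) == r:
        yield prefix
        return
    for x in items:
        # Skip x only when it would repeat the element just before it.
        if not prefix or x != prefix[-1]:
            yield from extend_no_repeat(items, r, prefix + (x,))

print(list(extend_no_repeat([0, 1], 3)))  # [(0, 1, 0), (1, 0, 1)]
```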

I agree with @EricDuminil's comment that you do not want "permutations with repetition." What you want is a substantial subset of the Cartesian product of the iterable with itself, taken r times. I don't know what name is best, so I'll just call them products.

Here is an approach that builds each product line directly, without building all the products and then filtering out the ones you want. My approach is to work primarily with the indices of the iterable rather than the iterable itself, and not with all the indices: I leave out the last one. So instead of working directly with [2, 3, 5, 7] I work with [0, 1, 2]. Then I work with the products of those indices. I can transform a product such as [1, 2, 2], where r=3, by comparing each index with the previous (already transformed) one. If an index is greater than or equal to the previous one, I increment the current index by one. This prevents two consecutive indices from being equal, and it also gets us back to using all the indices. So [1, 2, 2] is transformed to [1, 3, 2]: the middle 2 is not less than the preceding 1, so it becomes 3, while the final 2 is less than that 3 and stays unchanged. I now use those indices to select the appropriate items from the iterable, so the iterable [2, 3, 5, 7] with r=3 gets the line [3, 7, 5]. The first index is treated differently, since it has no previous index: it ranges over all the indices. My code is:

from itertools import product

def crp3(iterable, r):
    L = []
    for k in range(len(iterable)):
        for f in product(range(len(iterable)-1), repeat=r-1):
            ndx = k
            a = [iterable[ndx]]
            for j in range(r-1):
                ndx = f[j] if f[j] < ndx else f[j] + 1
                a.append(iterable[ndx])
            L.append(a)
    return L
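To make the index adjustment concrete, here is a small trace of what the inner loop of crp3 does for one product, where each comparison is made against the previous, already adjusted index. The helper name adjust_indices is hypothetical and exists only for this illustration.

```python
def adjust_indices(first, reduced):
    # Hypothetical helper mirroring the inner loop of crp3: `first` is the
    # starting index, `reduced` holds indices drawn from range(len - 1).
    ndx = first
    out = [ndx]
    for f in reduced:
        # An index at or above the previous (already adjusted) index is
        # shifted up by one, which skips over the previous value.
        ndx = f if f < ndx else f + 1
        out.append(ndx)
    return out

items = [2, 3, 5, 7]
indices = adjust_indices(1, (2, 2))
print(indices)                      # [1, 3, 2]
print([items[i] for i in indices])  # [3, 7, 5]
```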

Using %timeit in my Spyder/IPython configuration, crp3([0,1], 3) shows 8.54 µs per loop while your crp2([0,1], 3) shows 133 µs per loop. That shows a sizeable speed improvement! My routine works best where iterable is short and r is large: your routine finds len ** r lines (where len is the length of the iterable) and filters them, while mine finds len * (len-1) ** (r-1) lines without filtering.
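The two counts can be compared numerically. The sketch below evaluates both formulas for a short iterable and a large r, and checks the second formula by brute force on a small case. The function names are hypothetical and r >= 1 is assumed.

```python
from itertools import product

def candidate_count(n, r):
    # Tuples examined when filtering the full product.
    return n ** r

def result_count(n, r):
    # Tuples with no two equal neighbours (assumes r >= 1).
    return n * (n - 1) ** (r - 1)

n, r = 3, 15
print(candidate_count(n, r))  # 14348907
print(result_count(n, r))     # 49152

# Brute-force check of the formula on a small case.
brute = sum(1 for f in product(range(3), repeat=6)
            if all(a != b for a, b in zip(f, f[1:])))
assert brute == result_count(3, 6)
```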

By the way, your crp2() does do filtering, as shown by the if lines in the code that is exec'ed. The sole if in my code does not filter a line; it modifies an item in the line. My code does return surprising results if the items in the iterable are not unique: if that is a problem, remove the duplicates first, for example with list(set(iterable)) (a list rather than a bare set, because the code indexes into the iterable). Note that I replaced your l name with L: I think l is too easy to confuse with 1 or I and should be avoided. My code could easily be changed to a generator: replace L.append(a) with yield a and remove the lines L = [] and return L.
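A minimal sketch of that generator variant follows. The name crp3_gen is hypothetical, r >= 1 is assumed, and the order-preserving deduplication with dict.fromkeys is an addition made here (it requires hashable items), not part of the original routine.

```python
from itertools import product

def crp3_gen(iterable, r):
    # Hypothetical generator form of crp3; assumes r >= 1.
    # dict.fromkeys drops duplicate items while keeping their order
    # (an assumption added here; items must be hashable).
    items = list(dict.fromkeys(iterable))
    n = len(items)
    for k in range(n):
        for f in product(range(n - 1), repeat=r - 1):
            ndx = k
            a = [items[ndx]]
            for j in f:
                # Shift indices at or above the previous one up by one.
                ndx = j if j < ndx else j + 1
                a.append(items[ndx])
            yield a

print(list(crp3_gen([0, 1], 3)))  # [[0, 1, 0], [1, 0, 1]]
```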

How about:

from itertools import product

result = [ x for x in product(iterable,repeat=r) if all(x[i-1] != x[i] for i in range(1,len(x))) ]

Elaborating on @peter-de-rivaz's idea (divide and conquer). When you split the sequence to be built into two subsequences, the two halves have the same length or differ by one. If r = 2*k is even, store the result of crp(k) in a list and merge it with itself. If r = 2*k+1 is odd, store the result of crp(k) in a list and merge it with itself and with L (the input iterable).

def large(L, r):
    if r <= 4: # stop dividing here: going further down is too slow
        # small is a generator; materialize it because M is iterated in nested loops below
        return list(small(L, r))

    M = large(L, r//2)
    if r%2 == 0:
        return [x + y for x in M for y in M if x[-1] != y[0]]
    else:
        return [x + y + (e,) for x in M for y in M for e in L if x[-1] != y[0] and y[-1] != e]

small is an adaptation of @eric-duminil's answer, using Python's well-known for...else loop:

from itertools import product

def small(iterable, r):
    for seq in product(iterable, repeat=r):
        prev, *tail = seq
        for e in tail:
            if e == prev:
                break
            prev = e
        else:
            yield seq
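A quick consistency check for the divide-and-conquer merge, under the assumption that the base case is returned as a list (a generator cannot be iterated twice in the nested comprehension). The names small_list and large_checked are hypothetical stand-ins used only for this check.

```python
from itertools import product

def small_list(items, r):
    # Base case as a list: filter the full product.
    return [s for s in product(items, repeat=r)
            if all(a != b for a, b in zip(s, s[1:]))]

def large_checked(items, r):
    # Hypothetical stand-in for the divide-and-conquer merge.
    if r <= 4:
        return small_list(items, r)
    half = large_checked(items, r // 2)
    if r % 2 == 0:
        return [x + y for x in half for y in half if x[-1] != y[0]]
    return [x + y + (e,) for x in half for y in half for e in items
            if x[-1] != y[0] and y[-1] != e]

# The merged result should match plain filtering for every length tried.
for r in range(1, 10):
    assert sorted(large_checked([0, 1, 2], r)) == small_list([0, 1, 2], r)
```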

A small benchmark (timings in seconds, as originally measured):

import timeit

print(timeit.timeit(lambda: crp2(  [0, 1, 2], 10), number=1000))
#0.16290732200013736
print(timeit.timeit(lambda: crp2(  [0, 1, 2, 3], 15), number=10))
#24.798989593000442

print(timeit.timeit(lambda: large( [0, 1, 2], 10), number=1000))
#0.0071403849997295765
print(timeit.timeit(lambda: large( [0, 1, 2, 3], 15), number=10))
#0.03471425700081454
