Generate all permutations of a list without adjacent equal elements

When we sort a list, like

a = [1,2,3,3,2,2,1]
sorted(a) => [1, 1, 2, 2, 2, 3, 3]

equal elements are always adjacent in the resulting list.

How can I achieve the opposite task - shuffle the list so that equal elements are never (or as seldom as possible) adjacent?

For example, for the above list one of the possible solutions is

p = [1,3,2,3,2,1,2]

More formally, given a list a, generate a permutation p of it that minimizes the number of pairs p[i]==p[i+1].
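
To make the objective concrete, the quantity being minimized can be computed with a one-line helper. This is only an illustrative sketch; the name count_adjacent_equal is an assumption and does not appear in the question.

```python
def count_adjacent_equal(p):
    """Count the positions i where p[i] == p[i+1]."""
    return sum(x == y for x, y in zip(p, p[1:]))


a = [1, 2, 3, 3, 2, 2, 1]
print(count_adjacent_equal(sorted(a)))               # 4
print(count_adjacent_equal([1, 3, 2, 3, 2, 1, 2]))   # 0
```

The sorted list is the worst case for this input, while the example permutation p reaches the minimum of zero.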

Since the lists are large, generating and filtering all permutations is not an option.

Bonus question: how to generate all such permutations efficiently?

This is the code I'm using to test the solutions: https://gist.github.com/gebrkn/9f550094b3d24a35aebd

UPD: Choosing a winner here was a tough choice, because many people posted excellent answers. @VincentvanderWeele, @David Eisenstat, @Coady, @enrico.bacis and @srgerg provided functions that generate the best possible permutation flawlessly. @tobias_k and David also answered the bonus question (generate all permutations). Additional points to David for the correctness proof.

The code from @VincentvanderWeele appears to be the fastest.

This is along the lines of Thijser's currently incomplete pseudocode. The idea is to take the most frequent of the remaining item types unless it was just taken. (See also Coady's implementation of this algorithm.)

import collections
import heapq


class Sentinel:
    pass


def david_eisenstat(lst):
    counts = collections.Counter(lst)
    heap = [(-count, key) for key, count in counts.items()]
    heapq.heapify(heap)
    output = []
    last = Sentinel()
    while heap:
        minuscount1, key1 = heapq.heappop(heap)
        if key1 != last or not heap:
            last = key1
            minuscount1 += 1
        else:
            minuscount2, key2 = heapq.heappop(heap)
            last = key2
            minuscount2 += 1
            if minuscount2 != 0:
                heapq.heappush(heap, (minuscount2, key2))
        output.append(last)
        if minuscount1 != 0:
            heapq.heappush(heap, (minuscount1, key1))
    return output

Proof of correctness

For two item types, with counts k1 and k2, the optimal solution has k2 - k1 - 1 defects if k1 < k2, 0 defects if k1 = k2, and k1 - k2 - 1 defects if k1 > k2. The = case is obvious. The others are symmetric; each instance of the minority element prevents at most two defects out of a total of k1 + k2 - 1 possible.
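
The two-type claim is easy to sanity-check by brute force on small counts. The sketch below is not part of the proof; the helper names are made up for illustration, and it only confirms the formula for k1, k2 < 5.

```python
from itertools import permutations


def defects(seq):
    """Number of adjacent equal pairs in seq."""
    return sum(x == y for x, y in zip(seq, seq[1:]))


def brute_force_min_defects(k1, k2):
    """Minimum defects over every ordering of k1 'a's and k2 'b's."""
    items = "a" * k1 + "b" * k2
    return min(defects(p) for p in set(permutations(items)))


def claimed_min_defects(k1, k2):
    """The closed form stated above: |k1 - k2| - 1, or 0 when the counts tie."""
    return 0 if k1 == k2 else abs(k1 - k2) - 1


# Exhaustive comparison for small counts only (permutations grow factorially).
for k1 in range(5):
    for k2 in range(5):
        assert brute_force_min_defects(k1, k2) == claimed_min_defects(k1, k2)
```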

This greedy algorithm returns optimal solutions, by the following logic. We call a prefix (partial solution) safe if it extends to an optimal solution. Clearly the empty prefix is safe, and if a safe prefix is a whole solution then that solution is optimal. It suffices to show inductively that each greedy step maintains safety.

The only way that a greedy step introduces a defect is if only one item type remains, in which case there is only one way to continue, and that way is safe. Otherwise, let P be the (safe) prefix just before the step under consideration, let P' be the prefix just after, and let S be an optimal solution extending P. If S extends P' also, then we're done. Otherwise, let P' = Px and S = PQ and Q = yQ', where x and y are items and Q and Q' are sequences.

Suppose first that P does not end with y. By the algorithm's choice, x is at least as frequent in Q as y. Consider the maximal substrings of Q containing only x and y. If the first substring has at least as many x's as y's, then it can be rewritten without introducing additional defects to begin with x. If the first substring has more y's than x's, then some other substring has more x's than y's, and we can rewrite these substrings without additional defects so that x goes first. In both cases, we find an optimal solution T that extends P', as needed.

Suppose now that P does end with y. Modify Q by moving the first occurrence of x to the front. In doing so, we introduce at most one defect (where x used to be) and eliminate one defect (the yy).

Generating all solutions

This is tobias_k's answer plus efficient tests to detect when the choice currently under consideration is globally constrained in some way. The asymptotic running time is optimal, since the overhead of generation is on the order of the length of the output. The worst-case delay unfortunately is quadratic; it could be reduced to linear (optimal) with better data structures.

from collections import Counter
from itertools import permutations
from operator import itemgetter
from random import randrange


def get_mode(count):
    return max(count.items(), key=itemgetter(1))[0]


def enum2(prefix, x, count, total, mode):
    prefix.append(x)
    count_x = count[x]
    if count_x == 1:
        del count[x]
    else:
        count[x] = count_x - 1
    yield from enum1(prefix, count, total - 1, mode)
    count[x] = count_x
    del prefix[-1]


def enum1(prefix, count, total, mode):
    if total == 0:
        yield tuple(prefix)
        return
    if count[mode] * 2 - 1 >= total and [mode] != prefix[-1:]:
        yield from enum2(prefix, mode, count, total, mode)
    else:
        defect_okay = not prefix or count[prefix[-1]] * 2 > total
        mode = get_mode(count)
        for x in list(count.keys()):
            if defect_okay or [x] != prefix[-1:]:
                yield from enum2(prefix, x, count, total, mode)


def enum(seq):
    count = Counter(seq)
    if count:
        yield from enum1([], count, sum(count.values()), get_mode(count))
    else:
        yield ()


def defects(lst):
    return sum(lst[i - 1] == lst[i] for i in range(1, len(lst)))


def test(lst):
    perms = set(permutations(lst))
    opt = min(map(defects, perms))
    slow = {perm for perm in perms if defects(perm) == opt}
    fast = set(enum(lst))
    print(lst, fast, slow)
    assert slow == fast


for r in range(10000):
    test([randrange(3) for i in range(randrange(6))])

Pseudocode:

  1. Sort the list
  2. Loop over the first half of the sorted list and fill all even indices of the result list
  3. Loop over the second half of the sorted list and fill all odd indices of the result list

You will only have p[i]==p[i+1] if more than half of the input consists of the same element, in which case there is no other choice than putting the same element in consecutive spots (by the pigeonhole principle).
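
The pigeonhole argument can be turned into a closed-form count. If the most frequent value occurs c times in a list of length n, the remaining n - c values can split its occurrences into at most n - c + 1 groups, which suggests a minimum of max(0, 2c - n - 1) unavoidable adjacent pairs. The formula and the helper names below are an illustration that is not taken from the answer; the loop merely cross-checks it against brute force on short lists.

```python
from collections import Counter
from itertools import permutations, product


def defects(seq):
    """Number of adjacent equal pairs in seq."""
    return sum(x == y for x, y in zip(seq, seq[1:]))


def min_defects_bound(lst):
    """Assumed closed form: max(0, 2c - n - 1), c = count of the most frequent value."""
    if not lst:
        return 0
    c = max(Counter(lst).values())
    return max(0, 2 * c - len(lst) - 1)


# Brute-force check on every list of length 1..5 over three distinct values.
for n in range(1, 6):
    for lst in product(range(3), repeat=n):
        best = min(defects(p) for p in set(permutations(lst)))
        assert best == min_defects_bound(lst)
```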


As pointed out in the comments, this approach may have one conflict too many in case one of the elements occurs at least n/2 times (or n/2+1 for odd n; this generalizes to (n+1)/2, with integer division, for both even and odd n). There are at most two such elements and if there are two, the algorithm works just fine. The only problematic case is when there is one element that occurs at least half of the time. We can simply solve this problem by finding the element and dealing with it first.

I don't know enough about Python to write this properly, so I took the liberty to copy the OP's implementation of a previous version from GitHub:

# Function wrapper added so that the final `return` is valid; the name is illustrative.
def vincent_van_der_weele(lst):
    # Sort the list
    a = sorted(lst)

    # Put the element occurring more than half of the times in front (if needed)
    n = len(a)
    m = (n + 1) // 2
    for i in range(n - m + 1):
        if a[i] == a[i + m - 1]:
            a = a[i:] + a[:i]
            break

    result = [None] * n

    # Loop over the first half of the sorted list and fill all even indices of the result list
    for i, elt in enumerate(a[:m]):
        result[2*i] = elt

    # Loop over the second half of the sorted list and fill all odd indices of the result list
    for i, elt in enumerate(a[m:]):
        result[2*i+1] = elt

    return result

The algorithm already given, taking the most common item left that isn't the previous item, is correct. Here's a simple implementation, which optimally uses a heap to track the most common.

import collections, heapq
def nonadjacent(keys):
    heap = [(-count, key) for key, count in collections.Counter(keys).items()]
    heapq.heapify(heap)
    count, key = 0, None
    while heap:
        count, key = heapq.heapreplace(heap, (count, key)) if count else heapq.heappop(heap)
        yield key
        count += 1
    for index in range(-count):
        yield key

>>> a = [1,2,3,3,2,2,1]
>>> list(nonadjacent(a))
[2, 1, 2, 3, 1, 2, 3]

You can generate all the 'perfectly unsorted' permutations (that have no two equal elements in adjacent positions) using a recursive backtracking algorithm. In fact, the only difference to generating all the permutations is that you keep track of the last number and exclude some solutions accordingly:

def unsort(lst, last=None):
    if lst:
        for i, e in enumerate(lst):
            if e != last:
                for perm in unsort(lst[:i] + lst[i+1:], e):
                    yield [e] + perm
    else:
        yield []

Note that in this form the function is not very efficient, as it creates lots of sub-lists. Also, we can speed it up by looking at the most-constrained numbers first (those with the highest count). Here's a much more efficient version using only the counts of the numbers.

import collections


def unsort_generator(lst, sort=False):
    counts = collections.Counter(lst)
    def unsort_inner(remaining, last=None):
        if remaining > 0:
            # most-constrained first, or sorted for pretty-printing?
            items = sorted(counts.items()) if sort else counts.most_common()
            for n, c in items:
                if n != last and c > 0:
                    counts[n] -= 1   # update counts
                    for perm in unsort_inner(remaining - 1, n):
                        yield [n] + perm
                    counts[n] += 1   # revert counts
        else:
            yield []
    return unsort_inner(len(lst))

You can use this to generate just the next perfect permutation, or a list holding all of them. But note that if there is no perfectly unsorted permutation, then this generator will consequently yield no results.

>>> lst = [1,2,3,3,2,2,1]
>>> next(unsort_generator(lst))
[2, 1, 2, 3, 1, 2, 3]
>>> list(unsort_generator(lst, sort=True))
[[1, 2, 1, 2, 3, 2, 3], 
 ... 36 more ...
 [3, 2, 3, 2, 1, 2, 1]]
>>> next(unsort_generator([1,1,1]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

To circumvent this problem, you could use this together with one of the algorithms proposed in the other answers as a fallback. This will guarantee to return a perfectly unsorted permutation, if there is one, or a good approximation otherwise.

def unsort_safe(lst):
    try:
        return next(unsort_generator(lst))
    except StopIteration:
        return unsort_fallback(lst)
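
The snippet above leaves unsort_fallback undefined. One possible way to wire it up, shown here purely as a sketch, is to plug in the greedy "most frequent remaining item, unless it was just used" idea from the other answers. The body of unsort_fallback below is an assumption rather than this answer's code, and unsort_generator is repeated in compact form only so that the example runs on its own.

```python
import collections
import heapq


def unsort_generator(lst):
    # Compact copy of the count-based backtracking generator.
    counts = collections.Counter(lst)

    def inner(remaining, last=None):
        if remaining == 0:
            yield []
            return
        for n, c in counts.most_common():
            if n != last and c > 0:
                counts[n] -= 1                      # take one n
                for perm in inner(remaining - 1, n):
                    yield [n] + perm
                counts[n] += 1                      # put it back

    return inner(len(lst))


def unsort_fallback(lst):
    # Hypothetical fallback: greedy choice driven by a max-heap of counts.
    heap = [(-c, k) for k, c in collections.Counter(lst).items()]
    heapq.heapify(heap)
    out = []
    while heap:
        c, k = heapq.heappop(heap)
        if out and out[-1] == k and heap:
            # The most frequent item was just used: take the runner-up instead.
            c2, k2 = heapq.heappop(heap)
            heapq.heappush(heap, (c, k))
            c, k = c2, k2
        out.append(k)
        if c + 1 != 0:
            heapq.heappush(heap, (c + 1, k))
    return out


def unsort_safe(lst):
    try:
        return next(unsort_generator(lst))
    except StopIteration:
        return unsort_fallback(lst)


print(unsort_safe([1, 2, 3, 3, 2, 2, 1]))  # [2, 1, 2, 3, 1, 2, 3]
print(unsort_safe([1, 1, 1, 2]))           # [1, 2, 1, 1]
```

The second call has no perfect permutation, so the generator is exhausted and the greedy fallback supplies an answer with a single unavoidable repeat.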

In Python you could do the following.

Consider you have a sorted list l, you can do:

length = len(l)
odd_ind = length%2
odd_half = (length - odd_ind)//2  # integer division, so the result can be passed to range()
for i in range(odd_half)[::2]:
    l[i], l[odd_half+odd_ind+i] = l[odd_half+odd_ind+i], l[i]

These are just in-place operations and should thus be rather fast (O(N)). Note that you will shift from l[i] == l[i+1] to l[i] == l[i+2], so the order you end up with is anything but random, but from how I understand the question it is not randomness you are looking for.

The idea is to split the sorted list in the middle, then exchange every other element in the two parts.

For l = [1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5] this leads to l = [3, 1, 4, 2, 5, 3, 1, 4, 1, 5, 2]

The method fails to get rid of all the l[i] == l[i + 1] as soon as the abundance of one element is bigger than or equal to half of the length of the list.

While the above works fine as long as the abundance of the most frequent element is smaller than half the size of the list, the following function also handles the limit cases (the famous off-by-one issue) where every other element starting with the first one must be the most abundant one:

def no_adjacent(my_list):
    my_list.sort()
    length = len(my_list)
    odd_ind = length%2
    odd_half = (length - odd_ind)//2
    for i in range(odd_half)[::2]:
        my_list[i], my_list[odd_half+odd_ind+i] = my_list[odd_half+odd_ind+i], my_list[i]

    #this is just for the limit case where the abundance of the most frequent is half of the list length
    if max([my_list.count(val) for val in set(my_list)]) + 1 - odd_ind > odd_half:
        max_val = my_list[0]
        max_count = my_list.count(max_val)
        for val in set(my_list):
            if my_list.count(val) > max_count:
                max_val = val
                max_count = my_list.count(max_val)
        while max_val in my_list:
            my_list.remove(max_val)
        out = [max_val]
        max_count -= 1
        for val in my_list:
            out.append(val)
            if max_count:
                out.append(max_val)
                max_count -= 1
        if max_count:
            print('this is not working')
            return my_list
            #raise Exception('not possible')
        return out
    else:
        return my_list

Here is a good algorithm:

  1. First of all, count for all numbers how often they occur. Place the answer in a map.

  2. Sort this map so that the numbers that occur most often come first.

  3. The first number of your answer is the first number in the sorted map.

  4. Re-sort the map with the first now being one smaller.

If you want to improve efficiency, look for ways to increase the efficiency of the sorting step.
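
A literal reading of these four steps might look like the sketch below. Two things here are assumptions rather than part of the answer: the function name, and the rule that the value just placed is skipped whenever another value is still available (borrowed from the greedy description elsewhere in this thread). Re-sorting on every step is deliberately naive.

```python
from collections import Counter


def spread_by_resorting(lst):
    # Step 1: count how often each value occurs.
    counts = Counter(lst)
    result = []
    while counts:
        # Steps 2 and 4: (re-)sort so the most frequent remaining value comes first.
        ranked = sorted(counts, key=lambda k: -counts[k])
        pick = ranked[0]
        # Assumption: avoid repeating the previous value if an alternative exists.
        if result and pick == result[-1] and len(ranked) > 1:
            pick = ranked[1]
        # Step 3: the chosen value becomes the next element of the answer.
        result.append(pick)
        counts[pick] -= 1
        if counts[pick] == 0:
            del counts[pick]
    return result


print(spread_by_resorting([1, 2, 3, 3, 2, 2, 1]))  # [2, 1, 2, 3, 1, 2, 3]
```

Replacing the repeated sort with a heap is the obvious refinement, which is what several of the other answers do.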

In answer to the bonus question: this is an algorithm which finds all permutations of a set where no adjacent elements can be identical. I believe this to be the most efficient algorithm conceptually (although others may be faster in practice because they translate into simpler code). It doesn't use brute force, it only generates unique permutations, and paths not leading to solutions are cut off at the earliest point.

I will use the term "abundant element" for an element in a set which occurs more often than all other elements combined, and the term "abundance" for the number of occurrences of the abundant element minus the number of other elements.
e.g. the set abac has no abundant element, the sets abaca and aabcaa have a as the abundant element, and abundance 1 and 2 respectively.
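
The two definitions can be written down directly. The sketch below follows the prose definition and is not a translation of the JavaScript further down; the return convention of (None, 0) for "no abundant element" is an assumption made for illustration.

```python
from collections import Counter


def abundance(items):
    """Return (abundant_element, abundance), or (None, 0) if there is none."""
    if not items:
        return None, 0
    elem, count = Counter(items).most_common(1)[0]
    others = len(items) - count
    if count > others:          # strictly more often than all others combined
        return elem, count - others
    return None, 0


print(abundance("abac"))    # (None, 0)
print(abundance("abaca"))   # ('a', 1)
print(abundance("aabcaa"))  # ('a', 2)
```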

  1. Start with a set like:

aaabbcd

  2. Separate the first occurrences from the repeats:

firsts: abcd
repeats: aab

  3. Find the abundant element in the repeats, if any, and calculate the abundance:

abundant element: a
abundance: 1

  4. Generate all permutations of the firsts where the number of elements after the abundant element is not less than the abundance (so in the example the "a" cannot be last):

abcd, abdc, acbd, acdb, adbc, adcb, bacd, badc, bcad, bdac,
cabd, cadb, cbad, cdab, dabc, dacb, dbac, dcab
(excluded because "a" is last: bcda, bdca, cbda, cdba, dbca, dcba)

  5. For each permutation, insert the set of repeated characters one by one, following these rules:

5.1. If the abundance of the set is greater than the number of elements after the last occurrence of the abundant element in the permutation so far, skip to the next permutation.
e.g. when permutation so far is abc, a set with abundant element a can only be inserted if the abundance is 2 or less, so aaaabc is ok, aaaaabc isn't.

5.2. Select the element from the set whose last occurrence in the permutation comes first.
e.g. when permutation so far is abcba and set is ab, select b

5.3. Insert the selected element at least 2 positions to the right of its last occurrence in the permutation.
e.g. when inserting b into permutation babca, results are babcba and babcab

5.4. Recurse step 5 with each resulting permutation and the rest of the set.
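
Step 2 of the list above (splitting the input into firsts and repeats) is simple enough to sketch in a few lines. The function name is hypothetical, and returning the repeats in sorted order is an assumption made so that equal elements sit together, as they do in the examples.

```python
def separate_repeats(items):
    """Split items into first occurrences (in order) and the remaining repeats."""
    firsts, repeats = [], []
    for x in items:
        if x in firsts:
            repeats.append(x)
        else:
            firsts.append(x)
    return firsts, sorted(repeats)


firsts, repeats = separate_repeats("aaabbcd")
print("".join(firsts), "".join(repeats))   # abcd aab
```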

EXAMPLE:
set = abcaba
firsts = abc
repeats = aab

perm3  set    select perm4  set    select perm5  set    select perm6

abc    aab    a      abac   ab     b      ababc  a      a      ababac  
                                                               ababca  
                                          abacb  a      a      abacab  
                                                               abacba  
                     abca   ab     b      abcba  a      -
                                          abcab  a      a      abcaba  
acb    aab    a      acab   ab     a      acaba  b      b      acabab  
                     acba   ab     b      acbab  a      a      acbaba  
bac    aab    b      babc   aa     a      babac  a      a      babaca  
                                          babca  a      -
                     bacb   aa     a      bacab  a      a      bacaba  
                                          bacba  a      -  
bca    aab    -
cab    aab    a      caba   ab     b      cabab  a      a      cababa  
cba    aab    -
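
The perm6 column of the table can be cross-checked with a brute-force filter. This is not the algorithm described above, only an independent check that is feasible because the example set has six elements; the function name is made up for the purpose.

```python
from itertools import permutations


def separated_bruteforce(items):
    """All distinct orderings of items with no two equal neighbours."""
    return sorted(
        "".join(p)
        for p in set(permutations(items))
        if all(x != y for x, y in zip(p, p[1:]))
    )


result = separated_bruteforce("abcaba")
print(len(result))  # 10
print(result)
```

The ten strings it prints are the same ten that appear in the last column of the table.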

This algorithm generates unique permutations. If you want to know the total number of permutations (where aba is counted twice because you can switch the a's), multiply the number of unique permutations with a factor:

F = N_1! * N_2! * ... * N_n!

where N_i is the number of occurrences of the i-th distinct element in the set. For a set abcdabcaba this would be 4! * 3! * 2! * 1! or 288, which demonstrates how inefficient an algorithm is that generates all permutations instead of only the unique ones. To list all permutations in this case, just list the unique permutations 288 times :-)
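
The factor is a product of factorials of the element counts, so it can be computed directly. A minimal sketch follows, with a function name chosen for illustration; math.prod requires Python 3.8 or later.

```python
from collections import Counter
from math import factorial, prod


def multiplicity_factor(items):
    """F = N_1! * N_2! * ... * N_n! over the counts of the distinct elements."""
    return prod(factorial(n) for n in Counter(items).values())


print(multiplicity_factor("abcdabcaba"))  # 288
print(multiplicity_factor("aba"))         # 2
```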

Below is a (rather clumsy) implementation in JavaScript; I suspect that a language like Python may be better suited for this sort of thing. Run the code snippet to calculate the separated permutations of "abracadabra".

// FIND ALL PERMUTATONS OF A SET WHERE NO ADJACENT ELEMENTS ARE IDENTICAL
function seperatedPermutations(set) {
    var unique = 0, factor = 1, firsts = [], repeats = [], abund;

    seperateRepeats(set);
    abund = abundance(repeats);
    permutateFirsts([], firsts);
    alert("Permutations of [" + set + "]\ntotal: " + (unique * factor) + ", unique: " + unique);

    // SEPERATE REPEATED CHARACTERS AND CALCULATE TOTAL/UNIQUE RATIO
    function seperateRepeats(set) {
        for (var i = 0; i < set.length; i++) {
            var first, elem = set[i];
            if (firsts.indexOf(elem) == -1) firsts.push(elem);
            else if ((first = repeats.indexOf(elem)) == -1) {
                repeats.push(elem);
                factor *= 2;
            } else {
                repeats.splice(first, 0, elem);
                factor *= repeats.lastIndexOf(elem) - first + 2;
            }
        }
    }

    // FIND ALL PERMUTATIONS OF THE FIRSTS USING RECURSION
    function permutateFirsts(perm, set) {
        if (set.length > 0) {
            for (var i = 0; i < set.length; i++) {
                var s = set.slice();
                var e = s.splice(i, 1);
                if (e[0] == abund.elem && s.length < abund.num) continue;
                permutateFirsts(perm.concat(e), s, abund);
            }
        } else if (repeats.length > 0) {
            insertRepeats(perm, repeats);
        } else {
            document.write(perm + "<BR>");
            ++unique;
        }
    }

    // INSERT REPEATS INTO THE PERMUTATIONS USING RECURSION
    function insertRepeats(perm, set) {
        var abund = abundance(set);
        if (perm.length - perm.lastIndexOf(abund.elem) > abund.num) {
            var sel = selectElement(perm, set);
            var s = set.slice();
            var elem = s.splice(sel, 1)[0];
            for (var i = perm.lastIndexOf(elem) + 2; i <= perm.length; i++) {
                var p = perm.slice();
                p.splice(i, 0, elem);
                if (set.length == 1) {
                    document.write(p + "<BR>");
                    ++unique;
                } else {
                    insertRepeats(p, s);
                }
            }
        }
    }

    // SELECT THE ELEMENT FROM THE SET WHOSE LAST OCCURANCE IN THE PERMUTATION COMES FIRST
    function selectElement(perm, set) {
        var sel, pos, min = perm.length;
        for (var i = 0; i < set.length; i++) {
            pos = perm.lastIndexOf(set[i]);
            if (pos < min) {
                min = pos;
                sel = i;
            }
        }
        return(sel);
    }

    // FIND ABUNDANT ELEMENT AND ABUNDANCE NUMBER
    function abundance(set) {
        if (set.length == 0) return ({elem: null, num: 0});
        var elem = set[0], max = 1, num = 1;
        for (var i = 1; i < set.length; i++) {
            if (set[i] != set[i - 1]) num = 1;
            else if (++num > max) {
                max = num;
                elem = set[i];
            }
        }
        return ({elem: elem, num: 2 * max - set.length});
    }
}

seperatedPermutations(["a","b","r","a","c","a","d","a","b","r","a"]);

The idea is to sort the elements from the most common to the least common, take the most common, decrease its count and put it back in the list keeping the descending order (but avoiding putting the last used element first to prevent repetitions when possible).

This can be implemented using Counter and bisect:

from collections import Counter
from bisect import bisect

def unsorted(lst):
    # use elements (-count, item) so bisect will put biggest counts first
    items = [(-count, item) for item, count in Counter(lst).most_common()]
    result = []

    while items:
        count, item = items.pop(0)
        result.append(item)
        if count != -1:
            element = (count + 1, item)
            index = bisect(items, element)
            # prevent insertion in position 0 if there are other items
            items.insert(index or (1 if items else 0), element)

    return result

Example

>>> print(unsorted([1, 1, 1, 2, 3, 3, 2, 2, 1]))
[1, 2, 1, 2, 1, 3, 1, 2, 3]

>>> print(unsorted([1, 2, 3, 2, 3, 2, 2]))
[2, 3, 2, 1, 2, 3, 2]

  1. Sort the list.
  2. Generate a "best shuffle" of the list using this algorithm

It will give the minimum of items from the list in their original place (by item value) so it will try, for your example, to put the 1's, 2's and 3's away from their sorted positions.

Start with the sorted list of length n. Let m=n/2. Take the values at 0, then m, then 1, then m+1, then 2, then m+2, and so on. Unless you have more than half of the numbers the same, you'll never get equivalent values in consecutive order.
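
The index pattern 0, m, 1, m+1, ... can be sketched as follows. Rounding m up for odd n is an assumption, since the description only says m=n/2, and the function name is made up for illustration.

```python
def interleave(lst):
    a = sorted(lst)
    m = (len(a) + 1) // 2          # assumption: round up when n is odd
    out = []
    for i in range(m):
        out.append(a[i])           # index i from the first half
        if m + i < len(a):
            out.append(a[m + i])   # index m + i from the second half
    return out


print(interleave([1, 2, 3, 3, 2, 2, 1]))  # [1, 2, 1, 3, 2, 3, 2]
print(interleave([1, 2, 2, 3]))           # [1, 2, 2, 3]
```

The second call shows a boundary case worth keeping in mind: when one value fills exactly half of an even-length list, this plain interleave can still leave two equal values next to each other, even though [2, 1, 2, 3] would avoid it. This is the same off-by-one situation that other answers in this thread handle explicitly.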

Please forgive my "me too" style answer, but couldn't Coady's answer be simplified to this?

from collections import Counter
from heapq import heapify, heappop, heapreplace
from itertools import repeat

def srgerg(data):
    heap = [(-freq+1, value) for value, freq in Counter(data).items()]
    heapify(heap)

    freq, val = 0, None  # val is pre-set so that an empty input does not raise
    while heap:
        freq, val = heapreplace(heap, (freq+1, val)) if freq else heappop(heap)
        yield val
    yield from repeat(val, -freq)

Edit: Here's a Python 2 version that returns a list:

def srgergpy2(data):
    heap = [(-freq+1, value) for value, freq in Counter(data).items()]
    heapify(heap)

    freq, val = 0, None  # val is pre-set so that an empty input does not raise
    result = list()
    while heap:
        freq, val = heapreplace(heap, (freq+1, val)) if freq else heappop(heap)
        result.append(val)
    result.extend(repeat(val, -freq))
    return result

  1. Count number of times each value appears
  2. Select values in order from most frequent to least frequent
  3. Add selected value to final output, incrementing the index by 2 each time
  4. Reset index to 1 if index out of bounds

from collections import defaultdict
from heapq import heapify, heappop


def distribute(values):
    counts = defaultdict(int)
    for value in values:
        counts[value] += 1
    # Negated counts turn heapq's min-heap into a max-heap on frequency
    counts = [(-count, key) for key, count in counts.items()]
    heapify(counts)
    index = 0
    length = len(values)
    distributed = [None] * length
    while counts:
        count, value = heappop(counts)
        for _ in range(-count):
            distributed[index] = value
            # Fill even slots first, then wrap around to the odd slots
            index = index + 2 if index + 2 < length else 1
    return distributed
