
Python: Quicksort with median of three

I'm trying to change this quicksort code so that it picks its pivot with a "median of three" instead.

def quickSort(L, ascending = True): 
    quicksorthelp(L, 0, len(L), ascending)


def quicksorthelp(L, low, high, ascending = True): 
    result = 0
    if low < high: 
        pivot_location, result = Partition(L, low, high, ascending)  
        result += quicksorthelp(L, low, pivot_location, ascending)  
        result += quicksorthelp(L, pivot_location + 1, high, ascending)
    return result


def Partition(L, low, high, ascending = True):
    print('Quicksort, Parameter L:')
    print(L)
    result = 0 
    pivot, pidx = median_of_three(L, low, high)
    L[low], L[pidx] = L[pidx], L[low]
    i = low + 1
    for j in range(low+1, high, 1):
        result += 1
        if (ascending and L[j] < pivot) or (not ascending and L[j] > pivot):
            L[i], L[j] = L[j], L[i]  
            i += 1
    L[low], L[i-1] = L[i-1], L[low] 
    return i - 1, result

liste1 = list([3.14159, 1./127, 2.718, 1.618, -23., 3.14159])

quickSort(liste1, False)  # descending order
print('sorted:')
print(liste1)

But I'm not really sure how to do that. The median has to be the median of the first, middle and last element of a list. If the list has an even number of elements, the middle element is the last element of the first half.

Here's my median function:

def median_of_three(L, low, high):
    mid = (low+high-1)//2
    a = L[low]
    b = L[mid]
    c = L[high-1]
    if a <= b <= c:
        return b, mid
    if c <= b <= a:
        return b, mid
    if a <= c <= b:
        return c, high-1
    if b <= c <= a:
        return c, high-1
    return a, low

Let us first implement the median of three for three numbers as an independent function. We can do that by sorting the list of three elements and returning the second element, like:

def median_of_three(a, b, c):
    return sorted([a, b, c])[1]

Now for a range low .. high (with low included and high excluded), we should determine which three elements go into the median of three:

  1. the first element: L[low],
  2. the last element: L[high-1], and
  3. the middle element (in case there are two such, take the first): L[(low+high-1)//2].

So now we only need to patch the partitioning function to:
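To make the index arithmetic concrete, here is a small sketch that computes the three positions for a half-open range. The helper name three_sample_indices is made up for this illustration and is not part of the original code.

```python
def three_sample_indices(low, high):
    """Return (first, middle, last) indices for the half-open range [low, high).

    Hypothetical helper, only meant to illustrate the index arithmetic.
    """
    if high - low < 1:
        raise ValueError("range must contain at least one element")
    return low, (low + high - 1) // 2, high - 1


# Odd length (indices 0..4): the middle index is 2.
print(three_sample_indices(0, 5))  # (0, 2, 4)
# Even length (indices 0..5): the middle is the last index of the first half.
print(three_sample_indices(0, 6))  # (0, 2, 5)
# A sub-range inside a larger list (indices 3..6).
print(three_sample_indices(3, 7))  # (3, 4, 6)
```

For a one-element range all three indices coincide, so the median is simply that element.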

def Partition(L, low, high, ascending = True):
    print('Quicksort, Parameter L:')
    print(L)
    result = 0 
    pivot = median_of_three(L[low], L[(low+high-1)//2], L[high-1])
    i = low + 1  
    for j in range(low + 1, high, 1): 
        result += 1
        if (ascending and L[j] < pivot) or (not ascending and L[j] > pivot):
            L[i], L[j] = L[j], L[i]  
            i += 1  
    L[low], L[i-1] = L[i-1], L[low] 
    return i - 1, result

EDIT: determining the median of three elements.

The median of three elements is the element whose value lies between the two other values. So in case a <= b <= c, then b is the median.

So we need to determine in what order the elements are, so that we can pick the one in the middle. Like:

def median_of_three(a, b, c):
    if a <= b and b <= c:
        return b
    if c <= b and b <= a:
        return b
    if a <= c and c <= b:
        return c
    if b <= c and c <= a:
        return c
    return a

So now we have defined the median of three with four if cases.

EDIT2: There is still a problem with this. After partitioning, your original code swaps the element L[i-1] with L[low] (the location of the pivot). But this of course does not work anymore, since the pivot can now be located at any of the three positions. Therefore we need to make median_of_three(..) smarter: it should return not only the pivot element, but the location of that pivot as well:
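As a quick sanity check, the comparison-based version can be tested against the sorting-based one on every triple drawn from a small set of values, ties included. This is only a sketch; the names median_by_comparison and median_by_sorting are chosen here so both variants can live side by side.

```python
from itertools import product


def median_by_comparison(a, b, c):
    # b is the median when it lies between a and c (in either order).
    if a <= b <= c or c <= b <= a:
        return b
    # c is the median when it lies between a and b (in either order).
    if a <= c <= b or b <= c <= a:
        return c
    # Otherwise a must lie between the other two.
    return a


def median_by_sorting(a, b, c):
    return sorted([a, b, c])[1]


# All 27 triples over {0, 1, 2}, which covers every ordering and every tie pattern.
mismatches = [
    t for t in product(range(3), repeat=3)
    if median_by_comparison(*t) != median_by_sorting(*t)
]
print(mismatches)  # []
```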

def median_of_three(L, low, high):
    mid = (low+high-1)//2
    a = L[low]
    b = L[mid]
    c = L[high-1]
    if a <= b <= c:
        return b, mid
    if c <= b <= a:
        return b, mid
    if a <= c <= b:
        return c, high-1
    if b <= c <= a:
        return c, high-1
    return a, low
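If the chain of comparisons feels long, a more compact variant is to sort (value, index) pairs and take the middle one. This is a sketch of an alternative, not the function above; the name median_of_three_sorted is made up here. When two of the three values are equal it may report a different index than the if-chain does, but the value at that index is still a valid median.

```python
def median_of_three_sorted(L, low, high):
    """Return (pivot_value, pivot_index) for the half-open range [low, high)."""
    mid = (low + high - 1) // 2
    # Pair each candidate value with its position so the index survives sorting.
    candidates = [(L[low], low), (L[mid], mid), (L[high - 1], high - 1)]
    return sorted(candidates)[1]


data = [9, 1, 5, 7, 3]
pivot, pidx = median_of_three_sorted(data, 0, len(data))
print(pivot, pidx)  # 5 2
```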

Now we can solve this problem with:

def Partition(L, low, high, ascending = True):
    print('Quicksort, Parameter L:')
    print(L)
    result = 0 
    pivot, pidx = median_of_three(L, low, high)
    i = low + (low == pidx)
    for j in range(low, high, 1):
        if j == pidx: continue
        result += 1
        if (ascending and L[j] < pivot) or (not ascending and L[j] > pivot):
            L[i], L[j] = L[j], L[i]  
            i += 1 + (i+1 == pidx)
    L[pidx], L[i-1] = L[i-1], L[pidx] 
    return i - 1, result

EDIT3: cleaning it up.

Although the above seems to work, it is quite complicated: we need to let i and j "skip" the location of the pivot.

It is probably simpler if we first move the pivot to the front of the sublist (so to the low index):

def Partition(L, low, high, ascending = True):
    print('Quicksort, Parameter L:')
    print(L)
    result = 0 
    pivot, pidx = median_of_three(L, low, high)
    L[low], L[pidx] = L[pidx], L[low]
    i = low + 1
    for j in range(low+1, high, 1):
        result += 1
        if (ascending and L[j] < pivot) or (not ascending and L[j] > pivot):
            L[i], L[j] = L[j], L[i]  
            i += 1
    L[low], L[i-1] = L[i-1], L[low] 
    return i - 1, result
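Putting the pieces together, the whole sort could look roughly like the sketch below. It follows the same partitioning scheme but leaves out the debug prints and uses lowercase names (quicksort, partition), which are choices made for this example only. The return value is the number of element comparisons made while partitioning, as in the original code.

```python
def median_of_three(L, low, high):
    """Return (value, index) of the median of first, middle and last element."""
    mid = (low + high - 1) // 2
    a, b, c = L[low], L[mid], L[high - 1]
    if a <= b <= c or c <= b <= a:
        return b, mid
    if a <= c <= b or b <= c <= a:
        return c, high - 1
    return a, low


def partition(L, low, high, ascending=True):
    pivot, pidx = median_of_three(L, low, high)
    # Move the pivot to the front so the usual scan can start at low + 1.
    L[low], L[pidx] = L[pidx], L[low]
    i = low + 1
    comparisons = 0
    for j in range(low + 1, high):
        comparisons += 1
        if (ascending and L[j] < pivot) or (not ascending and L[j] > pivot):
            L[i], L[j] = L[j], L[i]
            i += 1
    # Put the pivot between the two partitions.
    L[low], L[i - 1] = L[i - 1], L[low]
    return i - 1, comparisons


def _quicksort_range(L, low, high, ascending):
    comparisons = 0
    if low < high:
        p, comparisons = partition(L, low, high, ascending)
        comparisons += _quicksort_range(L, low, p, ascending)
        comparisons += _quicksort_range(L, p + 1, high, ascending)
    return comparisons


def quicksort(L, ascending=True):
    """Sort L in place and return the number of comparisons made."""
    return _quicksort_range(L, 0, len(L), ascending)


data = [3.14159, 1.0 / 127, 2.718, 1.618, -23.0, 3.14159]
expected = sorted(data, reverse=True)
quicksort(data, ascending=False)  # descending order
assert data == expected
print(data)
```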

In a "median of three" version of quicksort, you do not only want to find the median to use it as the pivot, you also want to place the maximum and the minimum values in their places so some of the pivoting is already done. In other words, you want to sort those three items in those three places. (Some variations do not want them sorted in the usual way, but I'll stick to a simpler-to-understand version for you here.)

You probably don't want to do this in a function, since function calls are fairly expensive in Python and this particular capability is not broadly useful. So you can write code like this. Let's say the three values you want to sort are at indices i, j, and k, with i < j < k. In practice you probably would use low, low + 1, and high, but you can make those changes as you like.

if L[i] > L[j]:
    L[i], L[j] = L[j], L[i]
if L[i] > L[k]:
    L[i], L[k] = L[k], L[i]
if L[j] > L[k]:
    L[j], L[k] = L[k], L[j]

There are some optimizations that can be done. For example, you probably will want to use the median value in the pivot process, so you can change the code to store the final value of L[j] in a simple variable, which reduces list lookups. Note that in general you cannot do this in fewer than three comparisons, though in some special cases two are enough.
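Here is a runnable sketch of that three-element sort, keeping the median in a local variable. It is wrapped in a function only so it can be tried out on its own; the name sort_three_in_place is made up for this example, and inside a real partition routine the three if statements could simply be inlined.

```python
def sort_three_in_place(L, i, j, k):
    """Sort the values at positions i < j < k of L in place; return the median."""
    if L[i] > L[j]:
        L[i], L[j] = L[j], L[i]
    # After the next swap L[i] holds the smallest of the three values.
    if L[i] > L[k]:
        L[i], L[k] = L[k], L[i]
    # The last comparison orders the remaining two.
    if L[j] > L[k]:
        L[j], L[k] = L[k], L[j]
    median = L[j]  # keep the pivot in a local to avoid repeated lookups
    return median


data = [7, 0, 3, 0, 1]
pivot = sort_three_in_place(data, 0, 2, 4)
print(pivot)  # 3
print(data)   # [1, 0, 3, 0, 7]
```

Only positions 0, 2 and 4 are touched; the elements at the other positions stay where they were.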

One possible way is to pick the three candidate positions at random between the left and right indices.

import random

def median_of_three(left, right):
    """
    Function to choose pivot point
    :param left: Left index of sub-list
    :param right: Right index of sub-list (inclusive)
    :return: the median of three randomly drawn indices
    """

    # Pick 3 random indices within the range of the sub-list
    i1 = left + random.randint(0, right - left)
    i2 = left + random.randint(0, right - left)
    i3 = left + random.randint(0, right - left)

    # Return the median of the three indices (not of the values stored there)
    return max(min(i1, i2), min(max(i1, i2), i3))
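If the goal is a pivot whose value is the median of three randomly sampled elements, the function also needs access to the list. The following is a sketch under that assumption; the name random_median_of_three_index and the extra L parameter are not in the snippet above, and right is treated as inclusive, as it is there.

```python
import random


def random_median_of_three_index(L, left, right):
    """Return an index in [left, right] whose value is the median of three
    randomly sampled elements of L (right is inclusive)."""
    # Draw three positions; duplicates are possible and harmless.
    picks = [random.randint(left, right) for _ in range(3)]
    # Order the positions by the values they point at and take the middle one.
    picks.sort(key=lambda idx: L[idx])
    return picks[1]


data = [5, 1, 4, 2, 3]
pidx = random_median_of_three_index(data, 0, len(data) - 1)
print(pidx, data[pidx])
```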
