Sorting algorithms with array lists

Question

I'm currently working on sorting algorithms with array lists. I came across a project on GitHub that overwrites the first (smallest) element of a small array with a new (larger) element from a bigger array, and then sorts the small array.

This is the solution provided:

public int findLarger() throws IndexingError {
    int[] array = getArray();
    int k = getIndex();
    if (k <= 0 || k > array.length) {
        throw new IndexingError();
    }
    int[] smallArray = new int[k];
    for (int index = k; index < array.length; index++){
         if (array[index] > smallArray[0]){
             smallArray[0] = array[index];
             Arrays.sort(smallArray);
         }
    }
    return smallArray[0];
}

What I'm struggling to understand is whether the method I wrote below is more efficient, since it uses a single extra variable instead of a second array:

public int findLarger() throws IndexingError {
    int[] array = getArray();
    int k = getIndex(); // k was used below without being declared
    int max = array[0];
    for (int i = 0; i < k; i++) {
        if (max < array[i]) {
            max = array[i];
        }
    } 
    return max;
}


public abstract class Search {
    private int[] array;
    private int k;

    Search(int[] array, int k) {
        this.array = array;
        this.k = k;
    }

    public int[] getArray() {
        return array;
    }

    int getIndex() {
        return k;
    }

    abstract public int findElement() throws IndexingError;
}

Edit:

    if (array.length == 0) {
        throw new RuntimeException("Array can't be empty");
    }

    int max = array[0];
    for (int i = 0; i < array.length; i++) {
        if (max < array[i]) {
            max = array[i];
        }
    }
    return max;
} // end of obvious solution method

Answer 1

The first implementation is really bad. Sorting costs O(n log n), and here it is repeated up to array.length - k times, when finding the minimum or maximum of a set needs only a single O(n) pass.

So yes, a version with a single variable that stores the current maximum is the correct way to go. (Just note that initializing max with array[0] fails on empty input.)

On the other hand, as others have commented, the two algorithms do not read the same cells, so they are currently not comparable. If your second implementation iterated from k to array.length like the first one, it would be a much better implementation than the first.
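A minimal sketch of that single-variable approach, scanning the same cells as the first method (indices k to array.length - 1) and rejecting an empty range. The class name RangeMax, the static signature, and the use of IllegalArgumentException instead of the question's IndexingError are assumptions made for illustration.

```java
final class RangeMax {
    /** Returns the largest value in array[from .. array.length - 1]. */
    static int maxFrom(int[] array, int from) {
        if (from < 0 || from >= array.length) {
            // Also covers an empty array: no valid start index exists.
            throw new IllegalArgumentException("range is empty");
        }
        int max = array[from];
        for (int i = from + 1; i < array.length; i++) {
            if (array[i] > max) {
                max = array[i];
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // Scans 7, 6, 5, 4 and prints 7.
        System.out.println(maxFrom(new int[] { 1, 2, 3, 7, 6, 5, 4 }, 3));
    }
}
```

This is one pass and O(1) extra space, with no sorting involved.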

Answer 2

This is a difficult question to answer for two reasons:

Firstly, method #1 and method #2 do not do the same thing, so comparing their efficiency doesn't really make sense.
Secondly, what method #1 actually does is difficult to pin down exactly, and it's not clear that what it does is what it should do. That is, method #1 is not just a solution to a different problem; I suspect it is an incorrect solution to a different problem.

Let me explain. Method #2 is quite straightforward: it finds the maximum element of the subarray array[0..k]. Method #1 clearly does not do this: it only reads data from the subarray array[k..n]. (Here array[a..b] covers indices a up to but not including b, and n is array.length.)

It also clearly isn't finding the maximum of that subarray, because it puts data into smallArray, sorts it, and returns the value at index 0; the maximum would be at index k - 1. But the value at index 0 is not the minimum either, since a value is only put into smallArray if it is bigger than what is already there.

The actual behaviour of method #1 can be investigated using examples. For convenience, I've changed the signature to take array and k as parameters:

findLarger(new int[] { 1, 2, 3, 4, 5, 6, 7 }, 3) is 5: the third-largest of 4, 5, 6, 7.
findLarger(new int[] { 1, 2, 3, 7, 6, 5, 4 }, 3) is also 5: the third-largest of 7, 6, 5, 4.
findLarger(new int[] { 7, 6, 5, 4, 3, 2, 1 }, 3) is 2: the third-largest of 4, 3, 2, 1.
findLarger(new int[] { 1, 2, 3, 7, 6, 5, 4 }, 1) is 7: the first-largest of 2, 3, 7, 6, 5, 4.

For these examples, it consistently returns the kth largest element of the subarray array[k..n]. However, in other cases, it doesn't:

findLarger(new int[] { -1, -2, -3, -4, -5, -6 }, 2) is 0, not one of -3, -4, -5, -6.
findLarger(new int[] { 1, 2, 3, 4, 5, 6, 7 }, 5) is 0, not one of 6, 7.

So the full statement of what method #1 does is: it returns the kth largest positive element of the subarray array[k..n], or 0 if this subarray contains fewer than k positive numbers. The special case of returning 0, and the use of k for two unrelated purposes, suggest that this method was supposed to solve the more straightforward problem of returning the kth largest element, but that it was written incorrectly.
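The examples above can be reproduced with a small harness. This is a sketch of method #1 with array and k passed as parameters; the class name MethodOneDemo and the use of IllegalArgumentException in place of the question's IndexingError are assumptions made so that the code is self-contained.

```java
import java.util.Arrays;

final class MethodOneDemo {
    // Method #1 from the question, unchanged apart from its signature
    // and the exception type.
    static int findLarger(int[] array, int k) {
        if (k <= 0 || k > array.length) {
            throw new IllegalArgumentException("k out of range");
        }
        int[] smallArray = new int[k]; // starts out as k zeroes
        for (int index = k; index < array.length; index++) {
            if (array[index] > smallArray[0]) {
                smallArray[0] = array[index];
                Arrays.sort(smallArray);
            }
        }
        return smallArray[0];
    }

    public static void main(String[] args) {
        System.out.println(findLarger(new int[] { 1, 2, 3, 4, 5, 6, 7 }, 3));    // 5
        System.out.println(findLarger(new int[] { 7, 6, 5, 4, 3, 2, 1 }, 3));    // 2
        System.out.println(findLarger(new int[] { -1, -2, -3, -4, -5, -6 }, 2)); // 0
        System.out.println(findLarger(new int[] { 1, 2, 3, 4, 5, 6, 7 }, 5));    // 0
    }
}
```

The two 0 results come from the zeroes that new int[k] starts with: they are never displaced when the scanned subarray has fewer than k positive values.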

Further evidence for this is that a very simple change to the algorithm makes it unconditionally return the kth largest element: instead of initialising smallArray with zeroes, copy the first k elements from array and sort them:

    // changed: copy first k elements from array, and sort
    int[] smallArray = Arrays.copyOfRange(array, 0, k);
    Arrays.sort(smallArray);

    for (int index = k; index < array.length; index++){
         if (array[index] > smallArray[0]){
             smallArray[0] = array[index];
             Arrays.sort(smallArray);
         }
    }
    return smallArray[0];
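Wrapped into a self-contained method, the changed version can be checked against the sort-based definition of the kth largest element. This is only a sketch: the class name FixedMethodOne, the static signatures, and the use of IllegalArgumentException in place of the question's IndexingError are assumptions for illustration.

```java
import java.util.Arrays;

final class FixedMethodOne {
    // Method #1 with smallArray seeded from the first k elements.
    static int kthLargest(int[] array, int k) {
        if (k <= 0 || k > array.length) {
            throw new IllegalArgumentException("k out of range");
        }
        int[] smallArray = Arrays.copyOfRange(array, 0, k);
        Arrays.sort(smallArray);
        for (int index = k; index < array.length; index++) {
            if (array[index] > smallArray[0]) {
                smallArray[0] = array[index];
                Arrays.sort(smallArray);
            }
        }
        // smallArray holds the k largest values, so index 0 is the k-th largest.
        return smallArray[0];
    }

    // Reference result: sort a copy and read index n - k.
    static int kthLargestBySorting(int[] array, int k) {
        int[] sorted = array.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length - k];
    }
}
```

For the two inputs on which the original returned 0, this version returns -2 for { -1, -2, -3, -4, -5, -6 } with k = 2 and 3 for { 1, 2, 3, 4, 5, 6, 7 } with k = 5, matching the sort-based reference.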

Even more evidence is the similarity with the code in another Stack Overflow question, which is meant to find the kth largest element, and which does the copyOfRange and sort instead of just new int[k].

So now we can talk about the efficiency of alternatives to the fixed version of method #1.

The time complexity of method #1 is O(nk log k).
Method #1 can be improved to O(nk) by replacing Arrays.sort in the inner loop with a step that shifts the first element to its correct position in O(k) time; this works because only the first element can be out of order, so a full sort is unnecessary.
The obvious way to find the kth largest element is to sort the array and return the value at index n - k. This takes O(n log n) time; method #1 is only better when k log k < log n, i.e. when k is very small compared to n.
You can do better: the quickselect algorithm takes just O(n) time on average, which is clearly optimal for this problem. However, it has a rare worst-case complexity of O(n²).
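Two of these alternatives can be sketched as follows: the O(nk) variant that shifts the replaced element into place, and a quickselect with a random pivot. The class name KthLargest, the method names, the Lomuto partition scheme, and the choice to work on a copy of the input are all assumptions made for illustration, not code from the question.

```java
import java.util.Arrays;
import java.util.Random;

final class KthLargest {
    private static void checkK(int[] array, int k) {
        if (k <= 0 || k > array.length) {
            throw new IllegalArgumentException("k out of range");
        }
    }

    /** O(nk): keeps the k largest values seen so far in ascending order. */
    static int withShift(int[] array, int k) {
        checkK(array, k);
        int[] smallArray = Arrays.copyOfRange(array, 0, k);
        Arrays.sort(smallArray);
        for (int index = k; index < array.length; index++) {
            int value = array[index];
            if (value > smallArray[0]) {
                // Drop the old smallest value and slide smaller neighbours
                // left until the slot for the new value is found: O(k).
                int pos = 0;
                while (pos + 1 < k && smallArray[pos + 1] < value) {
                    smallArray[pos] = smallArray[pos + 1];
                    pos++;
                }
                smallArray[pos] = value;
            }
        }
        return smallArray[0];
    }

    /** Average O(n): quickselect with a random pivot on a copy of the input. */
    static int quickselect(int[] array, int k) {
        checkK(array, k);
        int[] a = array.clone();
        int target = a.length - k; // ascending index of the k-th largest
        int lo = 0;
        int hi = a.length - 1;
        Random random = new Random();
        while (lo < hi) {
            int p = partition(a, lo, hi, lo + random.nextInt(hi - lo + 1));
            if (p == target) {
                return a[p];
            } else if (p < target) {
                lo = p + 1;
            } else {
                hi = p - 1;
            }
        }
        return a[lo];
    }

    // Lomuto partition: returns the final index of the chosen pivot.
    private static int partition(int[] a, int lo, int hi, int pivotIndex) {
        int pivot = a[pivotIndex];
        swap(a, pivotIndex, hi);
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

Both methods should agree with the sort-and-index approach for every valid k; for example, KthLargest.withShift(new int[] { 1, 2, 3, 7, 6, 5, 4 }, 3) and the corresponding quickselect call both give 5.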
