Optimizing QuickSort partition for duplicates in Java

I'm trying to optimize my partition algorithm so that it sorts arrays full of duplicate elements faster, since it goes into a sort of infinite loop if the array is ALL duplicates. The only thing I can think of is doing firstunknown++ every time any adjacent elements are duplicates, but I have no idea where or how to implement that in my code.

Any help would be appreciated, thank you.

 

One solution might be to remove the duplicates, then do the quick sort, and then add them back. Both the removal and the re-insertion of duplicates can be done in linear time.

Removing duplicates

This method counts how often each number occurs and returns a map of <number in array, count>. The keys of the map are the array with its duplicates removed, so you can pass counts.keySet().toArray(new Integer[0]) to your quicksort:

import java.util.LinkedHashMap;
import java.util.Map;

public static Map<Integer, Integer> removeDuplicates(Integer[] arr) {

    // LinkedHashMap keeps keys in first-seen order, so the key set
    // doubles as the de-duplicated array.
    Map<Integer, Integer> counts = new LinkedHashMap<>();

    for (Integer n : arr) {
        // Start the count at 1, or add 1 to the existing count.
        counts.merge(n, 1, Integer::sum);
    }

    return counts;
}

Adding duplicates back into the sorted array

Once you have done your quicksort, you can add the duplicates back in (again, in linear time):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public static Integer[] expand(Integer[] arr, Map<Integer, Integer> counts) {

    List<Integer> expanded = new ArrayList<>();

    for (Integer n : arr) {
        // Repeat each distinct value as many times as it originally occurred.
        int count = counts.get(n);
        for (int i = 0; i < count; i++) {
            expanded.add(n);
        }
    }

    return expanded.toArray(new Integer[0]);
}

This will save you the complex work of handling duplicates within the sort.
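As a rough end-to-end sketch of this approach, the class below counts the values, sorts only the distinct ones, and then expands them back. The class name DedupSortDemo and the methods countValues and sortWithDuplicates are made up for illustration, and Arrays.sort merely stands in for your own quicksort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; the names are illustrative only.
class DedupSortDemo {

    // Count occurrences; the key set holds each distinct value exactly once.
    static Map<Integer, Integer> countValues(Integer[] arr) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (Integer n : arr) {
            counts.merge(n, 1, Integer::sum);
        }
        return counts;
    }

    // Repeat each sorted distinct value as many times as it originally occurred.
    static Integer[] expand(Integer[] sortedDistinct, Map<Integer, Integer> counts) {
        List<Integer> expanded = new ArrayList<>();
        for (Integer n : sortedDistinct) {
            int count = counts.get(n);
            for (int i = 0; i < count; i++) {
                expanded.add(n);
            }
        }
        return expanded.toArray(new Integer[0]);
    }

    static Integer[] sortWithDuplicates(Integer[] arr) {
        Map<Integer, Integer> counts = countValues(arr);
        Integer[] distinct = counts.keySet().toArray(new Integer[0]);
        Arrays.sort(distinct); // assumption: stand-in for your own quicksort
        return expand(distinct, counts);
    }

    public static void main(String[] args) {
        Integer[] input = {3, 1, 3, 3, 2, 1};
        System.out.println(Arrays.toString(sortWithDuplicates(input)));
        // prints [1, 1, 2, 3, 3, 3]
    }
}
```

Note that this trades O(n) extra memory for the simpler sort, and it only works when equal values are interchangeable, as they are for Integer.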

Have you considered using a 3-way partition quicksort? It is faster when there are many duplicates, because all elements equal to the pivot end up in a middle block that neither recursive call touches again. https://medium.com/@nehasangeetajha/3-way-quick-sort-18d2dcc5b06b

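private static void quickSort(int[] arr, int l, int r) {
    if (l >= r) {
        return;
    }
    int value = arr[l]; // pivot
    int lt = l;         // arr[l..lt-1] < pivot
    int gt = r;         // arr[gt+1..r] > pivot
    int i = l + 1;      // arr[lt..i-1] == pivot, arr[i..gt] not yet examined
    while (i <= gt) {
        if (arr[i] < value) {
            swap(arr, i++, lt++);
        }
        else if (arr[i] > value) {
            swap(arr, i, gt--);
        }
        else {
            i++;
        }
    }
    // Elements equal to the pivot (arr[lt..gt]) are already in place.
    quickSort(arr, l, lt - 1);
    quickSort(arr, gt + 1, r);
}

// Helper used above: exchanges arr[a] and arr[b].
private static void swap(int[] arr, int a, int b) {
    int tmp = arr[a];
    arr[a] = arr[b];
    arr[b] = tmp;
}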
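
To see why this helps with the all-duplicates case from the question, here is a small self-contained sketch that uses the same partitioning scheme and counts how many calls actually partition a range. The class name ThreeWayQuickSortDemo, the partitionCalls counter, and the sort wrapper are assumptions added for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class ThreeWayQuickSortDemo {

    // Number of calls that got past the base case and partitioned a range.
    static int partitionCalls = 0;

    static void sort(int[] arr) {
        partitionCalls = 0;
        quickSort(arr, 0, arr.length - 1);
    }

    private static void quickSort(int[] arr, int l, int r) {
        if (l >= r) {
            return;
        }
        partitionCalls++;
        int value = arr[l];
        int lt = l;
        int gt = r;
        int i = l + 1;
        while (i <= gt) {
            if (arr[i] < value) {
                swap(arr, i++, lt++);
            } else if (arr[i] > value) {
                swap(arr, i, gt--);
            } else {
                i++;
            }
        }
        quickSort(arr, l, lt - 1);
        quickSort(arr, gt + 1, r);
    }

    private static void swap(int[] arr, int a, int b) {
        int tmp = arr[a];
        arr[a] = arr[b];
        arr[b] = tmp;
    }

    public static void main(String[] args) {
        int[] allSame = {4, 4, 4, 4, 4, 4};
        sort(allSame);
        System.out.println(Arrays.toString(allSame) + " partitions=" + partitionCalls);
        // prints [4, 4, 4, 4, 4, 4] partitions=1

        int[] mixed = {3, 1, 3, 2, 3, 1};
        sort(mixed);
        System.out.println(Arrays.toString(mixed));
        // prints [1, 1, 2, 3, 3, 3]
    }
}
```

With an array of identical values, the single pass leaves lt at l and gt at r, so both recursive calls receive empty ranges and return immediately.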


 