
Why does Java sort outperform Counting sort for primitives

I am comparing the running time of Counting sort with Java's native Arrays.sort. From what I've read, Counting sort offers a best, average and worst case running time of O(n + k). Java's Arrays.sort for primitives, which uses a dual-pivot Quicksort, is a comparison-based algorithm and thus must offer O(n log n) in the average case and O(n^2) in the worst case.

When comparing the two by measuring the time (in nanoseconds) taken to sort a series of arrays ranging in size from 500 to 100k, I noted a sharp increase in running time for the Counting sort when the size reached ~70k.
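For reference, a measurement like the one described could be sketched as below. This is only an illustration under stated assumptions, not the harness used in the question: the class name `SortTimingSketch`, the helpers `randomArray` and `medianNanos`, the fixed seed, and the warm-up and repetition counts are all made up for the example. It keeps printing out of the timed region and takes the median of several runs after a warm-up, since the JIT compiler can distort the first measurements.

```java
import java.util.Arrays;
import java.util.Random;

class SortTimingSketch {
    // Counting sort that returns a new sorted array; values must lie in [0, k].
    static int[] countSort(int[] arr, int k) {
        int[] count = new int[k + 1];
        for (int v : arr) {
            count[v]++;
        }
        // Prefix sums: count[y] becomes the number of elements <= y.
        for (int y = 1; y <= k; y++) {
            count[y] += count[y - 1];
        }
        int[] result = new int[arr.length];
        for (int i = arr.length - 1; i >= 0; i--) {
            result[--count[arr[i]]] = arr[i];
        }
        return result;
    }

    // Hypothetical helper: n random values in [0, k], seeded for repeatability.
    static int[] randomArray(int n, int k, long seed) {
        Random rnd = new Random(seed);
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = rnd.nextInt(k + 1);
        }
        return a;
    }

    // Hypothetical helper: median wall-clock time of `runs` executions, in nanoseconds.
    static long medianNanos(Runnable task, int runs) {
        long[] times = new long[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            task.run();
            times[r] = System.nanoTime() - start;
        }
        Arrays.sort(times);
        return times[runs / 2];
    }

    public static void main(String[] args) {
        int k = 99;
        for (int n : new int[] {500, 10_000, 70_000, 100_000}) {
            int[] data = randomArray(n, k, 42L);
            // Warm-up so both code paths are likely compiled before timing starts.
            for (int w = 0; w < 20; w++) {
                countSort(data, k);
                Arrays.sort(data.clone());
            }
            long counting = medianNanos(() -> countSort(data, k), 15);
            // Arrays.sort works in place, so each run sorts a fresh copy;
            // the cost of clone() is included in this figure.
            long builtin = medianNanos(() -> Arrays.sort(data.clone()), 15);
            System.out.printf("n=%d counting=%d ns Arrays.sort=%d ns%n", n, counting, builtin);
        }
    }
}
```

Hand-rolled timing loops like this remain rough; the numbers depend heavily on the JVM and the machine, so they are best read as relative trends rather than absolute costs.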

My understanding is that Counting sort is efficient as long as the range of the input data is not significantly greater than the number of elements to be sorted. The arrays are built from random numbers between 0 and 99, so k will always be much smaller than n.

Would there be any particular reason the Counting sort would degenerate so abruptly as n increases?

[Figure: running time in nanoseconds (y) vs. array size (x)]

My counting sort implementation:

public static int[] countSort(int[] arr, int k) {
    /*
     * This will only work on non-negative integers 0 to k.
     * For negative worst case testing we use the negative count sort below.
     *
     * Step 1: Use an array to store the frequency of each element. For array
     * elements 0 to k, initialize an array of size k + 1.
     * Step 2: Add up the elements of the count array so each entry stores the
     * sum of the counts before it.
     * Step 3: The modified count array now stores the position of each element
     * in the sorted array.
     * Step 4: Iterate over the array, place each element in its correct position
     * based on the modified count array, and reduce the count by 1.
     */

    int[] result = new int[arr.length];
    int[] count = new int[k + 1];

    // Store the count of each element: count[y] holds the number of
    // occurrences of y in the array 'arr'.
    for (int x = 0; x < arr.length; x++) {
        count[arr[x]]++;
    }

    // Change count[y] so that it contains the position just past the last
    // occurrence of y in the result array.
    for (int y = 1; y <= k; y++) {
        count[y] += count[y - 1];
    }

    // Walk the input backwards so that equal elements keep their order.
    for (int i = arr.length - 1; i >= 0; i--) {
        result[count[arr[i]] - 1] = arr[i];
        count[arr[i]]--;
    }

    System.out.println("COUNTSORT SORTED ARRAY = " + Arrays.toString(result));
    return result;
}

Resolution: Running the Counting sort in place, as per @Alex's suggestion, resulted in a far better run time.

[Figure: running time of the modified in-place counting sort vs. Java's primitive sort]

Just a guess, but your sorting algorithm uses much more memory than Java's. 70k ints are 280KB. You need double the space, more than 512KB. Depending on the processor used, that could make the difference between running the sort in (L1?) cache and having lots of cache misses. Since you don't really need the copy, do the sort in place. If you now hit the wall at a larger n, you have the answer.

Edit: it's 280KB.
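The arithmetic behind those figures can be spelled out with a small sketch. The class and method names here (`FootprintSketch`, `intArrayBytes`, `copyingSortBytes`, `inPlaceSortBytes`) are hypothetical, and the estimate deliberately ignores array object headers and anything else the JVM allocates; it only counts the raw `int` payload at 4 bytes each.

```java
class FootprintSketch {
    // Raw payload of an int[] of the given length (object header ignored).
    static long intArrayBytes(int length) {
        return (long) length * Integer.BYTES;
    }

    // Copying version: input array + result array + count array of size k + 1.
    static long copyingSortBytes(int n, int k) {
        return 2 * intArrayBytes(n) + intArrayBytes(k + 1);
    }

    // In-place version: input array + count array only.
    static long inPlaceSortBytes(int n, int k) {
        return intArrayBytes(n) + intArrayBytes(k + 1);
    }

    public static void main(String[] args) {
        int n = 70_000;
        int k = 99;
        System.out.println(intArrayBytes(n));        // 280000
        System.out.println(copyingSortBytes(n, k));  // 560400
        System.out.println(inPlaceSortBytes(n, k));  // 280400
    }
}
```

Whether 560KB versus 280KB actually crosses a cache boundary depends on the cache sizes of the particular processor, which is why this stays a guess until it is measured.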

Edit2: It was late yesterday, so here comes the in-place version. Note that it modifies the input array.

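public static int[] countSortRefactored(int[] arr, int k) {
    // count[v] holds the number of occurrences of v in arr (values 0 to k).
    int[] count = new int[k + 1];
    for (int x = 0; x < arr.length; x++) {
        count[arr[x]]++;
    }

    // Overwrite arr from left to right: count[x] copies of each value x.
    // Java evaluates arguments left to right, so the old idx is the start
    // of the range and the updated idx is its exclusive end.
    int idx = 0;
    for (int x = 0; x <= k; x++) {
        Arrays.fill(arr, idx, idx += count[x], x);
    }

    System.out.println("COUNTSORT SORTED ARRAY = " + Arrays.toString(arr));
    return arr;
}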
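A quick way to convince yourself that the in-place variant produces the same result as `Arrays.sort` is a check along these lines. It is a minimal sketch: the class name `InPlaceCountSortCheck`, the method name `countSortInPlace`, the seed, and the array size are assumptions for illustration, and the printing is left out of the sort so that it would not affect any timing.

```java
import java.util.Arrays;
import java.util.Random;

class InPlaceCountSortCheck {
    // In-place counting sort for values in [0, k]; overwrites the input array.
    static void countSortInPlace(int[] arr, int k) {
        int[] count = new int[k + 1];
        for (int v : arr) {
            count[v]++;
        }
        int idx = 0;
        for (int x = 0; x <= k; x++) {
            int end = idx + count[x];
            Arrays.fill(arr, idx, end, x); // write count[x] copies of x
            idx = end;
        }
    }

    public static void main(String[] args) {
        int k = 99;
        Random rnd = new Random(1L);
        int[] data = new int[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = rnd.nextInt(k + 1);
        }

        int[] expected = data.clone();
        Arrays.sort(expected);

        countSortInPlace(data, k);
        System.out.println(Arrays.equals(data, expected)); // true
    }
}
```

Rebuilding the array from the counts works here because the elements are plain ints with no attached data; if each key carried a payload, the values could not simply be regenerated this way.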
