
Fastest way to sort 32-bit signed integer arrays in JavaScript?

var _radixSort_0 = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
            0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];
/*
RADIX SORT
Use 256 bins
Use shadow array
- Get counts
- Transform counts to pointers
- Sort from LSB - MSB
*/
function radixSort(intArr) {
    var cpy = new Int32Array(intArr.length); // shadow array
    // One 256-entry histogram per byte: c4 = least significant byte, c1 = most significant.
    var c4 = [].concat(_radixSort_0);
    var c3 = [].concat(_radixSort_0);
    var c2 = [].concat(_radixSort_0);
    var c1 = [].concat(_radixSort_0);
    var o4 = 0; var t4;
    var o3 = 0; var t3;
    var o2 = 0; var t2;
    var o1 = 0; var t1;
    var x;
    // Count how often each byte value occurs at each of the four byte positions.
    for(x=0; x<intArr.length; x++) {
        t4 = intArr[x] & 0xFF;
        t3 = (intArr[x] >> 8) & 0xFF;
        t2 = (intArr[x] >> 16) & 0xFF;
        t1 = ((intArr[x] >> 24) & 0xFF) ^ 0x80; // flip the sign bit so negatives sort first
        c4[t4]++;
        c3[t3]++;
        c2[t2]++;
        c1[t1]++;
    }
    // Turn the counts into start offsets (exclusive prefix sums).
    for (x=0; x<256; x++) {
        t4 = o4 + c4[x];
        t3 = o3 + c3[x];
        t2 = o2 + c2[x];
        t1 = o1 + c1[x];
        c4[x] = o4;
        c3[x] = o3;
        c2[x] = o2;
        c1[x] = o1;
        o4 = t4;
        o3 = t3;
        o2 = t2;
        o1 = t1;
    }
    // Pass 1: distribute by the least significant byte, intArr -> cpy.
    for(x=0; x<intArr.length; x++) {
        t4 = intArr[x] & 0xFF;
        cpy[c4[t4]] = intArr[x];
        c4[t4]++;
    }
    // Pass 2: second byte, cpy -> intArr.
    for(x=0; x<intArr.length; x++) {
        t3 = (cpy[x] >> 8) & 0xFF;
        intArr[c3[t3]] = cpy[x];
        c3[t3]++;
    }
    // Pass 3: third byte, intArr -> cpy.
    for(x=0; x<intArr.length; x++) {
        t2 = (intArr[x] >> 16) & 0xFF;
        cpy[c2[t2]] = intArr[x];
        c2[t2]++;
    }
    // Pass 4: most significant byte (sign bit flipped), cpy -> intArr.
    for(x=0; x<intArr.length; x++) {
        t1 = ((cpy[x] >> 24) & 0xFF) ^ 0x80;
        intArr[c1[t1]] = cpy[x];
        c1[t1]++;
    }
    return intArr;
}

EDIT:

So far, the best (and only) major optimization brought to light is JavaScript typed arrays. Using a typed array for the standard radix sort's shadow array has yielded the best results. I was also able to squeeze a little extra out of the in-place quicksort by using JavaScript's built-in array push/pop as an explicit stack.
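As a rough illustration of the typed-array idea, the sketch below runs the same four 8-bit passes but also keeps the histogram in a typed array and reuses it for every pass. `radixSortTyped` is a hypothetical name, and the input is assumed to be an Int32Array (or an array that only holds 32-bit integers); it is not the code from the benchmark.

```javascript
// LSD radix sort sketch: 4 passes of 8 bits, typed shadow array and typed histogram.
function radixSortTyped(input) {
    var n = input.length;
    var src = input;
    var dst = new Int32Array(n);        // typed shadow array
    var counts = new Uint32Array(256);  // typed histogram, reused for every pass
    var shift, flip, i, b, sum, c, tmp;
    for (shift = 0; shift < 32; shift += 8) {
        // On the most significant byte, flip the sign bit so negatives sort first.
        flip = (shift === 24) ? 0x80 : 0;
        for (b = 0; b < 256; b++) counts[b] = 0;
        // Count the occurrences of each byte value.
        for (i = 0; i < n; i++) {
            counts[((src[i] >> shift) & 0xFF) ^ flip]++;
        }
        // Convert the counts into start offsets.
        sum = 0;
        for (b = 0; b < 256; b++) {
            c = counts[b];
            counts[b] = sum;
            sum += c;
        }
        // Stable distribution into the other buffer.
        for (i = 0; i < n; i++) {
            b = ((src[i] >> shift) & 0xFF) ^ flip;
            dst[counts[b]++] = src[i];
        }
        tmp = src; src = dst; dst = tmp;
    }
    // After an even number of passes the sorted data is back in `input`.
    return input;
}
```

Whether the looped form is faster or slower than the hand-unrolled version above would have to be measured in the target browsers.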


Latest jsfiddle benchmark

Intel i7 870, 4GB, FireFox 8.0
2mil
radixSort(intArr): 172 ms
radixSortIP(intArr): 1738 ms
quickSortIP(arr): 661 ms
200k
radixSort(intArr): 18 ms
radixSortIP(intArr): 26 ms
quickSortIP(arr): 58 ms

It appears that the standard radix sort is indeed king for this workflow. If someone has time to experiment with loop unrolling or other modifications to it, I would appreciate it.

I have a specific use case where I'd like the fastest possible sorting implementation in JavaScript. The client script will access large (50,000 to 2 million elements), unsorted (essentially random) arrays of 32-bit signed integers; it then needs to sort and present this data.

I've implemented a fairly fast in-place radix sort and in-place quicksort (jsfiddle benchmark), but for my upper-bound array length they are still fairly slow. The quicksort performs better at my upper-bound array size, while the radix sort performs better at my lower bound.

defaultSort is the built-in JavaScript array.sort with an integer compare function

Intel C2Q 9650, 4GB, FireFox 3.6
2mil
radixSortIP(intArr): 5554 ms
quickSortIP(arr): 1796 ms
200k
radixSortIP(intArr): 139 ms
quickSortIP(arr): 190 ms
defaultSort(intArr): 354 ms

Intel i7 870, 4GB, FireFox 8.0
2mil
radixSortIP(intArr): 990 ms
quickSortIP(arr): 882 ms
defaultSort(intArr): 3632 ms
200k
radixSortIP(intArr): 28 ms
quickSortIP(arr): 68 ms
defaultSort(intArr): 306 ms
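For reference, a baseline like defaultSort can be sketched as follows. The exact compare function used in the benchmark is not shown in the post, so `compareInts` is an assumption.

```javascript
// Numeric comparator. a - b is exact here because both operands are
// 32-bit integers, which JavaScript numbers represent without rounding.
function compareInts(a, b) {
    return a - b;
}

// Built-in sort with an explicit integer compare function.
function defaultSort(arr) {
    return arr.sort(compareInts);
}

// Without a comparator, a plain Array compares elements as strings:
// [10, 9, -1, 2].sort() gives [-1, 10, 2, 9].
```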

Questions

  • Is there a better implementation of any sorting algorithm that would meet my use case/needs?
  • Are there any optimizations that can be made to my in-place radix/quicksort implementations to improve performance?
    • Is there an efficient way to convert my in-place radix sort from a recursive to an iterative function (in terms of both memory and execution speed)?
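The general recursive-to-iterative pattern is to keep the pending sub-ranges on an explicit array stack with push/pop. The sketch below shows it for quicksort rather than radix sort, since the in-place radix sort itself is not shown in the post; `quickSortIterative` is a hypothetical name.

```javascript
// Iterative quicksort sketch: sub-ranges go on an explicit stack instead of the call stack.
function quickSortIterative(arr) {
    var stack = [0, arr.length - 1];
    var lo, hi, i, j, pivot, tmp;
    while (stack.length > 0) {
        hi = stack.pop();
        lo = stack.pop();
        if (lo >= hi) continue;
        // Partition around the value of the middle element.
        pivot = arr[(lo + hi) >> 1];
        i = lo;
        j = hi;
        while (i <= j) {
            while (arr[i] < pivot) i++;
            while (arr[j] > pivot) j--;
            if (i <= j) {
                tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
                i++;
                j--;
            }
        }
        // Push the larger part first so the smaller part is popped and
        // processed first; this keeps the stack depth logarithmic.
        if (j - lo > hi - i) {
            stack.push(lo, j);
            stack.push(i, hi);
        } else {
            stack.push(i, hi);
            stack.push(lo, j);
        }
    }
    return arr;
}
```

The same stack-of-ranges approach applies to an in-place radix sort if each stack entry also carries the digit position to process next.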

Goal

  • I am hoping these answers will help me get a ~20-30% performance improvement in my benchmark test.

Clarifications/Notes

  • "DEFINE FAST": I would prefer a general case where it runs well on all modern browsers, but a browser-specific optimization that makes a significant improvement may be acceptable.
  • The sorting COULD be done server side, but I'd prefer to avoid this because the JS app may become a standalone (paired with some off-the-shelf proprietary app that will stream sensor data to a file).
  • JavaScript may not be the best language for this, but it's a requirement.
  • I've already asked this question (https://stackoverflow.com/questions/7111525/fastest-way-to-sort-integer-arrays-in-javascript), but an incorrect answer was up-voted and the question was closed.
  • I've attempted using multiple browser window instances as makeshift multi-threading; it didn't pan out. I'd be interested in useful info regarding spawning multiple windows for concurrency.

I've tested typed arrays; the QSIP version seems to perform well in modern browsers:

2,000,000 elements

          QSIP_TYPED | RDXIP_TYPED |  QSIP_STD | RDXIP_STD
----------------------------------------------------------
Chrome  |    300          1000          600        1300
Firefox |    550          1500          800        1600    

http://jsfiddle.net/u8t2a/35/

Support (source: http://caniuse.com/typedarrays):

 IE 10+   |   FF 4+  |  Chrome 7+  |  Safari 5.1+  |  Opera 11.6+   

Have you considered a combination of algorithms to maximize cache use? I saw in the benchmark that you are switching to insertion sort when the subarrays become small. An interesting approach is to instead switch to heapsort. Used in conjunction with quicksort, it can bring the worst case down to O(n log n) instead of O(n^2). Have a look at Introsort.
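A minimal introsort sketch, as the algorithm is usually described: quicksort with a recursion depth limit, heapsort once the limit is exhausted, and insertion sort for small ranges. The cutoff of 16 and the depth limit of 2 * floor(log2(n)) are common choices assumed here, and all function names are hypothetical.

```javascript
// Insertion sort on arr[lo..hi], inclusive.
function insertionSortRange(arr, lo, hi) {
    for (var i = lo + 1; i <= hi; i++) {
        var v = arr[i], j = i - 1;
        while (j >= lo && arr[j] > v) { arr[j + 1] = arr[j]; j--; }
        arr[j + 1] = v;
    }
}

// Sift down in the max-heap stored at arr[lo .. lo+size-1]; root is relative to lo.
function siftDown(arr, lo, root, size) {
    while (true) {
        var child = 2 * root + 1;
        if (child >= size) return;
        if (child + 1 < size && arr[lo + child + 1] > arr[lo + child]) child++;
        if (arr[lo + root] >= arr[lo + child]) return;
        var tmp = arr[lo + root]; arr[lo + root] = arr[lo + child]; arr[lo + child] = tmp;
        root = child;
    }
}

// Heapsort on arr[lo..hi], inclusive.
function heapSortRange(arr, lo, hi) {
    var size = hi - lo + 1, i, tmp;
    for (i = (size >> 1) - 1; i >= 0; i--) siftDown(arr, lo, i, size);
    for (i = size - 1; i > 0; i--) {
        tmp = arr[lo]; arr[lo] = arr[lo + i]; arr[lo + i] = tmp;
        siftDown(arr, lo, 0, i);
    }
}

function introSortRange(arr, lo, hi, depth) {
    var pivot, i, j, tmp;
    while (hi - lo > 16) {
        // Too many partitioning levels: fall back to heapsort for this range.
        if (depth === 0) { heapSortRange(arr, lo, hi); return; }
        depth--;
        pivot = arr[(lo + hi) >> 1];
        i = lo;
        j = hi;
        while (i <= j) {
            while (arr[i] < pivot) i++;
            while (arr[j] > pivot) j--;
            if (i <= j) {
                tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
                i++;
                j--;
            }
        }
        introSortRange(arr, lo, j, depth); // recurse into the left part
        lo = i;                            // keep looping on the right part
    }
    insertionSortRange(arr, lo, hi);
}

function introSort(arr) {
    var n = arr.length;
    if (n < 2) return arr;
    introSortRange(arr, 0, n - 1, 2 * Math.floor(Math.log(n) / Math.LN2));
    return arr;
}
```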

I fiddled with your benchmark and added my own sort function. It performs the same as the radix sort, but its idea (and implementation) is simpler: it is like a radix sort, but in binary, so you only have two buckets and can do it in place. Look at http://jsfiddle.net/KaRV9/7/.

I put my code in place of "Quicksort in place" (since it is very similar to quicksort; only the pivot is selected differently). Run them a few times; in my Chrome 15 they perform so closely that I am unable to distinguish them.
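A binary radix sort of this kind can be sketched as below. This is not the code from the linked fiddle, and the function names are hypothetical: each step partitions a range on one bit, starting at the sign bit, and recursion depth is at most 32.

```javascript
// Partition arr[lo..hi] on a single bit, then recurse on the next lower bit.
function binaryRadixSortRange(arr, lo, hi, bit) {
    if (lo >= hi || bit < 0) return;
    var mask = 1 << bit;
    // On the sign bit (31), elements with the bit set (negatives) go first;
    // on every other bit, elements with the bit clear go first.
    var first = (bit === 31) ? mask : 0;
    var i = lo, j = hi, tmp;
    while (i <= j) {
        if ((arr[i] & mask) === first) {
            i++;
        } else {
            tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
            j--;
        }
    }
    // arr[lo..j] now holds the "first" bucket, arr[i..hi] the other one.
    binaryRadixSortRange(arr, lo, j, bit - 1);
    binaryRadixSortRange(arr, i, hi, bit - 1);
}

function binaryRadixSort(arr) {
    binaryRadixSortRange(arr, 0, arr.length - 1, 31);
    return arr;
}
```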

I am not going to comment on your sorting algorithms. You know more about those than I do.

But a good idea would be to use web workers. This allows your sort to run in the background in its own thread, so it does not block the interface. This would be good practice no matter what. It is well supported in Chrome and Firefox. Opera has a non-threaded version. I'm not sure about the support in IE, but it would be easy to work around that. There is of course overhead involved in using multiple threads.

Merge sort can easily be turned into a multi-threaded version, which will give you some performance boost. Messaging comes with a time penalty, of course, so whether it runs faster will really depend on your specific situation. Remember, though, that the non-blocking nature might make the app feel faster to the end user.
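A rough sketch of that idea, assuming a modern browser with Worker, Blob, transferable ArrayBuffers and the typed-array sort method: each half is sorted in its own worker and the two results are merged on the main thread. `mergeSorted`, `workerSource` and `sortInTwoWorkers` are hypothetical names.

```javascript
// Merge two sorted Int32Arrays into a new sorted Int32Array.
function mergeSorted(a, b) {
    var out = new Int32Array(a.length + b.length);
    var i = 0, j = 0, k = 0;
    while (i < a.length && j < b.length) {
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    }
    while (i < a.length) out[k++] = a[i++];
    while (j < b.length) out[k++] = b[j++];
    return out;
}

// Worker body kept as a string so the sketch stays in one file.
// It sorts the received buffer numerically and transfers it back.
var workerSource =
    'onmessage = function (e) {' +
    '  var part = new Int32Array(e.data);' +
    '  part.sort();' +
    '  postMessage(part.buffer, [part.buffer]);' +
    '};';

// Browser only: sorts an Int32Array in two workers, then calls done(sorted).
function sortInTwoWorkers(data, done) {
    var mid = data.length >> 1;
    var halves = [data.slice(0, mid), data.slice(mid)];
    var results = [];
    var pending = halves.length;
    var url = URL.createObjectURL(
        new Blob([workerSource], { type: 'application/javascript' }));
    halves.forEach(function (half, idx) {
        var worker = new Worker(url);
        worker.onmessage = function (e) {
            results[idx] = new Int32Array(e.data);
            worker.terminate();
            if (--pending === 0) {
                URL.revokeObjectURL(url);
                done(mergeSorted(results[0], results[1]));
            }
        };
        // Transfer the buffer instead of copying it to reduce messaging cost.
        worker.postMessage(half.buffer, [half.buffer]);
    });
}
```

The copy made by slice and the final merge are extra work, so whether this beats a single-threaded sort needs to be measured for the array sizes in question.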

EDIT: I see you're already using insertion sort for smaller subarrays. I missed that.

A good real-world approach with quicksort is to check the size of the subarray and, if it's short enough, use a low-overhead sort that would be too slow for larger arrays, such as insertion sort.

Pseudo-code is something along the lines of:

quicksort(array, start, end):
  if ( (end-start) < THRESHOLD ) {
    insertion_sort(array, start, end);
    return;
  }
  // regular quicksort here
  // ...

To determine the THRESHOLD, you need to time it on the platforms you care about (in your case, possibly different browsers). Measure the time for random arrays with different thresholds to find a close-to-optimal one. You could also choose different thresholds for different browsers if you find significant differences.
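One way to run that measurement is sketched below: a runnable version of the pseudo-code with the threshold passed as a parameter, plus a small timing helper. The names `hybridQuickSort` and `timeThreshold`, the middle-element pivot and the use of Date.now() are assumptions for illustration.

```javascript
// Insertion sort on arr[start..end], inclusive.
function insertionSort(arr, start, end) {
    for (var i = start + 1; i <= end; i++) {
        var v = arr[i], j = i - 1;
        while (j >= start && arr[j] > v) { arr[j + 1] = arr[j]; j--; }
        arr[j + 1] = v;
    }
}

// Quicksort that hands ranges shorter than `threshold` to insertion sort.
function hybridQuickSort(arr, start, end, threshold) {
    if (end - start < threshold) {
        insertionSort(arr, start, end);
        return;
    }
    var pivot = arr[(start + end) >> 1];
    var i = start, j = end, tmp;
    while (i <= j) {
        while (arr[i] < pivot) i++;
        while (arr[j] > pivot) j--;
        if (i <= j) {
            tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
            i++;
            j--;
        }
    }
    hybridQuickSort(arr, start, j, threshold);
    hybridQuickSort(arr, i, end, threshold);
}

// Average time in milliseconds to sort `runs` random arrays of length n.
function timeThreshold(threshold, n, runs) {
    var total = 0;
    for (var r = 0; r < runs; r++) {
        var data = new Int32Array(n);
        for (var i = 0; i < n; i++) data[i] = (Math.random() * 4294967296) | 0;
        var t0 = Date.now();
        hybridQuickSort(data, 0, n - 1, threshold);
        total += Date.now() - t0;
    }
    return total / runs;
}

// Example sweep (run it in each browser of interest):
// [4, 8, 16, 32, 64].forEach(function (t) {
//     console.log(t, timeThreshold(t, 200000, 5));
// });
```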

Now, if your inputs aren't really random (which is pretty common), you can check whether better pivot selection improves performance. A common method is the median of three.
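Median-of-three pivot selection can be sketched as follows; `medianOfThree` is a hypothetical helper that returns the median of the first, middle and last element of the range, to be used as the pivot value in the partition step.

```javascript
// Returns the median of arr[lo], arr[mid] and arr[hi] as the pivot value.
function medianOfThree(arr, lo, hi) {
    var mid = (lo + hi) >> 1;
    var a = arr[lo], b = arr[mid], c = arr[hi];
    if (a < b) {
        if (b < c) return b;        // a < b < c
        return (a < c) ? c : a;     // b is the largest: median is max(a, c)
    }
    if (a < c) return a;            // b <= a < c
    return (b < c) ? c : b;         // a is the largest: median is max(b, c)
}
```

On already sorted or reverse-sorted input this picks the middle element, which avoids the degenerate splits that a first-element pivot would produce.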
