Scala quickSort using Ordering[T] is 10 times slower

Question

I was sorting integer indices based on a custom Ordering. I found that going through the Ordering[T] used by scala.util.Sorting.quickSort makes the sort at least 10 times slower than a handcrafted quicksort that calls the compare method directly. That seems outrageously costly!

val indices: Array[Int] = ...

class OrderingByScore extends Ordering[Int] { ... }

time { (0 to 10000).par.foreach(x => {
  scala.util.Sorting.quickSort[Int](indices.take(nb))(new OrderingByScore)
})}
// Elapsed: 30 seconds

Compare that with the handcrafted sortArray found here, modified to take an ord: Ordering[Int] parameter:

def sortArray1(array: Array[Int], left: Int, right: Int, ord: Ordering[Int]) = ...

time { (0 to 10000).par.foreach(x => {
  sortArray1(indices.take(nb), 0, nb - 1, new OrderingByScore)
})}
// Elapsed: 19 seconds

And finally, the same piece of code, but using the exact type instead (ord: OrderingByScore):

def sortArray2(array: Array[Int], left: Int, right: Int, ord: OrderingByScore) = ...

time { (0 to 10000).par.foreach(x => {
  sortArray2(indices.take(nb), 0, nb - 1, new OrderingByScore)
})}
// Elapsed: 1.85 seconds

I'm quite surprised to see such a difference between the versions!

In my example, the indices array is sorted based on the values found in another array of Doubles containing combined scores. Also, the sort is stable, as it uses the indices themselves as a secondary comparison. On a side note, to make the test reliable, I had to call indices.take(nb) inside the parallel loop, since sorting modifies the input array. This penalty is negligible compared to the problem that brings me here. Full code on gist here.

Suggestions for improvement are very welcome, but please try not to change the basic structure of separate indices and scores arrays.

Note: I'm running within the Scala 2.10 REPL.

Answer 1

The problem is that scala.math.Ordering is not specialized. So every time you call the compare method with a primitive like Int, both arguments of type Int are boxed to java.lang.Integer. That produces a lot of short-lived objects, which slows things down considerably.
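To make the boxing point concrete, here is a minimal sketch. ByScore is a hypothetical stand-in for the question's OrderingByScore (whose body is not shown), and the helper names viaTrait and viaClass are made up for illustration. The only difference between the two helpers is the static type of ord, which mirrors the difference between sortArray1 and sortArray2.

```scala
// Hypothetical stand-in for OrderingByScore: order indices by score,
// then by the index itself so that ties are broken deterministically.
final class ByScore(scores: Array[Double]) extends Ordering[Int] {
  def compare(a: Int, b: Int): Int = {
    val c = java.lang.Double.compare(scores(a), scores(b))
    if (c != 0) c
    else if (a < b) -1
    else if (a > b) 1
    else 0
  }
}

object BoxingSketch {
  // Static type Ordering[Int]: after erasure the call targets
  // compare(Object, Object), so both Ints are boxed before the call
  // and unboxed again inside the compiler-generated bridge method.
  def viaTrait(ord: Ordering[Int], a: Int, b: Int): Int = ord.compare(a, b)

  // Static type ByScore: the compiler can call compare(Int, Int)
  // directly, with no boxing at the call site.
  def viaClass(ord: ByScore, a: Int, b: Int): Int = ord.compare(a, b)
}
```

Both helpers return the same result; only the cost differs. How much of the boxing the JIT manages to remove depends on the JVM, so measure rather than assume.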

The spire library has a specialized version of Ordering called spire.algebra.Order that should be much faster. You could try substituting it in your code and running your benchmark again.
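If you want to see the effect without pulling in a dependency, the same idea can be sketched by hand. The code below does not use spire: IntOrder, ScoreOrder and IntSort are hypothetical names, and the quicksort is a plain textbook version, not the sortArray from the question. Because IntOrder has no type parameter, compare always takes primitive Ints and nothing is boxed.

```scala
// Hypothetical primitive-only ordering: no type parameter, so no erasure
// and no boxing when compare is called.
trait IntOrder {
  def compare(a: Int, b: Int): Int
}

// Orders indices by their score, then by the index itself to break ties.
final class ScoreOrder(scores: Array[Double]) extends IntOrder {
  def compare(a: Int, b: Int): Int = {
    val c = java.lang.Double.compare(scores(a), scores(b))
    if (c != 0) c else if (a < b) -1 else if (a > b) 1 else 0
  }
}

object IntSort {
  // In-place quicksort over array(left) .. array(right), both inclusive.
  def quickSort(array: Array[Int], left: Int, right: Int, ord: IntOrder): Unit = {
    if (left < right) {
      val pivot = array(left + (right - left) / 2)
      var i = left
      var j = right
      while (i <= j) {
        while (ord.compare(array(i), pivot) < 0) i += 1
        while (ord.compare(array(j), pivot) > 0) j -= 1
        if (i <= j) {
          val tmp = array(i)
          array(i) = array(j)
          array(j) = tmp
          i += 1
          j -= 1
        }
      }
      quickSort(array, left, j, ord)
      quickSort(array, i, right, ord)
    }
  }
}
```

With scores Array(0.5, 0.2, 0.9, 0.2) and indices Array(0, 1, 2, 3), calling IntSort.quickSort(indices, 0, indices.length - 1, new ScoreOrder(scores)) leaves indices as 1, 3, 0, 2. This keeps the separate indices and scores arrays from the question intact.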

There are also sorting algorithms in spire, so maybe just try those.

Basically, whenever you want to do math with primitives in a high-performance way, spire is the way to go.

Also, please use a proper microbenchmarking tool like Thyme or JMH for benchmarks if you want to trust the results.

Solution 1
6 (accepted) 2016-02-10 08:09:02