Why is filter in front of foldLeft slow in Scala?
I wrote an answer to the first Project Euler question:
Add all the natural numbers below one thousand that are multiples of 3 or 5.
The first thing that came to me was:
(1 until 1000).filter(i => (i % 3 == 0 || i % 5 == 0)).foldLeft(0)(_ + _)
but it's slow (it takes 125 ms), so I rewrote it, thinking of it simply as 'another way' rather than 'the faster way':
(1 until 1000).foldLeft(0) {
  (total, x) =>
    x match {
      case i if (i % 3 == 0 || i % 5 == 0) => i + total // Add
      case _ => total // skip
    }
}
This is much faster (only 2 ms). Why? I'm guessing the second version uses only the Range generator and never materializes a fully realized collection, doing it all in one pass, which is both faster and uses less memory. Am I right?
Here is the code on IdeOne: http://ideone.com/GbKlP
The problem, as others have said, is that filter creates a new collection. The alternative withFilter doesn't, but that doesn't have a foldLeft. Also, using .view, .iterator or .toStream would all avoid creating the new collection in various ways, but they are all slower here than the first method you used, which I thought somewhat strange at first.
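As a rough sketch of the alternatives just listed, the object below computes the same sum with the strict filter, a view, an iterator, and withFilter. The object and method names are made up for illustration, and .toStream is left out because it is deprecated in Scala 2.13 in favour of LazyList. It only shows that the variants agree on the result; it says nothing about their relative speed.

```scala
object FilterVariants {
  // Predicate from the question: multiples of 3 or 5.
  val p: Int => Boolean = i => i % 3 == 0 || i % 5 == 0

  // Strict filter: builds an intermediate collection, then folds over it.
  def strict: Int = (1 until 1000).filter(p).foldLeft(0)(_ + _)

  // View: the filter is applied lazily while folding.
  def viaView: Int = (1 until 1000).view.filter(p).foldLeft(0)(_ + _)

  // Iterator: elements are filtered one at a time as they are consumed.
  def viaIterator: Int = (1 until 1000).iterator.filter(p).foldLeft(0)(_ + _)

  // withFilter has no foldLeft, so accumulate through foreach instead.
  def viaWithFilter: Int = {
    var total = 0
    (1 until 1000).withFilter(p).foreach { i => total += i }
    total
  }

  def main(args: Array[String]): Unit = {
    println(strict)        // 233168
    println(viaView)
    println(viaIterator)
    println(viaWithFilter)
  }
}
```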
But, then... See, 1 until 1000 is a Range, whose size is actually very small, because it doesn't store each element. Also, Range's foreach is extremely optimized, and is even specialized, which is not the case for any of the other collections. Since foldLeft is implemented as a foreach, as long as you stay with a Range you get to enjoy its optimized methods.
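To make "foldLeft is implemented as a foreach" concrete, here is a minimal sketch of a fold written on top of Range.foreach. The helper foldLeftViaForeach is hypothetical and is not the library source; it only mirrors the shape of the idea: a mutable accumulator updated inside foreach.

```scala
object FoldViaForeach {
  // Hypothetical helper (not the library implementation): a left fold
  // expressed as a mutable accumulator updated from Range.foreach.
  def foldLeftViaForeach[B](r: Range)(z: B)(op: (B, Int) => B): B = {
    var acc = z
    r.foreach { x => acc = op(acc, x) }
    acc
  }

  // Same computation as the second version in the question.
  def total: Int =
    foldLeftViaForeach(1 until 1000)(0) { (acc, x) =>
      if (x % 3 == 0 || x % 5 == 0) acc + x else acc
    }

  def main(args: Array[String]): Unit =
    println(total) // 233168
}
```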
(_: Range).foreach:
@inline final override def foreach[@specialized(Unit) U](f: Int => U) {
  if (length > 0) {
    val last = this.last
    var i = start
    while (i != last) {
      f(i)
      i += step
    }
    f(i)
  }
}
(_: Range).view.foreach
def foreach[U](f: A => U): Unit =
  iterator.foreach(f)
(_: Range).view.iterator
override def iterator: Iterator[A] = new Elements(0, length)
protected class Elements(start: Int, end: Int) extends BufferedIterator[A] with Serializable {
  private var i = start
  def hasNext: Boolean = i < end
  def next: A =
    if (i < end) {
      val x = self(i)
      i += 1
      x
    } else Iterator.empty.next
  def head =
    if (i < end) self(i) else Iterator.empty.next
  /** $super
   * '''Note:''' `drop` is overridden to enable fast searching in the middle of indexed sequences.
   */
  override def drop(n: Int): Iterator[A] =
    if (n > 0) new Elements(i + n, end) else this
  /** $super
   * '''Note:''' `take` is overridden to be symmetric to `drop`.
   */
  override def take(n: Int): Iterator[A] =
    if (n <= 0) Iterator.empty.buffered
    else if (i + n < end) new Elements(i, i + n)
    else this
}
(_: Range).view.iterator.foreach
def foreach[U](f: A => U) { while (hasNext) f(next()) }
And that, of course, doesn't even count the filter between view and foldLeft:
override def filter(p: A => Boolean): This = newFiltered(p).asInstanceOf[This]
protected def newFiltered(p: A => Boolean): Transformed[A] = new Filtered { val pred = p }
trait Filtered extends Transformed[A] {
  protected[this] val pred: A => Boolean
  override def foreach[U](f: A => U) {
    for (x <- self)
      if (pred(x)) f(x)
  }
  override def stringPrefix = self.stringPrefix+"F"
}
Try making the collection lazy first, so
(1 until 1000).view.filter...
instead of
(1 until 1000).filter...
That should avoid the cost of building an intermediate collection. You might also get better performance from using sum instead of foldLeft(0)(_ + _); it's always possible that some collection type has a more efficient way to sum numbers. If not, it's still cleaner and more declarative...
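A small sketch of that suggestion, combining the lazy view with sum. The object and method names are illustrative only, and whether sum is actually faster than the explicit fold for a given collection type is something to measure rather than assume.

```scala
object SumVariant {
  // Lazy view plus sum: no intermediate collection, no explicit fold.
  def viaSum: Int =
    (1 until 1000).view.filter(i => i % 3 == 0 || i % 5 == 0).sum

  // The explicit fold from the question, for comparison.
  def viaFold: Int =
    (1 until 1000).view.filter(i => i % 3 == 0 || i % 5 == 0).foldLeft(0)(_ + _)

  def main(args: Array[String]): Unit = {
    println(viaSum)  // 233168
    println(viaFold) // 233168
  }
}
```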
Looking through the code, it looks like filter does build a new Seq on which the foldLeft is called. The second version skips that bit. It's not so much the memory, although saving it certainly helps, but that the filtered collection is never built at all. All that work is never done.
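To see the intermediate collection that the first version pays for, the sketch below (names are illustrative) keeps the filtered sequence in a value and inspects it. Below 1000 there are 333 multiples of 3, 199 multiples of 5 and 66 multiples of 15, so the filtered sequence holds 333 + 199 - 66 = 466 elements that the single-pass fold never stores.

```scala
object IntermediateCollection {
  // The strict filter materializes every matching element before folding.
  val filtered: IndexedSeq[Int] =
    (1 until 1000).filter(i => i % 3 == 0 || i % 5 == 0)

  // The single-pass fold keeps only one running Int.
  def singlePass: Int =
    (1 until 1000).foldLeft(0) { (total, i) =>
      if (i % 3 == 0 || i % 5 == 0) total + i else total
    }

  def main(args: Array[String]): Unit = {
    println(filtered.size) // 466 elements built just to be summed
    println(filtered.sum)  // 233168
    println(singlePass)    // 233168
  }
}
```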
Range uses TraversableLike.filter, which looks like this:
def filter(p: A => Boolean): Repr = {
  val b = newBuilder
  for (x <- this)
    if (p(x)) b += x
  b.result
}
I think it's the += on line 4 that's the difference. Filtering in foldLeft eliminates it.
filter creates a whole new sequence on which then foldLeft is called. Try:
(1 until 1000).view.filter(i => (i % 3 == 0 || i % 5 == 0)).reduceLeft(_+_)
This will prevent said effect, merely wrapping the original thing. Exchanging foldLeft with reduceLeft is only cosmetic (in this case).
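The "in this case" matters because the two differ once nothing passes the filter: foldLeft returns its start value, while reduceLeft has no start value and throws. A minimal sketch, using the empty range 1 until 1 as an assumed edge case (the object and method names are made up):

```scala
object ReduceVsFold {
  val p: Int => Boolean = i => i % 3 == 0 || i % 5 == 0

  // foldLeft on an empty input simply returns the initial value.
  def foldOnEmpty: Int =
    (1 until 1).view.filter(p).foldLeft(0)(_ + _)

  // reduceLeftOption signals the empty case with None.
  def reduceOptionOnEmpty: Option[Int] =
    (1 until 1).view.filter(p).reduceLeftOption(_ + _)

  // reduceLeft on an empty input throws UnsupportedOperationException.
  def reduceThrowsOnEmpty: Boolean =
    try {
      (1 until 1).view.filter(p).reduceLeft(_ + _)
      false
    } catch {
      case _: UnsupportedOperationException => true
    }

  def main(args: Array[String]): Unit = {
    println(foldOnEmpty)         // 0
    println(reduceOptionOnEmpty) // None
    println(reduceThrowsOnEmpty) // true
  }
}
```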
Now the challenge is, can you think of a yet more efficient way? Not that your solution is too slow in this case, but how well does it scale? What if instead of 1000, it was 1000000000? There is a solution that could compute the latter case just as quickly as the former.
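One possible answer to that challenge, offered as a sketch and not necessarily the solution the answerer had in mind, uses the arithmetic series formula: the multiples of k below a limit sum to k * m * (m + 1) / 2 with m = (limit - 1) / k, and multiples of 15 are subtracted once because they are counted under both 3 and 5. The function names are made up, and Long is used so the 1000000000 case does not overflow; much larger limits would need BigInt.

```scala
object ClosedFormEuler1 {
  // Sum of all positive multiples of k strictly below limit:
  // k * (1 + 2 + ... + m), where m is the number of such multiples.
  def sumOfMultiplesBelow(k: Long, limit: Long): Long = {
    val m = (limit - 1) / k
    k * m * (m + 1) / 2
  }

  // Inclusion-exclusion: multiples of 15 appear in both the 3s and the 5s.
  def euler1(limit: Long): Long =
    sumOfMultiplesBelow(3, limit) +
      sumOfMultiplesBelow(5, limit) -
      sumOfMultiplesBelow(15, limit)

  def main(args: Array[String]): Unit = {
    println(euler1(1000L))       // 233168
    println(euler1(1000000000L)) // constant time, no loop over the range
  }
}
```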