What are some good use cases of lazy evaluation in Scala?

When working with large collections, we usually hear the term "lazy evaluation". I want to better demonstrate the difference between strict and lazy evaluation, so I tried the following example - getting the first two even numbers from a list:

scala> var l = List(1, 47, 38, 53, 51, 67, 39, 46, 93, 54, 45, 33, 87)
l: List[Int] = List(1, 47, 38, 53, 51, 67, 39, 46, 93, 54, 45, 33, 87)

scala> l.filter(_ % 2 == 0).take(2)
res0: List[Int] = List(38, 46)

scala> l.toStream.filter(_ % 2 == 0).take(2)
res1: scala.collection.immutable.Stream[Int] = Stream(38, ?)

I noticed that when I use toStream, I get Stream(38, ?). What does the "?" mean here? Does it have something to do with lazy evaluation?

Also, what are some good examples of lazy evaluation? When should I use it, and why?

One benefit of using lazy collections is saving memory, e.g. when mapping over large data structures. Consider this:

val r = (1 to 10000)
   .map(_ => Seq.fill(10000)(scala.util.Random.nextDouble))
   .map(_.sum)
   .sum

And using lazy evaluation:

val r = (1 to 10000).toStream
   .map(_ => Seq.fill(10000)(scala.util.Random.nextDouble))
   .map(_.sum)
   .sum

The first statement generates 10000 Seqs of size 10000 and keeps all of them in memory, while in the second case only one Seq needs to exist in memory at a time. Peak memory use is therefore much lower, and the computation is often faster as a result.
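To see why the lazy version needs only one Seq at a time, it can help to log the order in which the two map steps run. The following is a minimal sketch, not a benchmark: the object name EvaluationOrderDemo, the trace helper, and the tiny sizes are assumptions made for illustration. Note that a Stream memoizes its elements, so the memory benefit only holds as long as nothing keeps a reference to the head of the stream.

```scala
import scala.collection.mutable.ListBuffer

object EvaluationOrderDemo {
  // Hypothetical helper: runs the same two-step pipeline and records
  // the order in which each step is executed.
  def trace(source: Seq[Int]): List[String] = {
    val log = ListBuffer.empty[String]
    source
      .map { n => log += s"build $n"; Seq.fill(3)(n.toDouble) }
      .map { xs => log += s"sum ${xs.head.toInt}"; xs.sum }
      .sum // the total is discarded; only the log matters here
    log.toList
  }

  // Strict: every Seq is built before the first one is summed.
  // List(build 1, build 2, build 3, sum 1, sum 2, sum 3)
  val strictOrder: List[String] = trace(1 to 3)

  // Lazy: each Seq is summed right after it is built.
  // List(build 1, sum 1, build 2, sum 2, build 3, sum 3)
  val lazyOrder: List[String] = trace((1 to 3).toStream)
}
```

In the lazy run, each intermediate Seq can be garbage-collected as soon as its sum has been computed.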

Another use case is when only part of the data is actually needed. I often use lazy collections together with take, takeWhile, etc.
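As a small illustration of that second use case, here is a sketch with an infinite stream, where a strict collection could not be used at all. The object and value names are made up for this example.

```scala
object TakeWhileDemo {
  // An infinite stream of squares; elements are computed only on demand.
  val squares: Stream[Int] = Stream.from(1).map(n => n * n)

  // takeWhile stops at the first square that is >= 100,
  // and toList forces only that prefix.
  // List(1, 4, 9, 16, 25, 36, 49, 64, 81)
  val small: List[Int] = squares.takeWhile(_ < 100).toList

  // filter + take evaluates just enough elements to find three even squares.
  // List(4, 16, 36)
  val firstThreeEven: List[Int] = squares.filter(_ % 2 == 0).take(3).toList
}
```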

Let's take a real-life scenario: instead of a list, you have a big log file, and you want to extract the first 10 lines that contain "Success".

The straightforward solution would be to read the file line by line and, whenever a line contains "Success", print it and continue to the next line until 10 matches have been printed.
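For comparison, the loop-based approach could look roughly like the sketch below. The function name printFirstMatches and its parameters are hypothetical, and it takes an Iterator[String] rather than a file so that it is easy to try out.

```scala
object LoopDemo {
  // Prints at most `limit` lines containing "Success" and stops reading
  // as soon as that many have been found. Returns the number printed.
  def printFirstMatches(lines: Iterator[String], limit: Int): Int = {
    var found = 0
    while (found < limit && lines.hasNext) {
      val line = lines.next()
      if (line.contains("Success")) {
        println(line)
        found += 1
      }
    }
    found
  }
}
```

With a real file this could be called as LoopDemo.printFirstMatches(Source.fromFile("log_file").getLines, 10).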

But since we love functional programming, we don't want to use the traditional loops. Instead, we want to achieve our goal by composing functions.

First attempt:

Source.fromFile("log_file").getLines.toList.filter(_.contains("Success")).take(10)

Let's try to understand what actually happened here:

  1. we read the whole file

  2. filter relevant lines

  3. took the first 10 elements

If we try to print Source.fromFile("log_file").getLines.toList, we will get the whole file, which is obviously a waste, since not all lines are relevant to us.

Why did we get all the lines first and only then perform the filtering? Because List is a strict data structure: when we call toList, it is evaluated immediately, and the filter is applied only after all the data has been loaded.
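The strictness of toList can be made visible by counting how many lines are actually pulled from the underlying iterator. This is only a sketch: the in-memory logLines list stands in for the file, and all names here are invented for the example.

```scala
object StrictReadDemo {
  // Made-up sample data standing in for the lines of a log file.
  val logLines = List("Start", "Success: a", "Failure", "Success: b", "End")

  // Counts every line that is pulled from the iterator.
  var linesRead = 0
  val counted: Iterator[String] =
    logLines.iterator.map { line => linesRead += 1; line }

  // toList drains the whole iterator before filter and take ever run,
  // so linesRead is 5 afterwards even though only one line was wanted.
  val firstMatch: List[String] =
    counted.toList.filter(_.contains("Success")).take(1)
}
```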

Luckily, Scala provides lazy data structures, and Stream is one of them:

Source.fromFile("log_file").getLines.toStream.filter(_.contains("Success")).take(10)

In order to demonstrate the difference, let's try:

Source.fromFile("log_file").getLines.toStream

Now we get something like:

scala.collection.immutable.Stream[String] = Stream(That's the first line, ?)

toStream evaluates only one element: the first line of the file. The next element is represented by a "?", which indicates that the stream has not evaluated it yet. That's because a Stream is lazy: the next item is evaluated only when it is used.
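A quick way to check what the "?" stands for is to count how many elements have really been computed. The sketch below uses invented names and a counter purely for demonstration. It also shows that a Stream remembers (memoizes) what it has already evaluated. Note that Stream is deprecated since Scala 2.13 in favour of LazyList, which shows unevaluated parts as `<not computed>` instead of "?".

```scala
object QuestionMarkDemo {
  // Incremented every time an element is actually computed.
  var evaluated = 0

  val doubled: Stream[Int] =
    Stream.from(1).map { n => evaluated += 1; n * 2 }

  // Only the head has been computed so far; the rest is the "?" part.
  val afterCreation: Int = evaluated // 1

  // Indexing forces the elements at positions 0, 1 and 2.
  val third: Int = doubled(2) // 6
  val afterIndexing: Int = evaluated // 3

  // A second access is served from the memoized cells; nothing is recomputed.
  val thirdAgain: Int = doubled(2)
  val afterSecondAccess: Int = evaluated // still 3
}
```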

Now, when we apply the filter function, it keeps reading lines until it reaches the first line that contains "Success":

scala> val res = Source.fromFile("log_file").getLines.toStream.filter(_.contains("Success"))
res: scala.collection.immutable.Stream[String] = Stream(First line contains Success!, ?)
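The same line-counting trick shows how far the lazy pipeline reads. Again this is a sketch with made-up data and names: the in-memory list stands in for the log file.

```scala
object LazyReadDemo {
  // Made-up sample data standing in for the lines of a log file.
  val logLines = List(
    "Start", "Success: a", "Failure", "Success: b",
    "Failure", "Success: c", "End"
  )

  // Counts every line that is pulled from the iterator.
  var linesRead = 0
  val counted: Iterator[String] =
    logLines.iterator.map { line => linesRead += 1; line }

  // filter on a Stream reads only up to the first matching line.
  val matches: Stream[String] = counted.toStream.filter(_.contains("Success"))
  val readAfterFilter: Int = linesRead // 2: "Start" and "Success: a"

  // Forcing two results reads further, but stops before the end of the input.
  val firstTwo: List[String] = matches.take(2).toList
  val readAfterTake: Int = linesRead
}
```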

Now we apply the take function. Still nothing further is evaluated: the resulting stream knows it should yield at most 10 lines, but it doesn't read them until we use the result:

res.take(10) foreach println

Finally, printing the result this way gives us the first 10 lines that contain "Success", as we expected.
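One practical caveat: the snippets above never close the file, and a lazy result must be fully consumed before the Source is closed. A possible way to handle this is sketched below. The name firstSuccesses is hypothetical, and the sketch relies on the fact that the Iterator returned by getLines is itself lazy, so filter and take work without converting to a Stream.

```scala
import scala.io.Source

object SafeReadDemo {
  // Returns the first `n` lines of the file at `path` that contain "Success".
  def firstSuccesses(path: String, n: Int): List[String] = {
    val source = Source.fromFile(path)
    try {
      // filter and take on an Iterator are lazy; toList forces the result
      // while the file is still open, pulling lines only until n matches
      // have been collected.
      source.getLines().filter(_.contains("Success")).take(n).toList
    } finally {
      source.close()
    }
  }
}
```

Usage would then be SafeReadDemo.firstSuccesses("log_file", 10) foreach println.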


 