What are some good use cases of lazy evaluation in Scala?
When working with large collections, we usually hear the term "lazy evaluation". I want to better demonstrate the difference between strict and lazy evaluation, so I tried the following example - getting the first two even numbers from a list:
scala> var l = List(1, 47, 38, 53, 51, 67, 39, 46, 93, 54, 45, 33, 87)
l: List[Int] = List(1, 47, 38, 53, 51, 67, 39, 46, 93, 54, 45, 33, 87)
scala> l.filter(_ % 2 == 0).take(2)
res0: List[Int] = List(38, 46)
scala> l.toStream.filter(_ % 2 == 0).take(2)
res1: scala.collection.immutable.Stream[Int] = Stream(38, ?)
I noticed that when I'm using toStream, I'm getting Stream(38, ?). What does the "?" mean here? Does this have something to do with lazy evaluation?
Also, what are some good examples of lazy evaluation? When should I use it, and why?
2 Answers
One benefit of using lazy collections is to "save" memory, e.g. when mapping to large data structures. Consider this:
val r = (1 to 10000)
  .map(_ => Seq.fill(10000)(scala.util.Random.nextDouble))
  .map(_.sum)
  .sum
And using lazy evaluation:
val r = (1 to 10000).toStream
  .map(_ => Seq.fill(10000)(scala.util.Random.nextDouble))
  .map(_.sum)
  .sum
The first statement will generate 10000 Seqs of size 10000 and keep them all in memory, while in the second case only one Seq at a time needs to exist in memory, so it can be much faster.
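To make the difference in evaluation order visible, here is a minimal sketch of a scaled-down version of the pipeline above (3 rows instead of 10000). The names EvalOrderDemo, build, and total are made up for illustration, and fixed values replace the random numbers so the result is predictable. Note that on Scala 2.13+ Stream is deprecated in favour of LazyList.

```scala
import scala.collection.mutable.ListBuffer

object EvalOrderDemo {
  // Records the order in which the pipeline stages run.
  val events = ListBuffer.empty[String]

  // Hypothetical stand-in for Seq.fill(10000)(Random.nextDouble): a small, fixed row.
  def build(i: Int): Seq[Double] = { events += s"build $i"; Seq.fill(4)(i.toDouble) }

  // Sums one row and logs which row it was.
  def total(row: Seq[Double]): Double = { events += s"sum ${row.head.toInt}"; row.sum }

  // Strict: every row is built before the first one is summed.
  val strictResult: Double = (1 to 3).map(build).map(total).sum
  val strictOrder: List[String] = events.toList

  events.clear()

  // Lazy: each row is built and summed before the next one is built.
  val lazyResult: Double = (1 to 3).toStream.map(build).map(total).sum
  val lazyOrder: List[String] = events.toList

  def main(args: Array[String]): Unit = {
    println(strictOrder.mkString(", ")) // build 1, build 2, build 3, sum 1, sum 2, sum 3
    println(lazyOrder.mkString(", "))   // build 1, sum 1, build 2, sum 2, build 3, sum 3
  }
}
```

Both variants compute the same total; only the order in which rows are created and consumed differs.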
Another use case is when only a part of the data is actually needed. I often use lazy collections together with take, takeWhile, etc.
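As a rough illustration of the "only part of the data" case, the sketch below reuses the list from the question and counts how many elements the predicate inspects. TakeDemo, isEven, and checks are hypothetical names introduced only for this example.

```scala
object TakeDemo {
  val l = List(1, 47, 38, 53, 51, 67, 39, 46, 93, 54, 45, 33, 87)

  // Counts how many elements the predicate has inspected.
  var checks = 0
  def isEven(n: Int): Boolean = { checks += 1; n % 2 == 0 }

  // Strict: filter inspects all 13 elements, then take keeps 2 of them.
  val strictResult: List[Int] = l.filter(isEven).take(2)
  val strictChecks: Int = checks

  checks = 0

  // Lazy: toList forces the stream, and filtering stops once 2 matches are found.
  val lazyResult: List[Int] = l.toStream.filter(isEven).take(2).toList
  val lazyChecks: Int = checks

  // takeWhile ends the traversal at the first element that fails the test.
  val oddPrefix: List[Int] = l.toStream.takeWhile(_ % 2 != 0).toList

  def main(args: Array[String]): Unit = {
    println(s"strict: $strictResult after $strictChecks checks")
    println(s"lazy:   $lazyResult after $lazyChecks checks")
    println(s"odd prefix: $oddPrefix")
  }
}
```

Both pipelines return the same two numbers; the lazy one simply inspects fewer elements to get there.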
Let's take a real-life scenario: instead of a list, you have a big log file, and you want to extract the first 10 lines that contain "Success".
The straightforward solution would be to read the file line by line and, whenever a line contains "Success", print it and continue to the next line.
But since we love functional programming, we don't want to use the traditional loops. Instead, we want to achieve our goal by composing functions.
First attempt:
Source.fromFile("log_file").getLines.toList.filter(_.contains("Success")).take(10)
Let's try to understand what actually happened here:
we read the whole file
we filtered the relevant lines
we took the first 10 elements
If we try to print Source.fromFile("log_file").getLines.toList, we will get the whole file, which is obviously a waste, since not all lines are relevant to us.
Why did we get all the lines and only then perform the filtering? That's because List is a strict data structure, so when we call toList, it is evaluated immediately, and the filtering is applied only after all the data has been loaded.
Luckily, Scala provides lazy data structures, and Stream is one of them:
Source.fromFile("log_file").getLines.toStream.filter(_.contains("Success")).take(10)
In order to demonstrate the difference, let's try:
Source.fromFile("log_file").getLines.toStream
Now we get something like:
scala.collection.immutable.Stream[String] = Stream(That's the first line, ?)
toStream evaluates only one element: the first line in the file. The next element is represented by a "?", which indicates that the stream has not evaluated it yet. That's because toStream is a lazy function, and the next item is evaluated only when it is used.
Now, when we apply the filter function, it keeps reading lines until it reaches the first one that contains "Success":
scala> val res = Source.fromFile("log_file").getLines.toStream.filter(_.contains("Success"))
res: scala.collection.immutable.Stream[String] = Stream(First line contains Success!, ?)
Now we apply the take function. Still nothing more is evaluated: the stream knows that it should pick at most 10 lines, but it doesn't evaluate them until we use the result:
res.take(10).foreach(println)
Finally, when we print the result, we get the first 10 lines that contain "Success", as expected.
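The walkthrough above can be checked with a small sketch that counts how many lines are actually read. The assumptions here: an in-memory Vector stands in for the log file, and LogDemo, logLines, linesRead, and getLines() are made-up names for this example, not part of scala.io.Source.

```scala
object LogDemo {
  // Hypothetical log: 1000 lines, every 5th one contains "Success".
  val logLines: Vector[String] =
    Vector.tabulate(1000)(i => if (i % 5 == 0) s"line $i: Success" else s"line $i: Failure")

  // Stand-in for Source.fromFile("log_file").getLines that counts the lines handed out.
  var linesRead = 0
  def getLines(): Iterator[String] = logLines.iterator.map { line => linesRead += 1; line }

  // Strict: toList pulls every line before filter runs.
  val strictResult: List[String] =
    getLines().toList.filter(_.contains("Success")).take(10)
  val strictLinesRead: Int = linesRead

  linesRead = 0

  // Lazy: lines are read only as far as needed to find 10 matches.
  val lazyResult: List[String] =
    getLines().toStream.filter(_.contains("Success")).take(10).toList
  val lazyLinesRead: Int = linesRead

  def main(args: Array[String]): Unit = {
    println(s"strict read $strictLinesRead lines, lazy read $lazyLinesRead lines")
  }
}
```

The two results are identical, but the strict version reads all 1000 lines while the lazy one stops shortly after the tenth match.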
One important thing with this specific example, however, is resource management; your BufferedSource (returned by Source.fromFile) will need to be closed and you must be very careful that all relevant input is strictly evaluated before doing so – oxbow_lakes
Nov 24 '17 at 12:37
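The resource-management point raised in the comment could be handled along these lines. This is only a sketch: CloseDemo and firstSuccesses are hypothetical names, and a temporary file stands in for "log_file". The key idea is to force the lazy pipeline with toList while the source is still open, and to close it in a finally block.

```scala
import java.nio.file.Files
import scala.io.Source

object CloseDemo {
  // Hypothetical helper: the first `n` lines of `path` that contain "Success".
  def firstSuccesses(path: String, n: Int): List[String] = {
    val source = Source.fromFile(path)
    try {
      // toList forces the lazy pipeline while the file is still open.
      source.getLines().toStream.filter(_.contains("Success")).take(n).toList
    } finally {
      source.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Temporary file standing in for "log_file".
    val tmp = Files.createTempFile("log_file", ".txt")
    Files.write(tmp, "Success 1\nFailure\nSuccess 2\nSuccess 3\n".getBytes("UTF-8"))
    println(firstSuccesses(tmp.toString, 2)) // List(Success 1, Success 2)
    Files.delete(tmp)
  }
}
```

Returning the unevaluated Stream instead of a List would let callers force it after close() has run, which is exactly the pitfall the comment warns about.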